Project

Status:

Active/Ongoing

Linked to group(s)/challenge(s):

Covid19 Diagnostic and Detection

CoughCheck App

Development of an AI audio app to distinguish the coughs of coronavirus-infected patients from normal coughs.

Welcome to our project

Our Goal

Our goal is to release an AI Mobile App to help in the rapid detection of COVID-19, and to help relieve the suffering of millions of people all over the World who are currently fighting the pandemic or living with its impact.

We expect to help reduce the time required to diagnose COVID-19 by delivering a smartphone App which can discriminate between the cough of a potentially infected patient and a normal cough. Our rationale behind the project is:

The emergence of automated cough audio analysis and research (a curated list of relevant articles is available below).
The advances in Deep Learning in recent years, which enable unprecedented pattern inference by successive transformation of raw data.
The emerging trend in Smartphone AI applications, which could be used on-site by anyone and could improve the lives of millions worldwide.

Joining us

Please join us in the OpenCovid Slack channel proj-dev-ai-cough-detection, so we can learn more about you and how you could help with the project. Any kind of suggestion could be useful.

Elevator pitch / Abstract

The goal of the CoughCheck Application is to take a humble approach to cough data collection, using all the help we can get from researchers and specialists all over the World. Our first goal is to deploy a version of the App to collect cough data. The process of data collection will affect the pipeline for later interpretation of the predictions. Our second goal is to deploy the Application with a predictive model backed by scientifically validated accuracy, to the best of the community's knowledge, to help in the early detection of COVID-19 cases.

The App does NOT intend to replace any medical or laboratory diagnosis. Additionally, the App will not include any kind of monetization, and will protect users' privacy by using encryption standards and additional techniques to avoid audio fingerprinting.

How to contribute

Software Developers can check the contribution process and guidelines.
Researchers and advisors can join our general Slack Channel and Trello Board to check current challenges.
Machine Learning experts and practitioners can use the Slack Channel for AI Audio Analysis named #prgm-aiml-cough-app and the JOGL partnership with PaperSpace to run experiments.
Privacy researchers and consultants can join the #privacy_cough_check channel in Slack.

Problem Statement

Currently there is not enough cough audio data to train a Machine Learning model to help discriminate between different types of coughs. Recent studies in respiratory research addressing cough differentiation have already noted the lack of cough data and, through several experiments, suggested using Deep Learning methods to distinguish between different types of coughs.

Additionally, without a systematic data collection method, the data could be gathered by different Organizations and Companies without clear or fair Terms of Use policies, and could then be disseminated in different formats and through third parties. This would promote the emergence of AI black-box models, or unexplainable Machine Learning that did not consider bias and fairness.

Objectives & Methodology

UI/UX Design:
Patterns: Mobile Patterns
Prototyping: Supernova.io or sketch-to-react-native
Project Management: ProjectLibre
Software development: React Native (TypeScript) / Expo
Repository: GitHub
IDE: Visual Studio Code
Debugging: Reactotron, React Native Debugger
Testing & Automation: Storybook, Enzyme.
Local Database: Firebase, SQLite, Realm, PouchDB, Couchbase.
Performance Monitoring: Sentry, New Relic, Fabric Answers, GetPleak, Bugsnag.
Continuous Integration: Travis CI, CircleCI.
Authentication/Authorization: OAuth2 with expo-app-auth.
Payments: The App will NOT include any kind of payment processor.
i18n/l10n: Pontoon or Zanata (TBD)
QA: HoundCI
Software Analysis: Moose
Audio processing and Machine Learning:
Convert the audio to a Mel-frequency spectrogram, Mel-frequency cepstral coefficients (MFCCs) or Gammatone-frequency cepstral coefficients (GFCCs) and apply a computer vision model to the result (ResNet, VGG, etc.).
pyAudioProcessing
pyAudioAnalysis
Privacy preservation:
Computation on-device so no audio leaves the device.
Encrypted remote user vaults using Open Humans Foundation API and datastore.
Donations:
OpenCollective (platform where communities can collect and disburse money transparently, to sustain and grow their projects)
Liberapay
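
The audio-processing item above (turning audio into a Mel-frequency representation before feeding it to a vision model) can be illustrated with a minimal sketch. It is not the project's pipeline: the function names, the HTK-style mel formula and the triangular filters are assumptions chosen for illustration, and the input is taken to be the power spectrum of one already-windowed audio frame with at least two bins.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min, f_max):
    """Return n_bands + 2 frequencies in Hz, equally spaced on the mel scale."""
    low, high = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (high - low) / (n_bands + 1)
    return [mel_to_hz(low + i * step) for i in range(n_bands + 2)]


def triangular_weight(freq, left, center, right):
    """Weight of one triangular filter at freq: 0 at the edges, 1 at the center."""
    if freq <= left or freq >= right:
        return 0.0
    if freq <= center:
        return (freq - left) / (center - left)
    return (right - freq) / (right - center)


def mel_energies(power_spectrum, sample_rate, n_bands):
    """Collapse one frame's power spectrum (bins from 0 Hz to Nyquist) into mel bands."""
    nyquist = sample_rate / 2.0
    bin_hz = nyquist / (len(power_spectrum) - 1)
    edges = mel_band_edges(n_bands, 0.0, nyquist)
    energies = []
    for band in range(n_bands):
        left, center, right = edges[band], edges[band + 1], edges[band + 2]
        energies.append(sum(
            power * triangular_weight(k * bin_hz, left, center, right)
            for k, power in enumerate(power_spectrum)
        ))
    return energies
```

Stacking the log of these band energies frame by frame gives the image-like Mel spectrogram that a ResNet- or VGG-style model could consume; MFCCs would add a discrete cosine transform on top, which is omitted here.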

State of the art

A recently published thesis (September 2, 2019), "Análisis de señales de tos para detección temprana de enfermedades respiratorias" (Analysis of cough signals for early detection of respiratory diseases, https://uvadoc.uva.es/handle/10324/38797), shows how different pairs of cough types were tested: acute cough vs. COPD, acute cough vs. lung cancer, acute cough vs. chronic cough (not COPD), and COPD vs. lung cancer. The author tried several sets of cross-validation folds (section 7.2) and suggested that more data is needed: “In this case, we have obtained 14 out of 18 correct diagnoses as can be seen in Table 7.8, which offers a success of 77.78%. But we do not have enough patients to obtain the statistics of how the neural network is behaving, since in all the folds there are not test patients of all kinds.”

“This reaffirms us in the hypothesis that with more data the neural network would offer better results.” (page 93, translation is mine).
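
The problem quoted above, folds that lack test patients of some class, is what stratified cross-validation tries to avoid when there are enough patients per class. The sketch below is only an illustration of that idea, not the thesis author's method: the function names are hypothetical, and the per-class patient counts in the usage note are invented for the example (only the 14 out of 18 figure comes from the thesis).

```python
from collections import defaultdict


def stratified_folds(labels, n_folds):
    """Assign sample indices to folds, dealing each class out round-robin."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        for position, index in enumerate(by_class[label]):
            folds[position % n_folds].append(index)
    return folds


def classes_missing_per_fold(labels, folds):
    """For each fold, return the set of classes that have no sample in it."""
    all_classes = set(labels)
    return [all_classes - {labels[i] for i in fold} for fold in folds]


# The success rate reported in the quote: 14 correct diagnoses out of 18.
reported_success = round(100 * 14 / 18, 2)  # 77.78
```

With hypothetical labels such as `["acute"] * 6 + ["copd"] * 5 + ["cancer"] * 4 + ["chronic"] * 3` and 3 folds, `classes_missing_per_fold` returns an empty set for every fold. If a class has fewer patients than there are folds, some folds will still miss it, which is the situation the author describes and the reason more data is needed.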

Other research findings in recent years indicate that cough discrimination technology could already be used as a high-level diagnostic aid, with prototypes that use Deep Neural Networks to cope with background noise and heterogeneous sensing quality. An algorithm for detecting coughs from the audio stream of a mobile phone has already been reported, with an average true positive rate of 92% and a false positive rate of 0.5%, along with a feed-forward neural network for pneumonia diagnosis with sensitivity 90%, specificity 98.7% and accuracy 97%. Specifically for cough event detection, a method and a dataset consisting of more than 1000 cough events and a significant number of noises were published in 2016, with a cough detection F-score > 91% with individually trained models and > 81% for subject-independent training.
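
The figures quoted above (true positive rate, false positive rate, sensitivity, specificity, accuracy, F-score) all derive from the same four confusion-matrix counts. As a minimal sketch, assuming a binary cough detector and nonzero denominators, they can be computed as follows; the function name and the counts in the usage note are hypothetical and are not taken from any of the cited studies.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fn: positive samples detected / missed.
    tn/fp: negative samples rejected / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)          # also called true positive rate or recall
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": false_positive_rate,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
    }
```

For example, `detection_metrics(tp=90, fp=50, tn=950, fn=10)` gives a sensitivity of 0.90 and a specificity of 0.95, yet an F-score of only 0.75, because the negatives far outnumber the positives. This is one reason why the studies report several metrics rather than accuracy alone.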

These findings could bring hope that recent advances in Deep Learning could enable unprecedented pattern inference by successive transformation of raw data.

Current Literature

Have a look at the following articles for research background and the potential of the technology:

A prospective multicentre study testing the diagnostic accuracy of an automated cough sound centred analytic system for the identification of common respiratory disorders in children. Conclusion: The results indicate that this technology has a role as a high-level diagnostic aid in the assessment of common childhood respiratory disorders.
Accurate and Privacy Preserving Cough Sensing using a Low-Cost Microphone. Conclusion: Algorithm for detecting coughs from the audio stream of a mobile phone. Average true positive rate of 92% and false positive rate of 0.5%.
An Integrated Computerized Cough Analysis by Using Wavelet for Pneumonia Diagnosis. Conclusion: Feed-forward neural network classifier that increases classification performance, with sensitivity 90%, specificity 98.7% and accuracy 97%.
DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning. Conclusion: Mobile audio sensing framework built from coupled Deep Neural Networks (DNNs).
MobiCough: Real-Time Cough Detection and Monitoring Using Low-Cost Mobile Devices. Conclusion: Dataset consisting of more than 1000 cough events and a significant number of noises. Cough detection F-score > 91% with individually trained models and > 81% for subject-independent training.
Check our Trello board for more related research and findings.

Progress report

Set up a GitHub repository where development is happening right now.
Set up a Trello board for coordination of different tasks.
Partnered with Open Humans Foundation to manage the data storage.
Project accepted on the OpenCollective crowdfunding platform to receive donations in a transparent way.
Contacted and brought together Machine Learning experts, Privacy researchers, React/TypeScript developers and UI/UX specialists in the Slack channel.
Collected literature relevant to both Deep Learning and cough differentiation.
Promoted an additional Slack channel for Audio Machine Learning discussion.
Registered domain names and social accounts.
Built assets such as logos and sketches, resulting from several iterations with professional designers.
Built a reproducible React Native build with most of the OAuth2 code ready, currently available in development mode in the Expo environment.
Started building the questionnaire in Google Docs to get more iterations and better insights into causal/non-causal relationships from medical doctors and specialists.

Stakeholders

We are all impacted by COVID-19, and everyone actively helping in this project is a stakeholder. We are working together to develop a technology that could benefit each one of us individually, which is a great motivator. Knowing that it could also have a large positive impact globally is something we keep in mind while putting our time and effort into making this a reality.

Impact strategy

To deploy an AI-enabled application with a good predictive model to discriminate between different types of coughs, we need to start collecting cough sounds (short-term goal) from all over the World as soon as possible.

Collecting coughs will allow us to train a Machine Learning model in a systematic (scientific) way, which helps us avoid data fallacies such as sampling and survivorship bias, false causalities and data dredging.

Our data collection process is open to be reviewed by any researcher or specialist to help in proposing variables or alternative methods to better estimate the predictive accuracy, and to obtain a representative data set to lower the error rate using the same collection standard.

Avoiding an overfitted model when distinguishing between non-COVID and COVID-19 coughs is extremely challenging. We know this, and that is why we are calling for people all over the World to join efforts. We also pay special attention to privacy concerns in data collection, and we are doing our best to promote de-identification and avoid cough fingerprinting.
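
One common safeguard against the overfitting concern above is to evaluate on people the model has never heard, so that it cannot score well simply by recognizing a voice. The sketch below illustrates such a subject-independent split. It is an assumption about how the evaluation could be organized, not a description of the project's code, and the record layout of (subject_id, recording_id) pairs is hypothetical.

```python
import random


def split_by_subject(recordings, test_fraction, seed=0):
    """Split (subject_id, recording_id) pairs so that no subject is in both sets."""
    subjects = sorted({subject for subject, _ in recordings})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [rec for rec in recordings if rec[0] not in test_subjects]
    test = [rec for rec in recordings if rec[0] in test_subjects]
    return train, test
```

Splitting by recording instead of by subject would let coughs from the same person land on both sides, which tends to inflate the measured accuracy. Because the split only needs an opaque subject identifier, it is also compatible with de-identified data.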

Sustainability and scalability

Being an Open Source Community means collaborators contribute at their own pace, producing best-effort results. The community has less than one month of experience working together, but has already self-organized into groups to address the very complex areas involved in building an AI mobile-based project, with a clear focus on reproducibility, interpretability and transparency through open Standards (such as OAuth2) and Foundations (such as JOGL and Open Humans).

A rough deployment estimate for the first goal of the CoughCheck App (Audio Collection) would be premature at this point. We lack the experienced human resources to make proper ballpark estimates, given the team's skill sets and the reality that any participant could leave or join the project at any time, for any reason.

Funding

Currently this project is not receiving any funding or donations.

We need funds to cover:

Domain names: $126.82
Developer accounts needed for submitting to the Apple App Store ($99) and Google Play ($25)
Professional Zoom account: $14.99

Total funding currently needed: USD 265.81 (for one developer account each on the Apple App Store and Google Play)

Additional information

Short Name: #coughcheck
Created on: March 18, 2020
Last update: July 12, 2021
Grant information: Received €2,420.00 from the OpenCOVID19 Grant Round 1 on 04/07/2020

Keywords

Machine learning

DevOps

Artificial intelligence

Tensorflow

Audio technology


Associated SDGs

Good Health and Well-being

Partnership for the Goals