The COVID-19 pandemic has affected the world unevenly; while industrial economies have been able to produce the tests necessary to track the spread of the virus and mostly avoided complete lockdowns, developing countries have faced issues with testing capacity. In this paper, we explore the usage of deep learning models as a ubiquitous, low-cost, pre-testing method for detecting COVID-19 from audio recordings of breathing or coughing taken with mobile devices or via the web. We adapt an ensemble of Convolutional Neural Networks that utilise raw breathing and coughing audio and spectrograms to classify if a speaker is infected with COVID-19 or not. The different models are obtained via automatic hyperparameter tuning using Bayesian Optimisation combined with HyperBand. The proposed method outperforms a traditional baseline approach by a large margin. Ultimately, it achieves an Unweighted Average Recall (UAR) of 74.9 %, or an Area Under ROC Curve (AUC) of 80.7 % by ensembling neural networks, considering the best test set result across breathing and coughing in a strictly subject independent manner. In isolation, breathing sounds thereby appear slightly better suited than coughing ones (76.1 % vs 73.7 % UAR).
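The two ideas the abstract relies on, ensembling several networks and scoring with Unweighted Average Recall, can be illustrated with a minimal sketch. UAR is the mean of the per-class recalls, so the minority (positive) class counts as much as the majority class. The abstract does not state how the networks are fused, so the probability averaging below is an assumption. The 0.5 decision threshold, the toy labels, the probabilities, and the function names are likewise hypothetical and not taken from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def ensemble_probability(model_probs):
    """Average positive-class probabilities across models (assumed fusion rule)."""
    return [sum(ps) / len(ps) for ps in zip(*model_probs)]


# Toy, imbalanced data: 2 positive and 6 negative recordings (illustrative only).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]

# One row of positive-class probabilities per hypothetical model.
model_probs = [
    [0.9, 0.4, 0.2, 0.6, 0.1, 0.3, 0.2, 0.1],
    [0.7, 0.5, 0.3, 0.3, 0.2, 0.1, 0.4, 0.2],
    [0.8, 0.3, 0.1, 0.7, 0.3, 0.2, 0.1, 0.3],
]

avg_probs = ensemble_probability(model_probs)
y_pred = [1 if p >= 0.5 else 0 for p in avg_probs]  # assumed threshold of 0.5

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On this toy data, accuracy (6 of 8 correct) is higher than UAR, because one of the two positives is missed. This is why UAR is a common choice for imbalanced screening tasks.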
Several machine learning-based COVID-19 classifiers exploiting vocal biomarkers of COVID-19 have been proposed recently as digital mass testing methods. Although these classifiers have shown strong performance on the datasets on which they were trained, their methodological adaptation to new datasets with different modalities has not been explored. We report on cross-running a modified version of the recent COVID-19 Identification ResNet (CIdeR) on the two Interspeech 2021 COVID-19 diagnosis from cough and speech audio challenges: ComParE and DiCOVA. CIdeR is an end-to-end deep learning neural network originally designed to classify whether an individual is COVID-19-positive or COVID-19-negative based on coughing and breathing audio recordings from a published crowdsourced dataset. In the current study, we demonstrate the potential of CIdeR at binary COVID-19 diagnosis from both the COVID-19 Cough and Speech Sub-Challenges of INTERSPEECH 2021, ComParE and DiCOVA. CIdeR achieves significant improvements over several baselines. We also present the results of cross-dataset experiments with CIdeR, which show the limitations of using the current COVID-19 datasets jointly to build a collective COVID-19 classifier.
We report on cross-running the recent COVID-19 Identification ResNet (CIdeR) on the two Interspeech 2021 COVID-19 diagnosis from cough and speech audio challenges: ComParE and DiCOVA. CIdeR is an end-to-end deep learning neural network originally designed to classify whether an individual is COVID-positive or COVID-negative based on coughing and breathing audio recordings from a published crowdsourced dataset. In the current study, we demonstrate the potential of CIdeR at binary COVID-19 diagnosis from both the COVID-19 Cough and Speech Sub-Challenges of INTERSPEECH 2021, ComParE and DiCOVA. CIdeR achieves significant improvements over several baselines.
Our main contributions are as follows:
• We demonstrate the first attempt to diagnose COVID-19 using end-to-end deep learning from a crowd-sourced dataset of audio samples, achieving ROC-AUC of 0.846
• Our model, the COVID-19 Identification ResNet (CIdeR), has potential for rapid scalability, minimal cost, and improving performance as more data becomes available. This could enable regular COVID-19 testing at a population scale
• We introduce a novel modelling strategy using a custom deep neural network to diagnose COVID-19 from a joint breath and cough representation
• We release our four stratified folds for cross parameter optimisation and validation on a standard public corpus and details on the models for reproducibility and future reference