Abstract:
Background
Detection of the Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) relies on real-time reverse transcriptase polymerase chain reaction (RT-PCR) testing of nasopharyngeal swabs. The false-negative rate of RT-PCR can be high when the viral burden and infection are localized distally in the lower airways and lung parenchyma. An alternative method for sampling the lower airways that is safe, simple, and accessible is needed to aid in the early and rapid diagnosis of COVID-19 pneumonia.
Methods
In a prospective, unblinded, observational study, patients admitted with a positive RT-PCR result and symptoms of SARS-CoV-2 infection were enrolled from three hospitals in Ontario, Canada. Healthy individuals or hospitalized patients with a negative RT-PCR result and without respiratory symptoms were enrolled into the control group. Breath samples were collected and analyzed by laser absorption spectroscopy (LAS) for volatile organic compounds (VOCs) and classified by machine learning (ML) approaches to identify unique LAS-spectra patterns (breathprints) for SARS-CoV-2.
Findings
Of the 135 patients enrolled, 115 provided analyzable breath samples. ML classifier models trained on LAS-breathprints achieved an accuracy of 72·2-81·7% in differentiating between the SARS-CoV-2 positive and negative groups. The performance was consistent across subgroups of different age, sex, BMI, SARS-CoV-2 variants, time of disease onset and oxygen requirement. The overall performance was higher than that of the VOC-trained classifier model, which had an accuracy of 63-74·7%.
Conclusion
This study demonstrates that an ML-based breathprint model using LAS analysis of exhaled breath may be a valuable non-invasive method for studying the lower airways and detecting SARS-CoV-2 and other respiratory pathogens. The technology and the ML approach can be easily deployed in any setting with minimal training. This will greatly improve access and scalability to meet surge capacity, allow early and rapid detection to inform therapy, and offer great versatility in developing new classifier models quickly for future outbreaks.