Understanding the temporal dynamics of COVID-19 patient phenotypes is necessary to derive fine-grained resolution of pathophysiology. Here we use state-of-the-art deep neural networks over an institution-wide machine intelligence platform for the augmented curation of 15.8 million clinical notes from 30,494 patients subjected to COVID-19 PCR diagnostic testing. By contrasting the Electronic Health Record (EHR)-derived clinical phenotypes of COVID-19-positive (COVIDpos, n=635) versus COVID-19-negative (COVIDneg, n=29,859) patients over each day of the week preceding the PCR testing date, we identify anosmia/dysgeusia (37.4-fold), myalgia/arthralgia (2.6-fold), diarrhea (2.2-fold), fever/chills (2.1-fold), respiratory difficulty (1.9-fold), and cough (1.8-fold) as significantly amplified in COVIDpos over COVIDneg patients. The specific combination of cough and diarrhea has a 3.2-fold amplification in COVIDpos patients during the week prior to PCR testing and, along with anosmia/dysgeusia, constitutes the earliest EHR-derived signature of COVID-19 (4-7 days prior to the typical PCR testing date). This study introduces an Augmented Intelligence platform for the real-time synthesis of institutional knowledge captured in EHRs. The platform holds tremendous potential for scaling up curation throughput, with minimal need for retraining the underlying neural networks, thus promising EHR-powered early diagnosis for a broad spectrum of diseases.
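The fold amplifications quoted above can be read as a ratio of phenotype frequencies between the two cohorts. The abstract does not spell out the formula, so the following is a minimal sketch under that assumption; the function name `fold_amplification` and the counts in the usage line are hypothetical and are not taken from the study.

```python
def fold_amplification(pos_with: int, pos_total: int,
                       neg_with: int, neg_total: int) -> float:
    """Ratio of phenotype frequency in COVIDpos to that in COVIDneg.

    Assumed definition (not stated in the abstract):
        (pos_with / pos_total) / (neg_with / neg_total)
    """
    if pos_total <= 0 or neg_total <= 0:
        raise ValueError("cohort sizes must be positive")
    if neg_with <= 0:
        raise ValueError("phenotype never observed in the negative cohort; ratio undefined")
    # Cross-multiply so the only floating-point operation is the final division.
    return (pos_with * neg_total) / (neg_with * pos_total)


# Toy counts for illustration only: 10 of 100 positives vs 5 of 100 negatives.
print(fold_amplification(10, 100, 5, 100))  # 2.0
```

A value above 1 indicates that the phenotype is noted more often among COVIDpos patients; whether the difference is significant requires a separate statistical test, which this sketch does not perform.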
Temporal inference from laboratory testing results and triangulation with clinical outcomes extracted from unstructured EHR provider notes is integral to advancing precision medicine. Here, we studied 246 SARS-CoV-2 PCR-positive (COVIDpos) patients and 2,460 propensity-matched SARS-CoV-2 PCR-negative (COVIDneg) patients subjected to around 700,000 lab tests cumulatively across 194 assays. Compared to COVIDneg patients at the time of diagnostic testing, COVIDpos patients tended to have higher plasma fibrinogen levels and lower platelet counts. However, as the infection evolves, COVIDpos patients distinctively show declining fibrinogen, increasing platelet counts, and lower white blood cell counts. Augmented curation of EHRs suggests that only a minority of COVIDpos patients develop thromboembolism and, rarely, disseminated intravascular coagulopathy (DIC), with patients generally not displaying the platelet reductions typical of consumptive coagulopathies. These temporal trends provide fine-grained resolution into COVID-19-associated coagulopathy (CAC) and set the stage for personalizing thromboprophylaxis.
Case reports of patients co-infected with SARS-CoV-2 and influenza virus (“flurona”) have raised questions around the prevalence and severity of co-infection. Using data from HHS Protect Public Data Hub, NCBI Virus, and CDC FluView, we analyzed trends in SARS-CoV-2 and influenza hospitalized co-infection cases and strain prevalences. We also characterized co-infection cases across the Mayo Clinic Enterprise from January 2020 to April 2022. We compared expected and observed co-infection case counts across different waves of the pandemic and assessed symptoms and outcomes of co-infection and COVID-19 mono-infection cases after propensity score matching on clinically relevant baseline characteristics. In both the Mayo Clinic and nationwide datasets, the observed co-infection rate for SARS-CoV-2 and influenza was higher during the Omicron era (December 14, 2021 to April 2, 2022) than in previous waves, but no higher than expected assuming infection rates are independent. At Mayo Clinic, only 120 co-infection cases were observed among 197,364 SARS-CoV-2 cases. Co-infected patients were relatively young (mean age: 26.7 years) and had fewer serious comorbidities compared to mono-infected patients. While there were no significant differences in 30-day hospitalization, ICU admission, or mortality rates between co-infected and matched COVID-19 mono-infection cases, co-infection cases reported higher rates of symptoms including congestion, cough, fever/chills, headache, myalgia/arthralgia, pharyngitis, and rhinitis. While most co-infection cases observed at Mayo Clinic occurred among relatively healthy individuals, further observation is needed to assess outcomes among subpopulations with risk factors for severe COVID-19 such as older age, obesity, and immunocompromised status.
Significance Statement: Reports of COVID-19 and influenza co-infections (“flurona”) have raised concern in recent months as both COVID-19 and influenza cases have increased to significant levels in the US. Here, we analyze trends in co-infection cases over the course of the pandemic to show that these co-infection cases are expected given the independent background prevalences of COVID-19 and influenza. In addition, from an initial analysis of the co-infection cases observed at the Mayo Clinic, we find that these cases are extremely rare, have mostly been observed in relatively young, healthy patients, and do not carry an increased risk of hospitalization, ICU admission, or death, although they do present with higher rates of typical viral symptoms.
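The comparison of observed and expected co-infection counts rests on an independence assumption: if the two infections occur independently, the probability of having both is the product of the individual probabilities. The abstract does not give the exact calculation or denominators used, so the following is only a minimal sketch of that idea; the function names and all numbers in the usage lines are hypothetical.

```python
def expected_coinfections(population: int, covid_cases: int, flu_cases: int) -> float:
    """Expected co-infection count if the two infections are independent.

    P(both) = P(covid) * P(flu), so
    expected = population * (covid_cases / population) * (flu_cases / population)
             = covid_cases * flu_cases / population
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return covid_cases * flu_cases / population


def observed_to_expected(observed: int, expected: float) -> float:
    """Ratio above 1 means more co-infections than independence predicts."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


# Toy numbers for illustration only, not taken from the study.
exp = expected_coinfections(population=1000, covid_cases=100, flu_cases=50)
print(exp)                           # 5.0
print(observed_to_expected(4, exp))  # 0.8
```

In this toy case the observed count falls below the expected count, which would be consistent with co-infections being no more frequent than independence predicts.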
The natural language portions of an electronic health record (EHR) communicate critical information about disease and treatment progression. However, the presence of personally identifying information in this data constrains its broad reuse. In the United States, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) provides a de-identification standard for the removal of protected health information (PHI). Despite continuous improvements in methods for the automated detection of PHI, the residual identifiers in clinical notes continue to pose significant challenges, often requiring manual validation and correction that does not scale to the amount of data needed for modern machine learning tools. In this paper, we describe an automated de-identification system that employs an ensemble architecture, incorporating attention-based deep learning models and rule-based methods, supported by heuristics for detecting PHI in EHR data. Upon detection of PHI, the system transforms the detected identifiers into plausible, though fictional, surrogates to further obfuscate any leaked identifier. We evaluated the system with a publicly available dataset of 515 notes from the I2B2 2014 de-identification challenge and a dataset of 10,000 notes from the Mayo Clinic. We compared our approach with other existing tools considered best-in-class. The results indicated a recall of 0.992 and 0.994 and a precision of 0.979 and 0.967 on the I2B2 and the Mayo Clinic data, respectively.
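For readers less familiar with the reported metrics, precision and recall for PHI detection follow the standard definitions over true positives, false positives, and false negatives. The sketch below uses those standard definitions; the abstract does not say whether the study scored at the token or entity level, and the function name and counts are hypothetical.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard detection metrics.

    precision = tp / (tp + fp)  : share of flagged spans that are truly PHI
    recall    = tp / (tp + fn)  : share of true PHI spans that were flagged
    f1        = harmonic mean of precision and recall
    """
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("metrics undefined without predictions or without true PHI")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy counts for illustration only, not taken from the study.
p, r, f = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In de-identification, recall is usually the more safety-critical number, since every false negative is an identifier left in the text.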