IMPORTANCE: Identifying potential Covid-19 patients in the general population is currently a major challenge. Because little clinical data on infected Covid-19 patients is available, it is difficult to identify the similar and complex patterns among symptomatic patients. Laboratory testing for Covid-19 with RT-PCR (reverse transcription polymerase chain reaction) is neither feasible nor economical for whole populations.
OBJECTIVE: To develop and validate a Covid-19 risk stratifier model that classifies people into risk cohorts based on their symptoms.
DESIGN: Covid-19 cases from Wuhan and New York were analysed to identify the course of these cases before patients became symptomatic and were hospitalised for the infection. A dataset based on these statistics was generated and fed into an unsupervised learning algorithm to reveal patterns and identify similar groups of people in the population. Each of these cohorts was then assigned one of three risk levels, which were validated against real-world cases and studies.
SETTING: The study is based on the general population.
PARTICIPANTS: The adult population was considered for the analysis, development, and validation of the model.
RESULTS: Of 1 million observations generated, 20% exhibited Covid-19 symptoms and patterns, and 80% belonged to the asymptomatic and non-infected group. Clustering produced three clinically distinct clusters: Cluster A contained the 20% of observations that were symptomatic, grouped into one cohort; Cluster B contained people with no symptoms but a high number of comorbidities; and Cluster C contained people with few leading indicators of infection and few comorbidities. The model was then validated against 300 participants whose data were collected through our Covid-research tool as part of a research study, and about 92% of them were classified correctly.
CONCLUSION: A model that classifies people into Covid-19 risk categories based on their symptoms was developed and validated. It can be used to monitor and track people who rapidly become symptomatic and eventually test positive for the infection, so that medical interventions can begin early.
KEYWORDS: Covid-19, Synthetic Data, Patient Clustering, Unsupervised Learning, Risk Classification
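ILLUSTRATIVE SKETCH: The abstract does not name the clustering algorithm, the input features, or the statistics used to generate the data, so the following is only a minimal sketch of the general approach it describes: generate a synthetic population, cluster it without labels, and check how well the clusters line up with the groups that were generated. Plain k-means stands in for the unspecified unsupervised algorithm. The feature names, the per-group probabilities, and the 30%/50% split of the non-symptomatic population are hypothetical; only the 20% symptomatic share and the three-group structure come from the abstract.

```python
import random

# Hypothetical binary features: three symptoms, then three comorbidities.
FEATURES = ["fever", "cough", "breathlessness",
            "diabetes", "hypertension", "heart_disease"]

# Assumed (symptom probability, comorbidity probability, population share).
# Only the 20% symptomatic share is taken from the abstract.
GROUPS = {
    "symptomatic": (0.85, 0.30, 0.20),
    "comorbid_only": (0.02, 0.85, 0.30),
    "few_indicators": (0.10, 0.05, 0.50),
}


def generate(n, seed=0):
    """Draw n synthetic people; returns (feature vectors, generating group)."""
    rng = random.Random(seed)
    rows, truth = [], []
    for name, (p_sym, p_com, share) in GROUPS.items():
        probs = [p_sym] * 3 + [p_com] * 3
        for _ in range(round(n * share)):
            rows.append([1.0 if rng.random() < p else 0.0 for p in probs])
            truth.append(name)
    return rows, truth


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k=3, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for every row."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c]))
                  for r in rows]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


def agreement(labels, truth, k=3):
    """Cluster purity: share of rows matching their cluster's majority group."""
    correct = 0
    for c in range(k):
        groups = [t for t, lab in zip(truth, labels) if lab == c]
        if groups:
            correct += max(groups.count(g) for g in set(groups))
    return correct / len(truth)


rows, truth = generate(1000)
labels = kmeans(rows)
print(f"symptomatic share: {truth.count('symptomatic') / len(truth):.2f}")
print(f"cluster/group agreement: {agreement(labels, truth):.2f}")
```

The agreement score here compares clusters with the synthetic groups that produced the data, which is a weaker check than the validation against 300 real participants reported in the abstract. The sketch uses 1,000 rows rather than 1 million so that it runs quickly in pure Python.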