High-throughput sequencing of ribonucleic acid (RNA) molecules is increasingly used to study gene expression in organs, tissues, and therapies at the single-cell level. To facilitate the discovery of the heterogeneity and cell-specific factors of COVID-19, we use an interpretable computational approach that derives cell mixtures from peripheral blood mononuclear cells of healthy donors and of patients with influenza or with asymptomatic, mild, or severe COVID-19. Cell mixtures are generated using hierarchical Bayesian modeling and are subsequently used as features in a gradient boosting tree classifier. The balanced accuracy under five-fold cross-validation was 68%, significantly higher than expected by random chance. Moreover, samples from 11 of 19 donors were classified accurately. The main advantage of the mixture-based approach over traditional feature-based classification is its ability to capture associations between genes as well as between cells.
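For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recalls, so a classifier that ignores minority classes is not rewarded for it. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name `balanced_accuracy` and the toy labels are assumptions made up for illustration and are not data from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean per-class recall over the classes present in y_true."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical labels, invented for illustration (NOT data from the study).
y_true = ["healthy"] * 4 + ["influenza"] * 2 + ["severe"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["healthy"] * 8

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # fraction of all samples predicted correctly
print(balanced_accuracy(y_true, y_pred))  # mean of per-class recalls
```

In this toy case the majority-class predictor gets half of the samples right, yet its balanced accuracy is only one third, which is exactly 1/K for K = 3 classes. This is why balanced accuracy is a natural choice when the donor groups differ in size.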