A complicated clinical course for critically ill patients admitted to the intensive care unit (ICU) usually includes multiorgan dysfunction and subsequent death. Owing to the heterogeneity, complexity, and unpredictability of the disease progression, ICU patient care is challenging. Identifying the predictors of complicated courses and subsequent mortality at the early stages of the disease and recognizing the trajectory of the disease from the vast array of longitudinal quantitative clinical data is difficult. Therefore, we attempted to perform a meta-analysis of previously published gene expression datasets to identify novel early biomarkers and train the artificial intelligence systems to recognize the disease trajectories and subsequent clinical outcomes. Using the gene expression profile of peripheral blood cells obtained within 24 h of pediatric ICU (PICU) admission and numerous clinical data from 228 septic patients from pediatric ICU, we identified 20 differentially expressed genes predictive of complicated course outcomes and developed a new machine learning model. After 5-fold cross-validation with 10 iterations, the overall mean area under the curve reached 0.82. Using a subset of the same set of genes, we further achieved an overall area under the curve of 0.72, 0.96, 0.83, and 0.82, respectively, on four independent external validation sets. This model was highly effective in identifying the clinical trajectories of the patients and mortality. Artificial intelligence systems identified eight out of twenty novel genetic markers (SDC4, CLEC5A, TCN1, MS4A3, HCAR3, OLAH, PLCB1, and NLRP1) that help predict sepsis severity or mortality. While these genes have been previously associated with sepsis mortality, in this work, we show that these genes are also implicated in complex disease courses, even among survivors. The discovery of eight novel genetic biomarkers related to the overactive innate immune system, including neutrophil function, and a new predictive machine learning method provides options to effectively recognize sepsis trajectories, modify real-time treatment options, improve prognosis, and patient survival.
Identifying cancer-causing mutations from sequenced cancer genomes hold much promise for targeted therapy and precision medicine. “Driver” mutations are primarily responsible for cancer progression, while “passengers” are functionally neutral. Although several computational approaches have been developed for distinguishing between driver and passenger mutations, very few have concentrated on using the raw nucleotide sequences surrounding a particular mutation as potential features for building predictive models. Using experimentally validated cancer mutation data in this study, we explored various string-based feature representation techniques to incorporate information on the neighborhood bases immediately 5′ and 3′ from each mutated position. Density estimation methods showed significant distributional differences between the neighborhood bases surrounding driver and passenger mutations. Binary classification models derived using repeated cross-validation experiments provided comparable performances across all window sizes. Integrating sequence features derived from raw nucleotide sequences with other genomic, structural, and evolutionary features resulted in the development of a pan-cancer mutation effect prediction tool, NBDriver, which was highly efficient in identifying pathogenic variants from five independent validation datasets. An ensemble predictor obtained by combining the predictions from NBDriver with three other commonly used driver prediction tools (FATHMM (cancer), CONDEL, and MutationTaster) significantly outperformed existing pan-cancer models in prioritizing a literature-curated list of driver and passenger mutations. Using the list of true positive mutation predictions derived from NBDriver, we identified a list of 138 known driver genes with functional evidence from various sources. Overall, our study underscores the efficacy of using raw nucleotide sequences as features to distinguish between driver and passenger mutations from sequenced cancer genomes.
A complicated clinical course for critically ill patients admitted to the ICU usually includes multiorgan dysfunction and subsequent death. Owning to the heterogeneity, complexity, and unpredictability of the disease progression, patient care is challenging. Identifying the predictors of complicated courses and subsequent mortality at the early stages of the disease and recognizing the trajectory of the disease from the vast array of longitudinal quantitative clinical data is difficult. Therefore, we attempted to identify novel early biomarkers and train the artificial intelligence systems to recognize the disease trajectories and subsequent clinical outcomes. Using the gene expression profile of peripheral blood cells obtained within 24 hours of PICU admission and numerous clinical data from 228 septic patients from pediatric ICU, we identified 20 differentially expressed genes that were predictive of complicated course outcomes and developed a new machine learning model. After 5-fold cross-validation with ten iterations, the overall mean area under the curve reached 0.82. This model was highly effective in identifying the clinical trajectories of the patients and mortality. Artificial intelligence systems identified eight out of twenty novel genetic markers SDC4 , CLEC5A , TCN1 , MS4A3 , HCAR3 , OLAH , PLCB1 and NLRP1 that help to predict sepsis severity or mortality. The discovery of eight novel genetic biomarkers related to the overactive innate immune system and neutrophils functions, and a new predictive machine learning method provides options to effectively recognize sepsis trajectories, modify real-time treatment options, improve prognosis, and patient survival.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.