Objective: Survival among patients with lymphoma varies greatly between individuals and is affected by many factors. This study aimed to develop and validate a prognostic model for predicting overall survival (OS) in patients with lymphoma.
Methods: We conducted a prospective longitudinal cohort study in China between January 2014 and December 2018 (n = 1,594). After obtaining the follow-up data, we randomly split the cohort into a training cohort (n = 1,116) and a validation cohort (n = 478). Least absolute shrinkage and selection operator (LASSO) regression was used to select candidate predictors for the model. Stepwise Cox regression was then used to identify independent prognostic factors, which were presented as a static nomogram and a web-based dynamic nomogram. We calculated the concordance index (C-index) to describe how well the predicted survival agreed with the observed outcomes. Calibration plots were used to evaluate the agreement between predicted and observed survival. The net reclassification index (NRI) and decision curve analysis (DCA) were also used to evaluate the predictive ability and net benefit of the model.
Results: Nine variables were identified in the training cohort as independent prognostic factors for patients with lymphoma and were included in the final model: age, Ann Arbor stage, pathologic type, B symptoms, chemotherapy, targeted therapy, lactate dehydrogenase (LDH), β2-microglobulin, and C-reactive protein (CRP). The C-indices for OS were 0.749 (95% CI, 0.729–0.769) in the training cohort and 0.731 (95% CI, 0.700–0.762) in the validation cohort. The calibration curves showed good agreement between nomogram-predicted and observed survival probabilities in both the training and validation cohorts. The areas under the receiver operating characteristic (ROC) curves (AUCs) for 1-year, 3-year, and 5-year OS were 0.813, 0.800, and 0.762, respectively, in the training cohort, and 0.802, 0.768, and 0.721, respectively, in the validation cohort. Compared with the Ann Arbor staging system, the model showed higher predictive capacity on NRI and greater net benefit on DCA.
Conclusion: The prediction model reliably estimated the outcome of patients with lymphoma. It had high discrimination and calibration, providing a simple and reliable tool for predicting patient survival, and it might help patients benefit from personalized intervention.
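The C-index reported above measures discrimination: among pairs of patients whose survival order is known, it is the fraction in which the patient with the higher predicted risk died first. The following is a minimal sketch of Harrell's C-index in plain Python, not the authors' implementation. The function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied event times are simply skipped for brevity.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's concordance index for right-censored survival data.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a patient whose death was observed
        for j in range(n):
            # Comparable only if patient j was still under follow-up
            # after patient i died (ties in time are skipped here).
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not taken from the study).
toy_times = [5, 8, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [2.1, 1.0, 0.9, 1.2, 0.3]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
print(toy_c)  # 7 of 8 comparable pairs are concordant -> 0.875
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported 0.749 and 0.731.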
Background: Patients with non-small cell lung cancer (NSCLC) often have a poor prognosis. Overall survival (OS) prediction through the early diagnosis of cancer has many benefits, such as allowing providers to design the best treatment plan for patients. In this study, we aimed to evaluate the prognostic factors in NSCLC patients, construct a nomogram, and develop machine learning models to predict OS. We also conducted a feature importance analysis to understand how relevant patient factors affect OS.
Results: Multiple machine learning models were applied to a retrospective cohort of patients diagnosed from 2010 to 2015 in the Surveillance, Epidemiology, and End Results (SEER) database. Independent prognostic factors for NSCLC were determined using Cox proportional hazards regression analysis. We modeled OS and vital status as the outcomes and constructed and validated a nomogram to predict the OS of patients with NSCLC. Furthermore, we applied logistic regression, random forest, XGBoost, decision tree, multilayer perceptron, and LightGBM to predict the patients' vital status. We tested the prediction ability of the models and evaluated their performance using accuracy, sensitivity, specificity, precision, and the area under the receiver operating characteristic curve. A total of 34,567 patients from the SEER database met our criteria and were included in this study. The nomogram visualized the OS prediction results of the Cox regression model. Among the classifiers, XGBoost had the best prediction performance, with an area under the curve of 0.733.
Conclusions: The results demonstrated that machine learning-based classifier models are capable of predicting the outcomes of patients with NSCLC. The Cox regression model-based nomogram interpreted the results well and supports potential medical applications.
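The classifier metrics named above (accuracy, sensitivity, specificity, precision, and the area under the ROC curve) can all be derived from true labels, predicted labels, and predicted scores. The sketch below is illustrative only and is not the authors' pipeline. The function names `binary_metrics` and `roc_auc`, the 0.5 threshold, and the toy data are assumptions, with label 1 standing for death and 0 for survival.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for a binary outcome (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the study).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(labels, preds)
auc = roc_auc(labels, scores)
print(metrics)  # every metric is 0.75 on this toy data
print(auc)      # 15 of 16 positive-negative pairs ranked correctly -> 0.9375
```

The first four metrics depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why it is commonly used to compare classifiers such as those listed in the abstract. The sketch assumes both classes are present and at least one positive prediction is made; otherwise the divisions would fail.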