Background: Pneumonia accounts for the majority of infection-related deaths after kidney transplantation. We aimed to build a machine learning-based predictive model for severe pneumonia in recipients of deceased-donor transplants during the perioperative period. Methods: We collected the features of kidney transplant recipients and used a tree-based ensemble classification algorithm (Random Forest or AdaBoost) and a nonensemble classifier (support vector machine, Naïve Bayes, or logistic regression) to build the predictive models. We used the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC) to evaluate the predictive performance via ten-fold cross validation. Results: Five hundred nineteen patients who underwent transplantation from January 2015 to December 2018 were included. Forty-three severe pneumonia episodes (8.3%) occurred during hospitalization after surgery. Significant differences in the recipients' age, diabetes status, HBsAg level, operation time, reoperation, usage of anti-fungal drugs, preoperative albumin and immunoglobulin levels, preoperative pulmonary lesions, and delayed graft function, as well as donor age, were observed between patients with and without severe pneumonia (P<0.05). We selected eight important features correlated with severe pneumonia using the recursive feature elimination method and then constructed a predictive model based on these features. The top three features were preoperative pulmonary lesions, reoperation and recipient age (with importance scores of 0.194, 0.124 and 0.078, respectively). Among the machine learning algorithms described above, the Random Forest algorithm displayed the best predictive performance, with a sensitivity of 0.67, specificity of 0.97, positive likelihood ratio of 22.33, negative likelihood ratio of 0.34, AUROC of 0.91, and AUPRC of 0.72.
Conclusions: The Random Forest model is potentially useful for predicting severe pneumonia in kidney transplant recipients. Recipients who have a potential preoperative pulmonary infection, who are older, and who require reoperation should be monitored carefully to prevent the occurrence of severe pneumonia.
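The likelihood ratios reported in the Results above follow directly from the stated sensitivity and specificity. A minimal sketch, assuming the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity; the function name `likelihood_ratios` is illustrative and does not come from the study.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) for a binary test, using the standard definitions."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1)")
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    return lr_pos, lr_neg


# Values reported for the Random Forest model in the abstract.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.67, specificity=0.97)
print(round(lr_pos, 2), round(lr_neg, 2))  # prints: 22.33 0.34
```

The rounded results match the reported positive and negative likelihood ratios of 22.33 and 0.34.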
Background: Lung cancer is the most threatening malignant tumor to human health and life. Using a variety of machine learning algorithms and statistical analyses, this paper explores new indicators for the early diagnosis of lung cancer, and demonstrates their diagnostic performance, in large samples of real-world clinical data. Methods: By applying machine learning methods, including minimum description length (MDL), naive Bayes (NB), K-means (KM), nonnegative matrix factorization (NMF), and decision tree (DT), to a large sample of 2,502 patients, we built a classification model and systematically explored differences in fibrinogen levels in different clinical stages of lung cancer between the sexes. We also validated the reliability of the model by testing it on a validation cohort of 447 patients. This report adheres to the "Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis" (TRIPOD) statement for the reporting of prediction models. Results: The analysis revealed significant differences in fibrinogen levels, pleural effusion, chlorine levels, A-G ratio, glutamic-oxaloacetic transaminase and alkaline phosphatase levels as well as in sex composition between the early-stage lung cancer group and the middle-late-stage lung cancer group. The classification model created by the combination of fibrinogen, alkaline phosphatase and sex demonstrated good performance with an AUC of 73.5%. In addition, in males, a fibrinogen level of 2.94 g/L could initially serve as the upper limit for determining the early-stage lung cancer group, whereas a level of 3.91 g/L could be preliminarily used as a reference threshold for the lower limit for middle- to late-stage lung cancer. This latter level could also serve as the upper limit of the critical value for early-stage lung cancer in females.
Conclusions: An integrated application based on supervised and unsupervised machine learning algorithms could effectively explore the potential links contained in the clinical data and reveal the differences in fibrinogen levels in different clinical stages of lung cancer between the sexes, which could provide a new reference basis for lung cancer staging.
Background: Postoperative blood coagulation assessment of children with congenital heart disease (CHD) has been developed using a conventional statistical approach. In this study, machine learning (ML) was used to predict the postoperative blood coagulation function of children with CHD, and an array of ML models was assessed. Methods: This was a retrospective data mining study. Based on samples from 1,690 children with CHD, we screened data on demographic characteristics, conventional coagulation tests (CCTs) and complete blood count (CBC) using a precise data selection process. With data mining and ML algorithms, including Decision Tree, Naive Bayes, Support Vector Machine (SVM), Adaptive Boosting (AdaBoost) and Random Forest, we explored the best prediction models of postoperative blood coagulation function for children with CHD. Model performance was measured by the area under the receiver operating characteristic (ROC) curve (AUC) and by calibration and Lift curves, and the reliability of the models was further verified with statistical tests. Results: For the primary prediction objective, the AUCs of the decision tree, Naive Bayes and SVM models were 0.81, 0.82 and 0.82, respectively. The overall prediction accuracy exceeded 75%. Subsequently, we built improved models. Among them, the true positive rate of the AdaBoost, Random Forest and SVM prediction models exceeded 80% on the ROC curve. These overall accuracy rates indicated a good classification model. Combining the calibration and Lift curves, the SVM model gave the better fit, with Lift = 2.2 for predicting postoperative abnormal coagulation and Lift = 1.8 for postoperative normal coagulation. The statistical results further supported the reliability of the ML models. Age, sex, mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), white blood cell count (WBC) and platelet count (PLT) were the key features for predicting the postoperative blood coagulation state of children with CHD.
Conclusions: ML technology and data mining algorithms may be used to predict the postoperative blood coagulation state of children with CHD based on large volumes of real-world clinical data, especially CBC indicators.
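The Lift values quoted in the Results above compare how often a model's flagged cases are truly positive with how often the outcome occurs in the whole cohort. A minimal sketch, assuming the common definition Lift = precision / prevalence; the abstract does not state how its Lift values were computed, and the confusion-matrix counts below are hypothetical, not study data.

```python
def lift(tp: int, fp: int, fn: int, tn: int) -> float:
    """Lift of the positive class: precision divided by prevalence."""
    total = tp + fp + fn + tn
    predicted_pos = tp + fp
    actual_pos = tp + fn
    if predicted_pos == 0 or actual_pos == 0:
        raise ValueError("need at least one predicted positive and one actual positive")
    precision = tp / predicted_pos    # share of flagged cases that are truly positive
    prevalence = actual_pos / total   # share of positives in the whole cohort
    return precision / prevalence


# Hypothetical counts: 200 cases, 50 truly positive, 40 flagged by the model.
example_lift = lift(tp=30, fp=10, fn=20, tn=140)
print(example_lift)  # prints: 3.0
```

A Lift above 1 means the flagged group is enriched for the outcome relative to picking cases at random; a Lift of 1 means no enrichment.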
Background and objectives: Immunoglobulin A nephropathy (IgAN) is the most common primary glomerular disease in the world, with varied clinical manifestations, pathological changes of varying severity, crescent formation in different proportions as a common complication, and great individual heterogeneity in clinical outcomes. We therefore aimed to develop a machine learning (ML)-based model for predicting the prognosis of IgAN with focal crescent formation and without obvious chronic renal lesions (glomerulosclerosis <25%). Materials: We retrospectively reviewed patients with biopsy-proven IgAN in our hospital and a cooperating hospital from 2005 to 2017. Random forest (RF) feature importance was used to identify the variables closely related to the prognosis of focal crescent IgAN. Multiple ML algorithms were used to establish the prediction models. The area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC) were applied to evaluate the predictive performance via three-fold cross validation (namely 2 training sets and 1 validation set). Results: RF was used to screen the important features, the top three of which were baseline estimated glomerular filtration rate (eGFR), serum creatinine and triglycerides. Ten important features were selected as predictors for modeling on the basis of data-driven and medical selection: age, baseline eGFR, serum creatinine, serum triglycerides, complement 3 (C3), proteinuria, mean arterial pressure (MAP), hematuria, the crescent proportion of glomeruli, and the global crescent proportion of glomeruli. Among the ML algorithms, the support vector machine (SVM) algorithm displayed better predictive performance, with precision of 0.77, recall of 0.77, F1-score of 0.73, accuracy of 0.77, AUROC of 79.57%, and AUPRC of 76.5%.
Conclusions: The SVM model is potentially useful for predicting the prognosis of IgAN patients with focal crescent formation and without obvious chronic renal lesions.
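The three-fold cross validation described above (two parts for training, one for validation, rotated so that each part validates once) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper `k_fold_indices` is hypothetical, it splits contiguous index ranges, and it does not shuffle or stratify by outcome, which a real analysis of imbalanced clinical data would likely need.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 3) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs; each fold serves exactly once as validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# Nine hypothetical samples split into three folds.
folds = list(k_fold_indices(9, k=3))
for train_idx, val_idx in folds:
    print(len(train_idx), "training samples; validation indices:", val_idx)
```

Metrics such as AUROC and AUPRC would then be computed on each validation fold and averaged across the three rounds.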
Purpose: To investigate the expression of heat shock protein 90α (HSP90α) in patients with lung cancer (LC) and the clinical value of HSP90α and other related markers in the diagnosis of LC. Methods: Of 335 individuals enrolled in the study cohort, 175 were screened for LC and 160 were healthy controls (HC). Plasma levels of HSP90α and related markers (CEA, NSE, CYFRA21-1 and ProGRP) were measured in all individuals by enzyme-linked immunosorbent assay (ELISA). Individuals were grouped separately by gender (male/female), age (≤60 years/>60 years), type of LC (small-cell carcinoma, squamous carcinoma and adenocarcinoma), stage (I, II, III and IV) and metastasis status (metastasis and non-metastasis). The Wilcoxon Mann–Whitney test and the Kruskal–Wallis test were used to compare HSP90α levels between two groups and among multiple groups, respectively, for each factor. The r-value and Kappa were used to compare HSP90α with the related markers, and the receiver operating characteristic (ROC) curve was used to evaluate the efficacy of plasma HSP90α in predicting LC. Results: No statistical difference was found in the plasma level of HSP90α among age and gender groups (p > 0.05). HSP90α levels differed significantly among the groups defined by LC type, stage and metastasis status (p < 0.05). The levels of HSP90α, CEA, NSE, CYFRA21-1 and ProGRP in the LC group were significantly higher than in HC (p < 0.001). HSP90α correlated with the other related markers in the diagnosis of LC (p < 0.05). Although agreement between HSP90α and the other related markers was not satisfactory, the diagnostic positive rate differed significantly between HSP90α and each marker (p < 0.01). ROC analysis showed that a plasma HSP90α cut-off point of 50.02 ng/ml had an optimal predictive value for LC. Conclusions: HSP90α has significant clinical value in early screening and diagnosis of LC.
The combined application of HSP90α and related markers can improve the positive rate of early diagnosis of LC effectively.
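An ROC-derived cut-off such as the one reported above is often chosen by maximizing Youden's J (sensitivity + specificity - 1). The abstract does not say which criterion was used, so the sketch below is an assumption; the function `youden_cutoff` is hypothetical, and the marker values and labels are invented for illustration and are not study data.

```python
from typing import List, Tuple


def youden_cutoff(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A sample is called positive when its value is >= cutoff; labels are 1 (case) or 0 (control).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one case and one control")
    best_cutoff, best_j = values[0], -1.0
    for cutoff in sorted(set(values)):  # every observed value is a candidate threshold
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # on ties, keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical plasma marker levels (ng/ml) with case/control labels.
values = [20.0, 35.0, 41.0, 48.0, 55.0, 62.0, 70.0, 90.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(values, labels)
print(cutoff, j)  # prints: 48.0 0.75
```

Other criteria, such as the point closest to the top-left corner of the ROC plot or a fixed minimum sensitivity, can give a different cut-off on the same data.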