2021
DOI: 10.1093/ajcp/aqab086

A Machine Learning Model to Successfully Predict Future Diagnosis of Chronic Myelogenous Leukemia With Retrospective Electronic Health Records Data

Abstract: Background: Chronic myelogenous leukemia (CML) is a clonal stem cell disorder accounting for 15% of adult leukemias. We aimed to determine if machine learning models could predict CML using blood cell counts prior to diagnosis. Methods: We identified patients with a diagnostic test for CML (BCR-ABL1) and at least 6 consecutive prior years of differential blood cell counts between 1999 and 2020 in the largest integrated health c…

Cited by 19 publications (18 citation statements)
References 22 publications
“…In particular, the ML models could predict CML using blood cell counts prior to diagnosis. These findings indicate that a ML model trained with blood cell counts can lead to diagnosis of CML earlier in the disease course as compared to usual medical care [234]. Other authors have recently developed a leukemia artificial intelligence program (LEAP) using the Extreme Gradient Boosting (XGBoost) decision tree method for the optimal treatment recommendation of tyrosine kinase inhibitors (TKIs) in patients with CML-CP.…”
Section: Bioinformatics and Artificial Intelligence As Methodologies To Decipher Mechanisms Of Action Or Resistance To TKIs In Leukemia
Citation type: mentioning
confidence: 97%
“…A recent study that also used ML models was able to predict future diagnosis of CML based on the analysis of data from retrospective electronic health records [234]. In particular, the ML models could predict CML using blood cell counts prior to diagnosis.…”
Section: Bioinformatics and Artificial Intelligence As Methodologies To Decipher Mechanisms Of Action Or Resistance To TKIs In Leukemia
Citation type: mentioning
confidence: 99%
“…In the model selection process, 103 clinical variables were used to train 4 classification algorithms, random forest (RF), decision tree, support vector machine (SVM), and linear regression, to distinguish relapses from nonrelapses in the 3 clinically predefined risk categories: standard-, intermediate-, and high-risk levels. While Pan et al [ 131 ] built a model to predict disease relapse, Hauser et al [ 144 ] studied the possibility of predicting CML prior to diagnosis using only CBC test results and ML algorithms such as XGBoost and LASSO algorithms on 1623 patients with a definitive CML status. The variables used in the study included laboratory CBC test results, patient demographic features such as their age and gender, and patient encounter information (the number of patient visits to outpatient clinics, etc.).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…However, the suggested approach is considered insufficient for internal validation that requires at least 50 repeats [ 145 ]. By contrast, the chosen data set in [ 144 ] was divided into 2 distinct groups: train/validation and test groups. While the latter split-sample validation approach seems reasonable and justifiable to use in this case given the large sample size, potential drawbacks may arise, and several aspects still need attention throughout the application.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
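
The split-sample validation mentioned in the statement above can be illustrated with a minimal sketch. It is not the cited study's code: the function name `split_sample`, the 20% test fraction, the seed, and the integer patient identifiers are all assumptions made for illustration. Only the cohort size of 1623 comes from the quoted text.

```python
import random


def split_sample(records, test_fraction=0.2, seed=0):
    """Shuffle records and split them into train/validation and test groups.

    Hypothetical helper: the fraction and seed are illustrative choices,
    not values reported by the cited study.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The test group is held out once and never used for model fitting.
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder identifiers standing in for the 1623 patients quoted above.
patient_ids = list(range(1623))
train_val, test = split_sample(patient_ids)
```

Because the test group is drawn only once, the resulting performance estimate depends on that single partition, which is the kind of drawback the statement alludes to.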