2017
DOI: 10.1371/journal.pone.0179805

Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project

Abstract: Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data of 32,555 patients who are…

Cited by 233 publications (164 citation statements)
References 54 publications
“…MLAs have been shown to improve precision in identifying individuals at risk of disease. (5)(6)(7)(8)(9)(10)…”
Section: Advantages Of ML (mentioning)
confidence: 99%
“…Therefore, balancing the training data would improve the classification algorithm performances. Two methods commonly used for solving the class imbalanced problem are the Random under-sampling (RUS) and Synthetic Minority Oversampling Techniques (SMOTE) [71]. Random undersampling methods reduce the majority classes to equal the number of minority classes.…”
Section: Activity Class Imbalanced Distribution (mentioning)
confidence: 99%
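The two balancing strategies named in the statement above can be illustrated with a minimal sketch. It assumes a tiny made-up two-class dataset and uses only the Python standard library; the function names `random_undersample` and `smote_like_oversample` are hypothetical, and the interpolation step is a simplified SMOTE-style procedure, not the implementation used by the cited studies or by any particular library.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly drop samples so every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        for features in rng.sample(rows, target):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def smote_like_oversample(X, y, minority_label, n_new, k=2, seed=0):
    """Add n_new synthetic minority rows, each interpolated between a minority
    row and one of its k nearest minority neighbours (simplified SMOTE idea)."""
    rng = random.Random(seed)
    minority = [f for f, label in zip(X, y) if label == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return X + synthetic, y + [minority_label] * n_new


# Made-up data: six majority rows (label 0) and two minority rows (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.2, 0.4], [0.4, 0.1], [5.0, 5.0], [5.5, 5.2]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_rus, y_rus = random_undersample(X, y)
X_sm, y_sm = smote_like_oversample(X, y, minority_label=1, n_new=4)
print(Counter(y_rus), Counter(y_sm))
```

Undersampling discards majority rows, so it loses information when the minority class is very small; the interpolation approach keeps all original rows but adds points that were never observed. In both cases the resampling would normally be applied to the training split only.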
“…Here, we utilize the SMOTE [25] to increase the minority activity classes following a recent study in human activity recognition [6]. We oversampled the minority classes such as Descending stairs, Jumping and descending stairs to solve the problem related to imbalanced dataset [71]. Figure 2 shows the activity class imbalanced distribution in Dataset 1.…”
Section: Activity Class Imbalanced Distribution (mentioning)
confidence: 99%
“…Machine learning (ML) techniques are increasingly being used to analyze electronic health record data to predict future disease onset or its future course [9][10][11][12][13]. These efforts include prediction of onset and complications of cardiovascular disease [14][15][16][17][18][19][20][21], onset of T2D [22][23][24][25][26], onset of kidney disease [27], as well as prediction of postoperative outcomes [28][29][30][31][32], birth related outcomes [33,34], mortality [15,35,36] and hospital readmissions [37][38][39][40][41][42][43]. However, current approaches typically suffer from a number of limitations.…”
Section: Introduction (mentioning)
confidence: 99%