2018
DOI: 10.3390/app8081325

Hybrid Prediction Model for Type 2 Diabetes and Hypertension Using DBSCAN-Based Outlier Detection, Synthetic Minority Over Sampling Technique (SMOTE), and Random Forest

Abstract: As the risk of diabetes and hypertension increases, machine learning algorithms are being used to improve early-stage diagnosis. This study proposes a Hybrid Prediction Model (HPM) that provides early prediction of type 2 diabetes (T2D) and hypertension based on an individual's input risk factors. The proposed HPM consists of Density-Based Spatial Clustering of Applications with Noise (DBSCAN)-based outlier detection to remove outlier data, Synthetic Minority Over-Sampling Technique (SMO…

citations
Cited by 171 publications
(101 citation statements)
references
References 56 publications
“…The DBSCAN is a density-based clustering algorithm proposed by Sander, J. et al [26] in 1998, which is widely used in the fields of physics [27], computer science [28,29], medicine [30], architecture [31], agriculture [32] and so on. Compared to other clustering methods such as K-means and Gaussian mixtures, the advantages of the DBSCAN method lie in the following aspects: (1) It has better identification capability for abnormal points.…”
Section: Diagnosis Methods (mentioning)
confidence: 99%
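
The "identification capability for abnormal points" noted in the statement above comes from DBSCAN's noise label: any point that is neither a core point nor within reach of one is left unclustered. The following is a minimal pure-Python sketch of that idea, not the cited paper's implementation. The function name `dbscan_labels`, the toy points, and the values of `eps` and `min_pts` are all assumptions chosen for illustration.

```python
import math

def dbscan_labels(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise (treated as outliers)."""
    UNSEEN, NOISE = None, -1
    labels = [UNSEEN] * len(points)

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            j_neighbours = region(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point, keep expanding
    return labels

# Hypothetical toy data: four nearby points and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
labels = dbscan_labels(points, eps=1.5, min_pts=3)
inliers = [p for p, lab in zip(points, labels) if lab != -1]
print(labels)  # [0, 0, 0, 0, -1]
```

Dropping the points labelled -1 is one way to use the clustering result as an outlier filter before training a classifier. In practice `eps` and `min_pts` would need tuning for the dataset at hand.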
“…However, the main limitation of this method is the cold-start problem, in which there is no accurate data generated when the initial input data are limited numbers. The reason is that SMOTE generates the random data in the nearest boundary of acquired data [36]. In small amounts of data, the boundary area is narrowed.…”
Section: Proposed Adaptive Data Boosting Methodologies (mentioning)
confidence: 99%
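
The limitation described in the statement above follows from how SMOTE creates samples: each synthetic point lies on the line segment between a minority sample and one of its nearest minority neighbours, so it cannot leave the region already spanned by the data. Below is a minimal pure-Python sketch of that interpolation step, under the assumption of Euclidean distance. The function name `smote_sketch`, the toy points, and the parameter values are hypothetical, and this is not the code used in any of the cited works.

```python
import math
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-class samples with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
new_points = smote_sketch(minority, n_new=5)
```

Because every synthetic point is a convex combination of two existing samples, a small minority set yields synthetic data confined to a correspondingly small region, which is the narrowing effect the statement refers to.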
“…Fazal proposes a Hybrid Prediction Model (HPM) [19]. This study analyzes a dataset to improve early diagnosis of Type 2 Diabetes and Hypertension.…”
Section: Related Work (mentioning)
confidence: 99%