2022
DOI: 10.18196/jrc.v3i2.13133

Systematic Review on Missing Data Imputation Techniques with Machine Learning Algorithms for Healthcare

Abstract: Missing data is one of the most common issues encountered in the data cleaning process, especially when dealing with medical datasets. A dataset collected in practice is prone to being incomplete, inconsistent, noisy, and redundant for reasons such as human error, instrument failure, and adverse death. Therefore, to deal accurately with incomplete data, a sophisticated algorithm is proposed to impute the missing values. Many machine learning algorithms have been applied to impute missing data with plausible …

citations
Cited by 24 publications
(19 citation statements)
references
References 81 publications
“…showed the robustness of RF, as it was deemed the most appropriate MVI method in proteomics when put against data with different types of missingness [32]. Furthermore, in a comparison between RF, ensemble ANN and SVM in healthcare data, RF resulted in the highest performance of the three, with ensemble SVM on the opposing end [95].…”
Section: Imputation Methods (citation type: mentioning)
confidence: 99%
“…These methods are fast and easily interpretable, but may lead to low accuracy and biased estimates of the investigated associations [81, 86]. Alternatively, more sophisticated model-based imputation techniques can be used: with these approaches, a predictive model—based, for instance, on regression techniques [87, 88], Artificial Neural Networks [89, 90], or k-Nearest Neighbors [91, 92]—is created to estimate values that will replace missing data [93, 94].…”
Section: Tip 4: Look For Missing Data and Handle Them Properly (citation type: mentioning)
confidence: 99%
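
The contrast drawn in the statement above, between simple fill-in rules and model-based imputation, can be illustrated with a minimal sketch. It assumes a toy numeric table with `None` marking missing cells; the function names `mean_impute` and `knn_impute`, the distance definition, and the data are illustrative assumptions, not taken from the reviewed paper or the citing article.

```python
import math


def mean_impute(rows):
    """Simple baseline: replace each None with the mean of its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def knn_impute(rows, k=2):
    """Model-based alternative: replace each None with the mean of that
    column over the k nearest rows that observed it."""

    def distance(a, b):
        # Root-mean-square difference over the columns both rows observed.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row with column j observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Hypothetical toy data: the last row is missing its second feature.
data = [[1.0, 2.0], [1.1, 2.2], [5.0, 10.0], [1.05, None]]
print("mean:", mean_impute(data)[3][1])
print("kNN :", knn_impute(data, k=2)[3][1])
```

On this toy table the column mean is pulled toward the outlying third row (roughly 4.73), while the two nearest neighbours give a value close to 2.1. This shows how a single fill-in value can bias estimates, though it says nothing about which method performs better on real clinical data.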
“…Data preprocessing with a sampling strategy is used to address class imbalance, either by eliminating some data from the majority class (undersampling) or by adding data, generated or duplicated, to the minority class (oversampling) [5]. Of these two resampling strategies, undersampling has been shown to be a better choice than oversampling.…”
Section: Pendahuluan (Introduction) (citation type: unclassified)
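
The two resampling strategies named in the statement above can be sketched as follows. This is a minimal illustration using only random selection and duplication; the function names, the seed parameter, and the toy labels are assumptions, and the sketch does not reproduce the citing study's method or test its claim about which strategy works better.

```python
import random
from collections import Counter


def _group_by_class(samples, labels):
    """Collect samples into a dict keyed by class label."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return groups


def random_undersample(samples, labels, seed=0):
    """Drop random samples so every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    target = min(len(items) for items in groups.values())
    return [(s, y) for y, items in groups.items() for s in rng.sample(items, target)]


def random_oversample(samples, labels, seed=0):
    """Duplicate random samples so every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    target = max(len(items) for items in groups.values())
    balanced = []
    for y, items in groups.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((s, y) for s in items + extra)
    return balanced


# Hypothetical imbalanced dataset: 8 majority samples, 2 minority samples.
X = list(range(10))
y = [0] * 8 + [1] * 2

under = random_undersample(X, y)
over = random_oversample(X, y)
print(Counter(label for _, label in under))
print(Counter(label for _, label in over))
```

Undersampling discards majority-class information while oversampling by duplication repeats minority samples, so the choice between them depends on the dataset size and the downstream model.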