2019
DOI: 10.12928/telkomnika.v17i4.12780

AUTO-CDD: automatic cleaning dirty data using machine learning techniques

Abstract: Cleaning dirty data has been critically important for many years, especially in the medical sector, which is why research in this area keeps widening. To initiate the research, a comparison between currently used functions for handling missing values and Auto-CDD is presented. The developed system will, first, guarantee that unwanted outcomes in the data analytics process are overcome; second, it will improve overall data processing. Our motivation is to create an intelligent tool that will automatic…

Citations: cited by 6 publications (3 citation statements)
References: 18 publications
“…The evaluation on the basis of Accuracy (ACC) value is executed. Accuracy measures the degree to which the instances correctly classified by machine learning algorithm and can be computed using a confusion matrix with (5) as follows [23]:…”
Section: Results and Analysis (mentioning)
confidence: 99%
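The citing paper's equation (5) is elided in the statement above, so it is not reproduced here. As a rough illustration only, the sketch below computes accuracy from the four cells of a binary confusion matrix using the common textbook definition, (TP + TN) / (TP + TN + FP + FN). The function name `accuracy_from_confusion` and its parameter names are hypothetical and do not come from the cited work.

```python
def accuracy_from_confusion(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy from binary confusion-matrix counts.

    Assumption: the common textbook definition, not necessarily the exact
    form of equation (5) in the citing paper.
    """
    total = tp + tn + fp + fn
    if total == 0:
        # No instances were classified, so accuracy is undefined.
        raise ValueError("confusion matrix is empty")
    # Correctly classified instances divided by all instances.
    return (tp + tn) / total


if __name__ == "__main__":
    # Made-up illustrative counts, not results from any paper.
    print(accuracy_from_confusion(tp=50, tn=40, fp=5, fn=5))
```

The counts in the usage line are invented purely to exercise the function; they do not correspond to any reported experiment.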
“…Jesmeen et al 23 presented a comparison between currently used algorithms and their proposed tool, Auto-CDD, to handle missing values. The developed system improved overall data processing and guaranteed to overcome processing unwanted outcomes in data analysis.…”
Section: Related Work (mentioning)
confidence: 99%
“…It was proved by different resources about the loss of billions of dollars due to poor DQ [11], [12]. Low-level DQ can cause due to wrong or missing data and is very essential to handle this type of dataset [13]. It may lead to incorrect or misleading decisions, predictions, or instructions.…”
Section: Introduction (mentioning)
confidence: 99%