2015
DOI: 10.17706/jcp.10.6.396-405
A Conceptual Framework for Data Quality in Knowledge Discovery Tasks (FDQ-KDT): A Proposal

Abstract: Large volumes of data are growing because organizations continuously capture collective amounts of data for better decision-making processes. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through data mining and data science methodologies. Nevertheless, these methodologies do not tackle data quality issues clearly, leaving out relevant activities. We propose a conceptual framework for data quality in knowledge discovery tasks based on CRISP-DM…

Cited by 15 publications (15 citation statements). References 24 publications.
“…Finally, the authors in [15] built a conceptual framework based on data quality issues mentioned in data mining methodologies such as CRISP-DM [8], SEMMA [39], KDD [7] and the Data Science Process [40]. Subsequently, the same authors [37] designed a data cleaning process in regression models.…”
Section: Data Quality Framework (mentioning)
confidence: 99%
“…The rules were constructed based on literature reviews about data cleaning tasks [15, 52, 90-94]. The most representative rules are explained below.…”
Section: Class Name Class Attributes Instances (mentioning)
confidence: 99%
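The representative rules themselves are not reproduced in this excerpt. Purely as a hypothetical illustration of what a rule-based cleaning step of this kind can look like (the column name, range, and repair choices below are invented, not the rules from [15]), a rule can be expressed as a condition plus a repair action over a pandas DataFrame:

```python
# Hypothetical sketch of rule-based cleaning steps; column name, range, and
# repair choices are invented for illustration and are NOT the rules from [15].
import numpy as np
import pandas as pd

def rule_out_of_range(df: pd.DataFrame, column: str, low: float, high: float) -> pd.DataFrame:
    """If a value falls outside its plausible domain range, mark it as missing."""
    df.loc[~df[column].between(low, high), column] = np.nan
    return df

def rule_missing_numeric(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """If a numeric value is missing, impute it with the column median."""
    df[column] = df[column].fillna(df[column].median())
    return df

# Example: a made-up humidity column assumed to lie in [0, 100] percent.
data = pd.DataFrame({"humidity": [55.0, 61.2, 180.0, None, 48.9]})
data = rule_out_of_range(data, "humidity", 0, 100)   # 180.0 is implausible -> NaN
data = rule_missing_numeric(data, "humidity")        # NaN -> median of valid values
print(data)
```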
“…Rotation forest (Corrales, Ledezma, & Corrales, 2015a) refers to a technique to generate an ensemble of classifiers, in which each base classifier is trained with a different set of extracted attributes. The main heuristic is to apply feature extraction and to subsequently reconstruct a full attribute set for each classifier in the ensemble.…”
Section: Rotation Forest (mentioning)
confidence: 99%
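The quote above summarizes the rotation forest heuristic. The sketch below is a simplified, hypothetical illustration of that idea (random feature subsets rotated by PCA, one decision tree per rotation); it is not the implementation used in (Corrales, Ledezma, & Corrales, 2015a), and it omits the per-rotation class and instance bootstrapping of the full algorithm.

```python
# Simplified rotation-forest-style ensemble (hypothetical sketch): for each tree,
# split the features into random subsets, fit a PCA per subset, and train a
# decision tree on the concatenated (rotated) components. The full algorithm
# additionally bootstraps class/instance subsets before fitting each PCA; that
# step is omitted here for brevity.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

def fit_rotation_ensemble(X, y, n_trees=10, n_subsets=3, seed=0):
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_trees):
        order = rng.permutation(X.shape[1])            # random feature ordering
        subsets = np.array_split(order, n_subsets)     # disjoint feature subsets
        pcas = [PCA().fit(X[:, idx]) for idx in subsets]
        X_rot = np.hstack([p.transform(X[:, idx]) for p, idx in zip(pcas, subsets)])
        tree = DecisionTreeClassifier(random_state=0).fit(X_rot, y)
        ensemble.append((subsets, pcas, tree))
    return ensemble

def predict_rotation_ensemble(ensemble, X):
    # Majority vote across trees; assumes non-negative integer class labels.
    votes = np.array([
        tree.predict(np.hstack([p.transform(X[:, idx]) for p, idx in zip(pcas, subsets)]))
        for subsets, pcas, tree in ensemble
    ])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```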
“…The studies present different approaches to solve issues in data quality such as: heterogeneity, outliers, noise, inconsistency, incompleteness, amount of data, redundancy and timeliness [7][8]. We conduct a systematic review based on methodology [9], for each data quality issues, drawn from four informational sources: IEEE Xplore, Science Direct, Springer Link and Google.…”
Section: Data Quality Issues In Knowledge Discovery Tasks (mentioning)
confidence: 99%
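Several of the issues listed in that quote (incompleteness, redundancy, outliers) are commonly surfaced with simple data profiling. The snippet below is a minimal, hypothetical sketch using pandas, with made-up column names and values; it is not taken from the reviewed studies.

```python
# Minimal profiling sketch for a few of the data quality issues listed above
# (incompleteness, redundancy, outliers); data and column names are invented.
import pandas as pd

df = pd.DataFrame({
    "temperature": [21.5, 22.0, 22.0, 95.0, None],
    "rainfall":    [0.0, 1.2, 1.2, 0.4, 0.9],
})

missing_ratio = df.isna().mean()        # incompleteness per column
duplicate_rows = df.duplicated().sum()  # redundancy (exact duplicate rows)

# Outliers via the interquartile-range rule on a numeric column.
q1, q3 = df["temperature"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["temperature"] < q1 - 1.5 * iqr) | (df["temperature"] > q3 + 1.5 * iqr)]

print(missing_ratio, duplicate_rows, len(outliers), sep="\n")
```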
“…In this paper we present a systematic review for data quality issues in knowledge discovery tasks as: heterogeneity, outliers, noise, inconsistency, incompleteness, amount of data, redundancy and timeliness which are defined in [7][8] and a case study in agricultural diseases: the coffee rust. This paper is organized as follows.…”
Section: Introduction (mentioning)
confidence: 99%