2011
DOI: 10.14778/1952376.1952378

Guided data repair

Abstract: In this paper we present GDR, a Guided Data Repair framework that incorporates user feedback in the cleaning process to enhance and accelerate existing automatic repair techniques while minimizing user involvement. GDR consults the user on the updates that are most likely to be beneficial in improving data quality. GDR also uses machine learning methods to identify and apply the correct updates directly to the database without the actual involvement of the user on these specific updates. To rank potential upda…

Cited by 183 publications (130 citation statements)
References 19 publications
“…3) Model-based 2 (SCARE): This is another model-based repairing approach based on maximizing the correctness likelihood of replacement data given the data distribution, which is modelled using statistical machine learning techniques [12]. 4) Crowd-based (GuidedRepair): We implement the state-of-the-art crowd-based repairing method proposed in [13], which collects feedbacks from users to adaptively refine the training set for a repairing model. We first make a comprehensive comparison on the Precision, Recall and F1 of all the methods at an erroneous ratio of 10% on the two real data sets.…”
Section: Repairing Quality Evaluation (mentioning)
confidence: 99%
“…The third category of solutions are external source based repairing approaches, which leverage the information in reference master data set [5] or user's interaction data such as GuidedRepair [13] and NADEEF [3] for better data cleaning performance. However, the required external information is not always available and thus the methods can not be applied in general scenarios.…”
Section: Related Work (mentioning)
confidence: 99%
“…FICSR [20] obtains user feedback in the form of query result ranking, taking into account matching constraints. Other works that also focus on data inconsistency include [26] and [14]. Our work focuses on improving correspondences at the metadata level, rather than data.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recently, [19] puts forward a system for guiding data repairing that explicitly involves the user in the process of checking the data repairs automatically produced by the algorithms introduced in [5]. In particular, the authors focused on ranking the repairs in such a way that the user effort spent in analyzing useless information is minimized.…”
Section: Related Work (mentioning)
confidence: 99%