2011
DOI: 10.1007/978-3-642-23544-3_11

Support for User Involvement in Data Cleaning

Abstract: Data cleaning and ETL processes are usually modeled as graphs of data transformations. The involvement of the users responsible for executing these graphs over real data is important to tune data transformations and to manually correct data items that cannot be treated automatically. In this paper, in order to better support user involvement in data cleaning processes, we equip a data cleaning graph with data quality constraints to help users identify the points of the graph and the records th…

Cited by 11 publications (4 citation statements)
References: 11 publications
“…Pre-processing is an important step that enhances the quality and produces an image in which minutiae can be detected correctly [2]. A data cleaning graph with data quality constraints is used to help users in identifying the points of the graph, and the records need the attention towards manual data repairs to represent the required feedback to clean data items manually [3]. In this paper, three major data mining methods, namely functional dependency mining, association rule mining and Bagging SVMs for data cleaning are discussed [4].…”
Section: A Back Ground (mentioning)
confidence: 99%
“…By collecting the multi-platform information established by the administrative departments of health and family planning at all levels and the departments of politics, public security, civil affairs, human resources, social security, and the Disabled Persons’ Federation, data collection and analysis are carried out [13, 14]. Classified according to risk factors.…”
Section: Design Of Early Warning Monitoring Model (mentioning)
confidence: 99%
“…Examples in the e-commerce related area include DEXTER [13] and DIADEM [7] that parse e-commerce websites automatically to extract structured product data. There are also examples such as Wrangler [12] and user-involved data cleaning system [9] that leverage human feedback to construct data transformations. In this paper, schema auto generation is not our focus due to the limited scope of the system - it is designed to solve the internal needs of a common data model for one company, which is more suitable for manual design and update.…”
Section: Related Work (mentioning)
confidence: 99%