22nd International Conference on Data Engineering Workshops (ICDEW'06) 2006
DOI: 10.1109/icdew.2006.54

Efficiently Computing Inclusion Dependencies for Schema Discovery

Abstract: Large data integration projects must often cope with undocumented data sources. Schema discovery aims at automatically finding structures in such cases. An important class of relationships between attributes that can be detected automatically are inclusion dependencies (INDs), which provide an excellent basis for guessing foreign key constraints. INDs can be discovered by comparing the sets of distinct values of pairs of attributes. In this paper we present efficient algorithms for finding unary INDs. We first s…
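The abstract's core idea, that a unary IND holds when the distinct values of one attribute are contained in those of another, can be illustrated with a minimal sketch. This is a naive pairwise comparison, not the paper's algorithms; the function name `unary_inds`, the column names, and the decision to ignore `None` values are assumptions made for illustration only.

```python
from itertools import permutations


def unary_inds(columns):
    """Return every ordered pair (dep, ref) whose distinct values satisfy dep ⊆ ref.

    `columns` maps an attribute name to its list of values.
    Assumption: None stands for NULL and is ignored.
    """
    # Reduce each attribute to its set of distinct non-null values.
    distinct = {
        name: {v for v in values if v is not None}
        for name, values in columns.items()
    }
    # Naive approach: test every ordered pair of different attributes.
    # Note that an attribute with no values is trivially included everywhere.
    return [
        (dep, ref)
        for dep, ref in permutations(distinct, 2)
        if distinct[dep] <= distinct[ref]
    ]


# Hypothetical sample data: orders.customer_id looks like a foreign key
# candidate referencing customers.id.
sample = {
    "orders.customer_id": [1, 2, 2, 3],
    "customers.id": [1, 2, 3, 4],
}
print(unary_inds(sample))
```

Testing all ordered pairs is quadratic in the number of attributes, which is why efficient discovery algorithms matter for large schemas.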

Cited by 20 publications (32 citation statements)
References 13 publications
“…Therefore, many studies have addressed the problem of helping users find integrity constraints from an existing data instance. However, most existing techniques address the problem of supporting the discovery of data integrity constraints in the context of relational databases [1] [2]. To the best of our knowledge, only a few papers address the problem of supporting the discovery of data integrity constraints in the Web context.…”
Section: Introduction (mentioning)
confidence: 99%
“…To the best of our knowledge, the complexity of the fastest algorithm to check if a pair (e_i, e_j) is an inclusion is in O(n) for the size of the sets [2] under the assumption that we sort the words in e_i and e_j before the calculation.…”
Section: Strict Comparison (mentioning)
confidence: 99%
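
The linear-time check described in this statement can be sketched as a single merge-style pass over two pre-sorted sequences. This is an illustrative sketch under the statement's own assumption that both inputs are already sorted; the function name `is_included` and its parameters are hypothetical and not taken from the cited work.

```python
def is_included(dep_sorted, ref_sorted):
    """Return True if every value in dep_sorted also occurs in ref_sorted.

    Assumption: both sequences are sorted in ascending order.
    Each loop iteration advances one of the two cursors, so the
    running time is linear in the combined length of the inputs.
    """
    i = j = 0
    while i < len(dep_sorted) and j < len(ref_sorted):
        if dep_sorted[i] == ref_sorted[j]:
            # Match found; keep j in place so duplicates in dep still match.
            i += 1
        elif dep_sorted[i] > ref_sorted[j]:
            # The referenced value is too small; move to the next one.
            j += 1
        else:
            # dep value is smaller than every remaining ref value: missing.
            return False
    # Inclusion holds only if all dependent values were matched.
    return i == len(dep_sorted)


print(is_included(["ann", "bob"], ["ann", "bob", "eve"]))
print(is_included(["ann", "zoe"], ["ann", "bob", "eve"]))
```

Returning as soon as one value is found missing means a failed candidate is often rejected well before either input is fully read.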