2017 IEEE International Conference on Big Data (Big Data) 2017
DOI: 10.1109/bigdata.2017.8258204

Uncovering the evolution history of data lakes

Cited by 24 publications
(12 citation statements)
References 11 publications
“…For example, Singh et al show that Bayesian models allow detecting links between data attributes [50]. Similarly, several authors propose algorithms to discover schemas or constraints in semistructured data [3,27,44].…”
Section: Link_document (mentioning)
confidence: 99%
“…For the task of schema evolution management, we use Darwin to initially declare or extract schemas, define schema evolution operations, or extract schema versions and schema evolution operations from legacy data. We have presented these functionalities in Klettke et al [7] and Störl et al [14]. In this present paper, we present a methodology of self-adapting data migration which builds on our demo paper about Mig-Cast [4], which focuses on data migration itself.…”
Section: Architecture (mentioning)
confidence: 99%
“…Thus, to ensure data accessibility, exploration, and exploitation, an efficient and effective metadata system becomes an indispensible component in data lakes (Quix et al, 2016). Yet, most of the research work on data lakes still concentrate on structured data, or semi-structured data only (Farid et al, 2016; Farrugia et al, 2016; Madera and Laurent, 2016; Quix et al, 2016; Klettke et al, 2017). So far, unstructured data have not received enough consideration in the relevant research literature, while more often than not unstructured heterogeneous data occur frequently (Miloslavskaya and Tolstoy, 2016).…”
Section: Related Work (mentioning)
confidence: 99%