2018
DOI: 10.1007/978-3-030-00063-9_17

A New Metadata Model to Uniformly Handle Heterogeneous Data Lake Sources

Cited by 28 publications (34 citation statements)
References 16 publications
“…We proposed a metadata typology for data lakes. It is based on the object notion, which represents any set of homogeneous data [5], and the typology declines metadata into three categories. Intra-object metadata describe objects through versions, representations and various properties to name a few ; interobject metadata explain how objects are linked together ; and global metadata facilitate and improve data analyses and the use of the data lake in general.…”
Section: First Results and Future Outcomes (mentioning)
confidence: 99%
“…We believe that the analytic atom concept, combined with the object notion [5], will help us propose a metadata system for a data lake efficient enough to manage any type of data, Big Data included. Although several studies have already been carried out on metadata systems for data lakes, and some of them have proven their efficiency [9, 10, 2], we believe we can offer a more complete metadata system that offers all the features we consider essential, and thus completely meets our expectations.…”
Section: The PhD Project (mentioning)
confidence: 99%
“…For example, a textual document can be represented without stopwords or as a bag of words. It is essential in the context of data lakes to at least partially structure unstructured data to allow their automated analysis [5]. Simultaneously storing several representations of the same data notably avoids repeating preprocessings and thus speeds up analyses.…”
Section: Expected Features (mentioning)
confidence: 99%
“…Thus, such features are often integrated together in a provenance tracking module [12,13,27]. Yet, we still consider that they remain different features since they are not systematically proposed together [3,5,26].…”
Section: Expected Features (mentioning)
confidence: 99%