2014 IEEE 30th International Conference on Data Engineering 2014
DOI: 10.1109/icde.2014.6816740

Profiling and mining RDF data with ProLOD++

Abstract: Before reaping the benefits of open data to add value to an organization's internal data, such new, external datasets must be analyzed and understood already at the basic level of data types, constraints, value patterns, etc. Such data profiling, already difficult for large relational data sources, is even more challenging for RDF datasets, the preferred data model for linked open data. We present ProLOD++, a novel tool for various profiling and mining tasks to understand and ultimately improve open RDF data. Pro…

Cited by 52 publications (36 citation statements)
References 15 publications
“…We have also provided experiments that clearly demonstrate the speed-up ensured by our solutions over comparative approaches. This offers exciting possibilities towards the support of RDF graph analysis and mining methodologies (e.g., [80,81]) over large-scale data sets, thanks to the powerful run-time support offered by MapReduce.…”
Section: Discussion (mentioning)
confidence: 99%
“…Another example of clustering in the context of data profiling is ProLOD++, which applies k-means clustering to RDF relations [1]. We refer the reader to surveys by Jain et al. [78] and Xu and Wunsch II [137] for more details on clustering algorithms for relational data.…”
Section: Clustering and Outlier Detection (mentioning)
confidence: 99%
“…Most data cleansing tools can then either transform differently formatted numbers or mark them as improper. [Footnote 1: See Sect. 6 for a more comprehensive list of tools.]…”
Section: Data Profiling: Finding Metadata (mentioning)
confidence: 99%