2015
DOI: 10.1007/978-3-319-24129-6_23

Mapping Large Scale Research Metadata to Linked Data: A Performance Comparison of HBase, CSV and XML

Abstract: OpenAIRE, the Open Access Infrastructure for Research in Europe, comprises a database of all EC FP7 and H2020 funded research projects, including metadata of their results (publications and datasets). These data are stored in an HBase NoSQL database, post-processed, and exposed as HTML for human consumption, and as XML through a web service interface. As an intermediate format to facilitate statistical computations, CSV is generated internally. To interlink the OpenAIRE data with related data on the Web, we ai…

Cited by 12 publications (12 citation statements)
References 10 publications

Citation statements, ordered by relevance:
“…The results, detailed in [19], prove that our XML→RDF approach works on large datasets. However, it needs more computation time than the mapping from HBase because it does not employ a parallel processing strategy.…”
Section: Figure 3: Accumulated Processing Time For the Openaire Dataset
Citation type: supporting
confidence: 54%
“…In the OpenAIRE setting, we compared the performance and maintainability of our XML→RDF approach to mappings from an HBase NoSQL database and from CSV [19]. The results, detailed in [19], prove that our XML→RDF approach works on large datasets.…”
Section: Figure 3: Accumulated Processing Time For the Openaire Dataset
Citation type: mentioning
confidence: 99%
“…The OpenAIRE metadata can be consumed via OAI-PMH, or, in an even more straightforward way, as linked data (cf. our previous work, Vahdati et al [27]). …”
Section: Discussion
Citation type: mentioning
confidence: 74%
“…Similarly, Vahdati et al [15] present a distributed approach for converting research metadata from HBase, CSV and XML formats to RDF. They use the MapReduce paradigm for processing a large volume of the data in parallel over multiple nodes.…”
Section: Related Work
Citation type: mentioning
confidence: 99%