2017
DOI: 10.1177/0165551516677945

Characterising RDF data sets

Abstract: The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing large-scale RDF information such as RDF stores and reasoning tools, little attention has been given to the structural features emerging in real-world RDF data. Our …

Cited by 17 publications
(32 citation statements)
References 27 publications
“…It is nonetheless true that, given the descriptive character of RDF, there exist predicate repetitions when describing resources of the same nature (e.g., among people). Although the number of predicate combinations (referred to as predicate families) used for subject descriptions theoretically grows with the number of predicates, the number of such combinations is bounded, even in datasets with a light schema [15]. In the following, we formalize the concept of predicate family based on the notion of predicate lists [15].…”
Section: Predicates | Citation type: mentioning
confidence: 99%
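
The idea of a predicate family in the statement above can be illustrated with a minimal sketch. It assumes triples are plain `(subject, predicate, object)` tuples and treats a family as the set of distinct predicates attached to one subject. The function name `predicate_families` and the sample data are hypothetical, not taken from the cited papers.

```python
from collections import Counter, defaultdict


def predicate_families(triples):
    """Map each subject to the sorted tuple of distinct predicates describing it.

    Assumption: a "predicate family" is the set of predicates used with one subject.
    """
    preds_by_subject = defaultdict(set)
    for subject, predicate, _obj in triples:
        preds_by_subject[subject].add(predicate)
    return {s: tuple(sorted(ps)) for s, ps in preds_by_subject.items()}


# Hypothetical sample data: two people and one film.
triples = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "alice@example.org"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:bob", "foaf:mbox", "bob@example.org"),
    ("ex:film1", "ex:title", "Some Film"),
    ("ex:film1", "ex:duration", "120"),
]

families = predicate_families(triples)
# Count how many subjects share each family.
family_counts = Counter(families.values())
print(len(family_counts), "distinct families for", len(families), "subjects")
```

In this toy data set the two people share one family, which is the kind of repetition the statement describes for resources of the same nature.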
“…For example, as previously explained, it would be uncommon to find "clint@eastwood.org" as a value for a film duration, or "Dead Man Walking" as the family name of a person. In fact, it is usual that object values are related to a single predicate [15].…”
Section: Objects | Citation type: mentioning
confidence: 99%
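
One way to check the claim that object values tend to be related to a single predicate is to measure the share of distinct objects that occur with exactly one predicate. The sketch below assumes triples are plain tuples; the function name `single_predicate_object_ratio` and the sample data are hypothetical.

```python
from collections import defaultdict


def single_predicate_object_ratio(triples):
    """Return the fraction of distinct objects that appear with exactly one predicate."""
    preds_by_object = defaultdict(set)
    for _subject, predicate, obj in triples:
        preds_by_object[obj].add(predicate)
    if not preds_by_object:
        return 0.0
    single = sum(1 for ps in preds_by_object.values() if len(ps) == 1)
    return single / len(preds_by_object)


# Hypothetical sample: the literal "120" is used by two different predicates.
sample = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "alice@example.org"),
    ("ex:film1", "ex:duration", "120"),
    ("ex:book1", "ex:pages", "120"),
]

ratio = single_predicate_object_ratio(sample)
print(f"{ratio:.2f} of distinct objects are tied to a single predicate")
```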
“…The literature has found that values in the range of 2 < α < 3 are typical in many real-world networks [15]. The scale-free behaviour also applies to some datasets and measures of RDF datasets [6,8]. However, to reason about whether a distribution follows a power-law can be technically challenging [1], and computing the exponent α, that falls into a certain range of values, is not sufficient.…”
Section: Degree-based Measures | Citation type: mentioning
confidence: 99%
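
For illustration, the exponent α mentioned in the statement above can be estimated with the standard continuous maximum-likelihood formula, α = 1 + n / Σ ln(x_i / x_min). This is a sketch under assumptions: the cutoff `x_min` is supplied by the caller, the continuous formula is only an approximation for discrete degree data, and the function name and sample degrees are hypothetical. As the statement notes, obtaining an α in the typical range does not by itself show that a distribution follows a power law.

```python
import math


def powerlaw_alpha_mle(values, x_min):
    """Estimate the power-law exponent for values >= x_min (continuous MLE).

    Assumption: x_min is chosen by the caller; no goodness-of-fit test is done here.
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no values at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all tail values equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


# Hypothetical degree sequence, used only to show the call.
degrees = [1, 1, 1, 2, 2, 3, 5, 8, 13, 40]
alpha = powerlaw_alpha_mle(degrees, x_min=1)
print(f"estimated alpha: {alpha:.2f}")
```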
“…Examples of statistics are dataset size, property and vocabulary usage, data types used or average length of string literals. In terms of the topology of RDF graphs, previous works report on network measures mainly focusing on in- and out-degree distributions, reciprocity, and path lengths [2,8,9,21]. Nonetheless, the results of these studies are limited to a small fraction of the RDF datasets currently available.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
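
Two of the network measures named in the statement above, in- and out-degree and reciprocity, can be sketched as follows. The sketch assumes each triple contributes one directed edge from subject to object, that parallel edges with different predicates are merged, and that reciprocity is the fraction of edges whose reverse edge also exists (so a self-loop would count as reciprocal). The function name `degree_stats` and the sample data are hypothetical.

```python
from collections import Counter


def degree_stats(triples):
    """Return (in_degree, out_degree, reciprocity) for the subject -> object graph."""
    # Merge parallel edges: keep one edge per (subject, object) pair.
    edges = {(s, o) for s, _p, o in triples}
    out_degree = Counter(s for s, _o in edges)
    in_degree = Counter(o for _s, o in edges)
    reciprocal = sum(1 for (s, o) in edges if (o, s) in edges)
    reciprocity = reciprocal / len(edges) if edges else 0.0
    return in_degree, out_degree, reciprocity


# Hypothetical sample: a and b link to each other, a also links to c.
sample_edges = [
    ("ex:a", "foaf:knows", "ex:b"),
    ("ex:b", "foaf:knows", "ex:a"),
    ("ex:a", "foaf:knows", "ex:c"),
]

in_deg, out_deg, recip = degree_stats(sample_edges)
print(dict(in_deg), dict(out_deg), round(recip, 2))
```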
“…Relatively little is known about the structural properties of the Semantic Web. Early work in this area has observed the presence of power-law distributions and other network-based features, such as clustering coefficient and path lengths, over individual datasets of (up to) millions of triples [4,9,7]. It is not yet known whether the structural properties of the LOD Cloud are the same as the structural properties of individual datasets.…”
Section: Use Cases For Data Science | Citation type: mentioning
confidence: 99%