Optimizing RDF Storage Removing Redundancies: An Algorithm

Iannone, Luigi; Palmisano, Ignazio; Redavid, Domenico

doi:10.1007/11504894_101

Cited by 16 publications

(9 citation statements)

References 2 publications

“…An RDF graph has semantic redundancy when the information it contains can be represented with fewer triples. Semantic compressors are able to detect this type of redundancy and eliminate extra triples from the original dataset [21]. Then, using inference techniques, the original dataset can be recreated, or at least, a semantically equivalent graph can be obtained.…”

Section: Sources of RDF Redundancies (mentioning)

confidence: 99%

“…These compressors propose different strategies to detect redundant triples (those that could be inferred) and to obtain the canonical subgraphs, which are finally encoded. Initial approaches [21,34] consider the notion of lean subgraph. This concept refers to the smallest instance of the original graph which preserves the ground part of the graph (non-blank nodes and edges connecting them), and maps redundant blank nodes to labels already existing in the graph or to other blank nodes.…”

Section: RDF Compression (mentioning)

confidence: 99%

RDF-TR: Exploiting structural redundancies to boost RDF compression

Hernández-Illera

Martínez‐Prieto

Fernández

2020

Information Sciences
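
The lean-subgraph idea quoted above can be illustrated with a small sketch. It is a simplified illustration, not the algorithm of any cited paper: it maps only one blank node at a time, whereas computing a fully lean graph may require mapping several blank nodes jointly. The function names and the `_:` prefix convention for blank nodes are assumptions made for this example.

```python
def is_blank(term):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return term.startswith("_:")


def remove_redundant_blank_nodes(triples):
    """Drop triples whose blank node can be mapped onto another term.

    A blank node b is treated as redundant when replacing b by some other
    term in every triple that mentions b yields only triples that already
    exist in the rest of the graph. Ground triples are never removed.
    """
    graph = set(triples)
    changed = True
    while changed:
        changed = False
        terms = sorted({t for s, _, o in graph for t in (s, o)})
        blanks = [t for t in terms if is_blank(t)]
        for b in blanks:
            with_b = {tr for tr in graph if b in (tr[0], tr[2])}
            rest = graph - with_b
            for target in terms:
                if target == b:
                    continue
                mapped = {
                    (target if s == b else s, p, target if o == b else o)
                    for s, p, o in with_b
                }
                if mapped <= rest:
                    # Every triple about b is already implied by the rest.
                    graph = rest
                    changed = True
                    break
            if changed:
                break  # recompute terms and blanks on the smaller graph
    return graph


graph = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "_:x"),      # redundant: _:x can map to ex:bob
    ("ex:alice", "ex:knows", "_:y"),
    ("_:y", "ex:name", '"Y"'),            # _:y carries extra information
}
lean = remove_redundant_blank_nodes(graph)
```

In this example `_:x` adds nothing beyond "alice knows someone", which the ground triple already states, so its triple is dropped. `_:y` is kept because no existing node also has the name `"Y"`.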

“…This was because no recommendation, at the time of writing, has been completed by W3C for RDF description querying (SPARQL is at the Working Draft stage of its evolution); thus, different solutions were developed, each one with its own query language and related optimizations. Some members of RDF Data Access Group issued a report in which six query engines were examined aiming to compare different expressive power of the underlying query languages. Actually, many different triple storage strategies are available.…”

Section: Related Work (mentioning)

confidence: 99%

“…The RDFCore component, presented in [2,7], is based on two classes, DescriptionManager and TripleManager, and an interface, RDFEngineInterface.…”

Section: The RDFCore Component (mentioning)

confidence: 99%

REDD: An Algorithm for Redundancy Detection in RDF Models

Esposito

Iannone

Palmisano

et al. 2005

Lecture Notes in Computer Science

Self Cite

Abstract. The base of Semantic Web specifications is Resource Description Framework (RDF) as a standard for expressing metadata. RDF has a simple object model, allowing for easy design of knowledge bases. This implies that the size of knowledge bases can dramatically increase; therefore, it is necessary to take into account both scalability and space consumption when storing such bases. Some theoretical results related to blank node semantics can be exploited in order to design techniques that optimize, among others, space requirements in storing RDF descriptions. We present an algorithm, called REDD, that exploits these theoretical results and optimizes the space used by a RDF description.

Motivation. The realization of the Semantic Web (SW) vision [1] needs ontologies for generating or interpreting (semantic) metadata for resources. It is fundamental to have ontology creation and integration steps in order to share structural knowledge between ontology designers and users. Ontologies are to be expressed in RDF according to SW specifications, using languages such as RDFS and OWL. It is important to note that both RDFS and OWL ontologies can be expressed as RDF graphs, so that ontologies can be treated exactly as other RDF models. In RDF design, the least power principle was applied: data structures are to be kept as simple as possible. This imposes to have very simple basic components, that are URIs, blank nodes and statements (or triples). These design decisions have the drawback that RDF descriptions tend to grow fast as the complexity of the knowledge they represent increases. This observation encourages SW research to investigate toward the most effective storage solutions for RDF knowledge bases, in order to minimize required space. Intuitively, the lesser the number of triples a software (say, a query engine) has to examine, the faster it will process them.

“…The compression is achieved due to a compact form representation rather than a reduction in the number of triples. [13] introduced the notion of a lean graph which is obtained by eliminating triples which contain blank nodes that specify redundant information. [19] proposed a user-specific redundancy elimination technique based on rules.…”

Section: Introduction (mentioning)

confidence: 99%

Logical Linked Data Compression

Joshi

Hitzler

Dong

2013

The Semantic Web: Semantics and Big Data

Abstract. Linked data has experienced accelerated growth in recent years. With the continuing proliferation of structured data, demand for RDF compression is becoming increasingly important. In this study, we introduce a novel lossless compression technique for RDF datasets, called Rule Based Compression (RB Compression) that compresses datasets by generating a set of new logical rules from the dataset and removing triples that can be inferred from these rules. Unlike other compression techniques, our approach not only takes advantage of syntactic verbosity and data redundancy but also utilizes semantic associations present in the RDF graph. Depending on the nature of the dataset, our system is able to prune more than 50% of the original triples without affecting data integrity.
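
The idea of removing triples that can be inferred from rules, then restoring them by inference, can be sketched as follows. This is a minimal illustration under stated assumptions, not the RB Compression system: the rule format (one predicate-object pair implying another for the same subject), the function names, and the example data are all hypothetical, and rule mining is not shown.

```python
def rule_holds(triples, rule):
    # A rule ((ap, ao), (cp, co)) reads: if (s, ap, ao) then (s, cp, co).
    (ap, ao), (cp, co) = rule
    return all((s, cp, co) in triples
               for s, p, o in triples if (p, o) == (ap, ao))


def compress(triples, rules):
    """Remove consequent triples that the given rules can re-derive."""
    original = set(triples)
    kept = set(original)
    for rule in rules:
        antecedent, consequent = rule
        # Skip trivial rules and rules with exceptions, so that
        # decompression restores exactly the original data.
        if antecedent == consequent or not rule_holds(original, rule):
            continue
        subjects = {s for s, p, o in kept if (p, o) == antecedent}
        kept -= {(s, consequent[0], consequent[1]) for s in subjects}
    return kept


def decompress(kept, rules):
    """Re-apply the rules until no new triple is derived."""
    triples = set(kept)
    changed = True
    while changed:
        changed = False
        for (ap, ao), (cp, co) in rules:
            derived = {(s, cp, co) for s, p, o in triples
                       if (p, o) == (ap, ao)}
            if not derived <= triples:
                triples |= derived
                changed = True
    return triples


data = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Student"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:carol", "rdf:type", "ex:Person"),
}
rules = [(("rdf:type", "ex:Student"), ("rdf:type", "ex:Person"))]
small = compress(data, rules)
restored = decompress(small, rules)
```

Here the two `ex:Person` triples for students are dropped and later re-derived, while the one for `ex:carol` stays because no rule implies it. Decompression must use the same rule list that was passed to `compress`.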
