In this article, we present a semantic-based approach to the interoperability issue in the conservation-restoration domain. We first evaluate the context; our observations confirm the critical need for a data integration system that takes advantage of data semantics. Our solution consists of: (1) building a domain-specific ontology, to rely on a unified understanding of conservation-restoration data; (2) mapping the shared ontology to each data source, allowing each participating source to manage its own semantic database, consisting of its original data now associated with the semantic level; and (3) integrating the data of all sources, so that they can be queried in the same homogeneous way. The work presented here was carried out as part of the PARCOURS project, whose aim is to develop an information system able to provide unified access to distinct information sources related to the cultural heritage field in general and to conservation-restoration processes in particular.
Knowledge bases (KBs) such as DBpedia, Wikidata, and YAGO contain a huge number of entities and facts. Several recent works induce rules or calculate statistics on these KBs. Most of these methods are based on the assumption that the data is a representative sample of the studied universe. Unfortunately, KBs are biased because they are built from crowdsourcing and opportunistic agglomeration of available databases. This paper aims at approximating the representativeness of a relation within a knowledge base. For this, we use the generalized Benford's law, which indicates the distribution that the facts of a relation are expected to follow. We then compute the minimum number of facts that have to be added in order to make the KB representative of the real world. Experiments show that our unsupervised method applies to a large number of relations. For numerical relations where ground truths exist, the estimated representativeness proves to be a reliable indicator.
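To make the idea of comparing a relation against a Benford-style distribution concrete, here is a minimal sketch in Python. It is not the paper's method: it assumes the classic first-digit Benford's law rather than the generalized form, and it assumes a simple completion rule in which facts may only be added, never removed. The function names (`benford_probabilities`, `first_digit`, `min_facts_to_add`) are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def benford_probabilities():
    """Classic Benford's law: P(d) = log10(1 + 1/d) for leading digit d in 1..9.

    Assumption: the paper uses a *generalized* Benford's law; the classic
    form is used here only to keep the sketch short.
    """
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """Return the leading significant digit of a nonzero number."""
    # Scientific notation puts the leading significant digit first.
    return int(f"{abs(value):.15e}"[0])


def min_facts_to_add(values):
    """Estimate how many facts are missing under an add-only completion rule.

    Assumed rule (not taken from the paper): find the smallest total size
    whose expected count for every digit is at least the observed count,
    then report the gap to the current size and the resulting ratio.
    """
    counts = Counter(first_digit(v) for v in values if v != 0)
    observed_total = sum(counts.values())
    probs = benford_probabilities()
    # Each digit d requires a total of at least counts[d] / P(d) facts.
    required_total = math.ceil(max(counts[d] / probs[d] for d in range(1, 10)))
    missing = max(0, required_total - observed_total)
    completed_total = observed_total + missing
    ratio = observed_total / completed_total if completed_total else 0.0
    return missing, ratio


if __name__ == "__main__":
    # Hypothetical numeric values of one relation, skewed toward digit 1.
    sample = [1200, 15, 1.7, 180, 19, 2300, 34, 110, 1450, 13]
    missing, ratio = min_facts_to_add(sample)
    print(f"facts to add: {missing}, estimated representativeness: {ratio:.2f}")
```

Under these assumptions, a relation whose values over-represent one leading digit needs many added facts before the other digits reach their expected share, which drives the estimated ratio down.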
As Linked Open Data and the number of semantic web data providers grow rapidly, so does the importance of the following question: how can usable results be obtained, in particular for data mining and data analysis tasks? We propose a query framework equipped with integrity constraints that the user wants verified on the results coming from semantic web data providers. We specify the syntax and semantics of these user quality constraints. We give algorithms for their dynamic verification during query computation, evaluate their performance experimentally, and discuss related work.