Abstract. While current approaches to ontology mapping produce good results by mainly relying on label and structure based similarity measures, there are several cases in which they fail to discover important mappings. In this paper we describe a novel approach to ontology mapping, which is able to avoid this limitation by using background knowledge. Existing approaches relying on background knowledge typically have one or both of two key limitations: 1) they rely on a manually selected reference ontology; 2) they suffer from the noise introduced by the use of semi-structured sources, such as text corpora. Our technique circumvents these limitations by exploiting the increasing amount of semantic resources available online. As a result, there is no need either for a manually selected reference ontology (the relevant ontologies are dynamically selected from an online ontology repository), or for transforming background knowledge in an ontological form. The promising results from experiments on two real life thesauri indicate both that our approach has a high precision and also that it can find mappings, which are typically missed by existing approaches.
Ontology evolution aims at maintaining an ontology up to date with respect to changes in the domain that it models or novel requirements of information systems that it enables. The recent industrial adoption of Semantic Web techniques, which rely on ontologies, has led to the increased importance of the ontology evolution research. Typical approaches to ontology evolution are designed as multiple-stage processes combining techniques from a variety of fields (e.g., natural language processing and reasoning). However, the few existing surveys on this topic lack an in-depth analysis of the various stages of the ontology evolution process. This survey extends the literature by adopting a process-centric view of ontology evolution. Accordingly, we first provide an overall process model synthesized from an overview of the existing models in the literature. Then we survey the major approaches to each of the steps in this process and conclude on future challenges for techniques aiming to solve that particular stage.
Discovering links between overlapping datasets on the Web is generally realised through the use of fuzzy similarity measures. Configuring such measures is often a non-trivial task that depends on the domain, ontological schemas, and formatting conventions in data. Existing solutions either rely on the user's knowledge of the data and the domain or on the use of machine learning to discover these parameters based on training data. In this paper, we present a novel approach to tackle the issue of data linking which relies on the unsupervised discovery of the required similarity parameters. Instead of using labeled data, the method takes into account several desired properties which the distribution of output similarity values should satisfy. The method includes these features into a fitness criterion used in a genetic algorithm to establish similarity parameters that maximise the quality of the resulting linkset according to the considered properties. We show in experiments using benchmarks as well as real-world datasets that such an unsupervised method can reach the same levels of performance as manually engineered methods, and how the different parameters of the genetic algorithm and the fitness criterion affect the results for different datasets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.