Peer Data Management Systems (PDMSs) are advanced P2P applications in which each peer represents an autonomous data source that makes an exported schema available for sharing with other peers. Query answering in PDMSs can be improved if peers are arranged in the overlay network according to the similarity of their content. The set of peers can be partitioned into clusters so that the semantic similarity among the peers in the same cluster is maximal. The creation and maintenance of clusters is a challenging problem at the current stage of PDMS development. This work proposes an incremental peer clustering process. The authors present a PDMS architecture designed to facilitate the connection of new peers according to their exported schema, which is described by an ontology. They propose a clustering process and its underlying algorithm, and they present and discuss experimental results on peer clustering using the approach.
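The incremental idea in this abstract can be pictured with a minimal sketch. It is not the authors' algorithm: the abstract does not state the similarity measure, the cluster representation, or any threshold. The sketch assumes that each exported schema is reduced to a set of ontology concept names, that similarity is the Jaccard coefficient, and that a new peer opens its own cluster when no existing cluster reaches a hypothetical `threshold`. The names `jaccard` and `add_peer` are illustrative only.

```python
def jaccard(a, b):
    """Jaccard coefficient of two concept sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def add_peer(clusters, peer_id, concepts, threshold=0.5):
    """Attach one arriving peer to the most similar cluster, or open a new one.

    clusters: list of dicts with keys "peers" (list) and "concepts" (set).
    threshold: hypothetical cut-off, not taken from the abstract.
    Returns the index of the cluster that received the peer.
    """
    best_index, best_score = None, 0.0
    for i, cluster in enumerate(clusters):
        score = jaccard(concepts, cluster["concepts"])
        if score > best_score:
            best_index, best_score = i, score

    if best_index is not None and best_score >= threshold:
        # Similar enough: join the cluster and widen its concept summary.
        clusters[best_index]["peers"].append(peer_id)
        clusters[best_index]["concepts"] |= concepts
        return best_index

    # No cluster is similar enough: the peer starts a new cluster.
    clusters.append({"peers": [peer_id], "concepts": set(concepts)})
    return len(clusters) - 1
```

Because peers are placed one at a time against the existing clusters, nothing is recomputed from scratch when a peer connects, which is the sense of "incremental" assumed here.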
Multi-source information systems, such as data warehouse systems, involve heterogeneous sources. In this paper, we deal with the semantic heterogeneity of the data instances. Problems may occur when confronting sources whenever different levels of denomination have been used for the same value, e.g. "vermilion" in one source and "red" in another. We propose to manage this semantic heterogeneity by using a linguistic dictionary. "Semantic operators" allow linguistic flexibility in the queries, e.g. two tuples with the values "red" and "vermilion" could match in a semantic join on the "color" attribute. A particularity of our approach is that it states the scope of the flexibility by defining classes of equivalent values by means of "priority nodes". These nodes are used as parameters that allow the user to define the scope of the flexibility in a very natural manner, without specifying any distance.
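The semantic join mentioned in this abstract can be illustrated with a small sketch. It rests on assumptions that the abstract does not confirm: the linguistic dictionary is modeled as a toy mapping from each term to a more general term, and a priority node is read as the term that heads a class of equivalent values, so two values match when climbing the hierarchy leads them to the same priority node. The names `HYPERNYM`, `PRIORITY_NODES`, `equivalence_class`, and `semantic_join` are hypothetical.

```python
# Hypothetical toy dictionary: each term maps to its more general term.
HYPERNYM = {
    "vermilion": "red",
    "crimson": "red",
    "navy": "blue",
    "red": "color",
    "blue": "color",
}

# Hypothetical priority nodes chosen by the user; each heads one class.
PRIORITY_NODES = {"red", "blue"}


def equivalence_class(value):
    """Climb from value toward the root and return the first priority node.

    A value with no priority node above it forms a class of its own.
    """
    node = value
    while node is not None:
        if node in PRIORITY_NODES:
            return node
        node = HYPERNYM.get(node)
    return value


def semantic_join(left, right, attr):
    """Pair rows whose attr values fall into the same equivalence class."""
    index = {}
    for row in right:
        index.setdefault(equivalence_class(row[attr]), []).append(row)
    return [
        (l_row, r_row)
        for l_row in left
        for r_row in index.get(equivalence_class(l_row[attr]), [])
    ]
```

In this reading the user widens or narrows the flexibility only by choosing priority nodes: making "color" the priority node instead of "red" and "blue" would let "vermilion" match "navy" as well, and no numeric distance is ever specified.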