This work presents MIDAS-RDF, a distributed P2P RDF/S repository that is built on top of a distributed multi-dimensional index structure. MIDAS-RDF features fast retrieval of RDF triples satisfying various pattern queries by translating them into multi-dimensional range queries, which the underlying index can process in a number of hops logarithmic in the number of peers. More importantly, MIDAS-RDF utilizes a labeling scheme to handle expensive transitive closure computations efficiently. This allows for distributed RDFS reasoning in a more scalable way compared to existing methods, as also demonstrated by our extensive experimental study. Furthermore, MIDAS-RDF supports a publish-subscribe model that enables remote peers to selectively subscribe to RDF content.
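As a rough illustration of how a triple pattern can become a multi-dimensional range query, the sketch below maps each of subject, predicate, and object to one axis of a 3-D space. The abstract does not specify the mapping MIDAS-RDF uses, so the hash-based `term_to_coord` function, the `[0, 1]` coordinate range, and all names here are assumptions made for illustration only. A bound term yields a degenerate interval on its axis, and an unbound term yields the full axis.

```python
import hashlib
from typing import Optional, Tuple

Interval = Tuple[float, float]
Box = Tuple[Interval, Interval, Interval]


def term_to_coord(term: str) -> float:
    """Hypothetical mapping of an RDF term to a coordinate in [0, 1]."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def pattern_to_range(s: Optional[str], p: Optional[str], o: Optional[str]) -> Box:
    """Translate a triple pattern (None = unbound variable) into a 3-D box."""
    def axis(term: Optional[str]) -> Interval:
        if term is None:
            return (0.0, 1.0)   # unbound: the whole axis
        c = term_to_coord(term)
        return (c, c)           # bound: a single point on the axis
    return (axis(s), axis(p), axis(o))


def matches(triple: Tuple[str, str, str], box: Box) -> bool:
    """Check whether a triple's 3-D point falls inside the query box."""
    return all(lo <= term_to_coord(t) <= hi for t, (lo, hi) in zip(triple, box))


# The pattern (?s, rdf:type, ?o) constrains only the predicate axis.
box = pattern_to_range(None, "rdf:type", None)
print(matches(("ex:alice", "rdf:type", "ex:Person"), box))
```

Because this toy mapping hashes terms, two distinct terms could in principle collide on the same coordinate, so a real system would still need to verify the candidate triples it retrieves.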
This work presents a purely multidimensional indexing infrastructure for large-scale decentralized networks that operate in extremely dynamic environments where peers join, leave, and fail arbitrarily. We propose a new peer-to-peer variant implementing a virtual distributed k-d tree, and develop efficient algorithms for multidimensional point and range queries. Scalability is enhanced as each peer has only partial knowledge of the network. The most prominent feature of our method is that, in expectation, each peer maintains O(log n) state and requests are resolved in O(log n) hops with respect to the overlay size n. In addition, we provide mechanisms for handling peer failures, improving fault tolerance, and balancing the load of peers. Finally, our work is complemented by an experimental evaluation, in which MIDAS is shown to outperform existing methods in spatial as well as higher-dimensional settings.
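For readers unfamiliar with the underlying structure, the following is a minimal, centralized k-d tree with a range query. It is not the distributed MIDAS protocol: it does not model peers, routing, or partial network knowledge, and the names `Node`, `build`, and `range_query` are chosen here for illustration. It only shows the pruning rule that makes range queries over a k-d tree efficient, and that a point query is a range query whose lower and upper bounds coincide.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                      # splitting point stored at this node
    axis: int                         # dimension used to split at this level
    left: Optional["Node"] = None     # points with coordinate <= split value
    right: Optional["Node"] = None    # points with coordinate >= split value


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting on the median, cycling through axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def range_query(node: Optional[Node], lo: Point, hi: Point,
                out: Optional[List[Point]] = None) -> List[Point]:
    """Collect all points p with lo[i] <= p[i] <= hi[i] on every axis i."""
    if out is None:
        out = []
    if node is None:
        return out
    if all(l <= c <= h for c, l, h in zip(node.point, lo, hi)):
        out.append(node.point)
    split = node.point[node.axis]
    if lo[node.axis] <= split:        # the box may overlap the left half-space
        range_query(node.left, lo, hi, out)
    if hi[node.axis] >= split:        # the box may overlap the right half-space
        range_query(node.right, lo, hi, out)
    return out


tree = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(sorted(range_query(tree, (3, 1), (8, 5))))   # range query
print(range_query(tree, (4, 7), (4, 7)))           # point query
```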
In this paper, we propose efficient algorithms for result diversification over indexed multi-dimensional data. We develop algorithms for a centralized setting, as in a database. Specifically, we rely on widely used multi-dimensional indexes, like the R-tree. In principle, our schemes adopt a maximal marginal relevance (MMR) ranking strategy and leverage interchange and greedy diversification techniques. Until now, mostly combinatorial aspects of this problem have been considered; these require scanning the entire dataset, and existing solutions are therefore costly.
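To make the MMR strategy concrete, here is a minimal sketch of greedy diversification over points in Euclidean space. It is the plain scan-based baseline, not the index-based algorithms the abstract proposes: every step examines all remaining candidates, which is exactly the cost an index such as an R-tree is meant to avoid. The scoring function (negative distance to the query as relevance, minimum distance to the already selected items as diversity) and the trade-off parameter `lam` are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def greedy_mmr(points: Sequence[Point], query: Point, k: int,
               lam: float = 0.5) -> List[Point]:
    """Greedily pick k points, trading relevance against diversity.

    lam = 1.0 ranks purely by closeness to the query; smaller values
    reward candidates that lie far from the points already selected.
    """
    remaining = list(points)
    selected: List[Point] = []
    while remaining and len(selected) < k:
        def score(p: Point) -> float:
            relevance = -math.dist(p, query)
            # Distance to the nearest selected point; 0 when nothing is selected.
            diversity = min((math.dist(p, s) for s in selected), default=0.0)
            return lam * relevance + (1 - lam) * diversity
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
print(greedy_mmr(pts, (0.0, 0.0), k=2, lam=1.0))   # relevance only
print(greedy_mmr(pts, (0.0, 0.0), k=2, lam=0.3))   # favors diversity
```

With `lam=1.0` the two points nearest the query are returned; lowering `lam` causes the second pick to move away from the first.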