Durable top-k search in document archives

U, Leong Hou; Mamoulis, Nikos; Berberich, Klaus; Bedathur, Srikanta

doi:10.1145/1807167.1807228

Cited by 21 publications

References 25 publications

A Comparison of Top-k Temporal Keyword Querying over Versioned Text Collections

Huo, Tsotras (2012), Lecture Notes in Computer Science

Abstract. As the web evolves over time, the number of versioned text collections increases rapidly. Most web search engines answer a query by ranking all known documents at the (current) time the query is posed. There are applications, however (for example, customer behavior analysis and crime investigation), that need to efficiently query these sources as of some past time, that is, retrieve the results as if the user were posing the query at a past time instant, thus accessing data known as of that time. Ranking and searching over versioned documents considers not only keyword constraints but also the time dimension, most commonly a time point or time range of interest. In this paper, we deal with top-k query evaluations with both keyword and temporal constraints over versioned textual documents. In addition to considering previous solutions, we propose novel data organization and indexing solutions: the first one partitions data along ranking positions, while the other maintains the full ranking order through the use of a multiversion ordered list. We present an experimental comparison for both time point and time interval constraints. For time-interval constraints, different querying definitions, such as aggregation functions and consistent top-k queries, are evaluated. Experimental evaluations on large real-world datasets demonstrate the advantages of the newly proposed data organization and indexing approaches.

If a text collection does not retain past documents, then a search query ranks only the documents as of the most current time. If the collection contains versioned documents, a search typically considers each version of a document as a separate document, and the ranking is taken over all documents independently of the document's version (creation time).

Discovering Influential Data Objects over Time

Gkorgkas, Vlachou, Doulkeridis, et al. (2013), Advances in Spatial and Temporal Databases

Abstract. In applications such as market analysis, it is of great interest to product manufacturers to have their products ranked as highly as possible for a significant number of customers. However, customer preferences change over time, and product manufacturers are interested in monitoring the evolution of the popularity of their products, in order to discover those products that are consistently highly ranked. To take into account the temporal dimension, we define the continuous influential query and present algorithms for efficient processing and retrieval of continuous influential data objects. Furthermore, our algorithms support incremental retrieval of the next continuous influential data object in a natural way. To evaluate the performance of our algorithms, we conduct a detailed experimental study for various setups.

Privacy-Preserving Top-k Query Processing in Distributed Systems

Mahboubi, Akbarinia, Valduriez (2019), Transactions on Large-Scale Data- and Knowledge-Centered Systems XLII

We consider a distributed system that stores sensitive user data across multiple nodes. In this context, we address the problem of privacy-preserving top-k query processing. We propose a novel system, called SD-TOPK, which is able to evaluate top-k queries over encrypted distributed data without needing to decrypt the data in the nodes where they are stored. We implemented and evaluated our system over synthetic and real databases. The results show excellent performance for SD-TOPK compared to baseline approaches.
