Proceedings 2004 VLDB Conference 2004
DOI: 10.1016/b978-012088469-8.50058-9

Top-k Query Evaluation with Probabilistic Guarantees

Abstract: Top-k queries based on ranking elements of multidimensional datasets are a fundamental building block for many kinds of information discovery. The best known general-purpose algorithm for evaluating top-k queries is Fagin's threshold algorithm (TA). Since the user's goal behind top-k queries is to identify one or a few relevant and novel data items, it is intriguing to use approximate variants of TA to reduce run-time costs. This paper introduces a family of approximate top-k algorithms based on probabilistic …

Cited by 122 publications (80 citation statements)
References 22 publications
“…The idea of pruning has been pursued by approximate top-k selection [18] approaches. However, we do not approximate, but only prune those partial results that are guaranteed not to be part of the final top-k results.…”
Section: Early Pruning Of Partial Results (mentioning)
confidence: 99%
“…However, [18] addressed the selection top-k problem, which is different to our top-k join problem. More importantly, we do not rely on probabilistic estimates for pruning, but employ accurate upper bounds.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, E -Upper can be time-consuming under some scenarios where we need to process a relatively large number of documents to lower the upperbound of a tuple that does not belong to the answer. This observation has been made in the topk processing case as well, leading to developing probabilistic algorithms [20]. Unfortunately, existing probabilistic algorithms typically require some apriori knowledge about the score distribution of the tuples.…”
Section: A E-upper Algorithm (mentioning)
confidence: 99%
“…In this paper, we adapted the generic Upper algorithm to our setting as discussed in Section IV-A, and we established the feasibility of our adaptation at processing good(k, ) queries, as discussed in Section VI. For processing top-k algorithms, a variety of probabilistic algorithms have also been explored [20], which exploit some a priori knowledge on the score distribution.…”
Section: Related Work (mentioning)
confidence: 99%