1999
DOI: 10.1145/304181.304203

A comparison of selectivity estimators for range queries on metric attributes

Abstract: In this paper, we present a comparison of nonparametric estimation methods for computing approximations of the selectivities of queries, in particular range queries. In contrast to previous studies, the focus of our comparison is on metric attributes with large domains which occur for example in spatial and temporal databases. We also assume that only small sample sets of the required relations are available for estimating the selectivity. In addition to the popular histogram estimators, our comparison include…

Cited by 32 publications (24 citation statements)
References 16 publications
“…There has also been an interesting effort that introduces the use of kernel estimation into the 1-dimensional histogram world [10] to deal specifically with real-valued data. Roughly, it suggests choosing the points of considerable change in the probability density function as the bucket boundaries (in a spirit similar to the maxdiff partition constraint) and then applying the traditional kernel estimation method for approximating the values within each bucket.…”
Section: Value Approximation Within Each Bucket (mentioning)
confidence: 99%
“…One can prove that selection of a particular kernel function is not critical in most practical applications as all of them guarantee obtaining similar results (Wand and Jones, 1995). However, in some applications, like, e.g., query selectivity estimation (Blohsfeld et al, 1999), kernels with finite support may be more adequate. On the other hand, the bandwidth is the parameter which exhibits a strong influence on the resulting estimate.…”
Section: 1 (mentioning)
confidence: 99%
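To make the point about finite-support kernels and bandwidth concrete, here is a minimal sketch of a kernel-based selectivity estimate for a range query. It is an illustration under stated assumptions, not the method of the cited paper: the Epanechnikov kernel is chosen here only as a familiar finite-support kernel, boundary effects at the edges of the attribute domain are ignored, and the names `epanechnikov_cdf`, `kernel_selectivity`, `sample`, and `h` are hypothetical.

```python
def epanechnikov_cdf(u):
    """CDF of the Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1]."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    # Integral of 0.75 * (1 - t^2) from -1 to u.
    return 0.5 + 0.75 * u - 0.25 * u ** 3


def kernel_selectivity(sample, a, b, h):
    """Estimate the fraction of tuples whose value lies in [a, b].

    sample: values drawn from the attribute (assumed to be a random sample)
    h:      bandwidth (assumed to be chosen by the caller)
    """
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    if not sample:
        raise ValueError("sample must be non-empty")
    total = 0.0
    for x in sample:
        # Mass of the kernel centred at x that falls inside [a, b].
        total += epanechnikov_cdf((b - x) / h) - epanechnikov_cdf((a - x) / h)
    return total / len(sample)
```

Because the kernel has finite support, a sample point farther than `h` from the query range contributes exactly zero, so the estimate depends only on nearby sample values. With a single sample point at 0 and `h = 1`, for example, the range [0, 1] receives an estimate of 0.5. Changing `h` changes how far each sample point spreads its mass, which is one way to see why the bandwidth strongly influences the result.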
“…Among the huge amount of applications of PDFs there are important ones related to databases and data exploration. They can be successfully used, for example, in Approximate Query Processing (AQP) (Gramacki et al, 2010), query selectivity estimation (Blohsfeld et al, 1999) and clustering (Kulczycki and Charytanowicz, 2010).…”
Section: Introduction (mentioning)
confidence: 99%
“…The reason for this phenomenon is the fact that the thresholding scheme employed in [3] minimizes the overall mean squared error for each relation, which minimizes the error regarding range selection queries, but disregards accurate join estimation.…”
Section: Join Synopses (mentioning)
confidence: 99%
“…This broad importance of statistics management has led to a plethora of approximation techniques, for which [11] have coined the general term "data synopses": advanced forms of histograms [24,12,16], spline synopses [18,19], sampling [5,13,10], and parametric curve-fitting techniques [27,7] all the way to highly sophisticated methods based on kernel estimators [2] or Wavelets and other transforms [22,21,3]. However, most of these techniques take the local viewpoint of optimizing the approximation error for a single data distribution such as one database table with preselected relevant attributes.…”
Section: Introduction (mentioning)
confidence: 99%