Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, 2010
DOI: 10.1145/1807167.1807236

Hierarchically organized skew-tolerant histograms for geographic data objects

Abstract: Histograms have been widely used for fast estimation of query result sizes in query optimization. In this paper, we propose a new histogram method, called the Skew-Tolerant Histogram (STHistogram), for two- or three-dimensional geographic data objects that are used in many real-world applications. The proposed method provides significantly enhanced accuracy in a robust manner, even for data sets with a highly skewed distribution. Our method detects hotspots present in various parts of a data s…

Cited by 16 publications (13 citation statements)
References 33 publications
“…We vary the number of histogram buckets from 50 to 250 like most other researchers do [3], [24], [27], [29].…”
Section: Methods | Type: mentioning
confidence: 99%
“…A cardinality estimate which is close to the real cardinality enables the optimizer to accurately estimate the costs of different plans, and to choose a good plan. Therefore, the quality of a histogram is conventionally measured by the error the histogram produces over a series of queries [3], [12], [24], [27], [29]. Given a workload W and histogram H, the Mean Absolute Error is:…”
Section: Methods | Type: mentioning
confidence: 99%
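
The quoted statement above is cut off before its formula, so the citing paper's exact definition is not visible here. As a rough illustration only, the sketch below computes a generic mean absolute error over a workload: the average of the absolute differences between estimated and true result sizes. The names `mean_absolute_error`, `estimate`, and `actual` are hypothetical and chosen for this example; they do not come from the cited work.

```python
from typing import Callable, Sequence, TypeVar

Q = TypeVar("Q")  # a query, in whatever representation the caller uses


def mean_absolute_error(
    workload: Sequence[Q],
    estimate: Callable[[Q], float],
    actual: Callable[[Q], float],
) -> float:
    """Average of |estimated size - true size| over all queries in the workload.

    Assumption: this is the textbook form of mean absolute error; the quoted
    paper may normalize or weight the error differently.
    """
    if not workload:
        raise ValueError("workload must contain at least one query")
    total = sum(abs(estimate(q) - actual(q)) for q in workload)
    return total / len(workload)


if __name__ == "__main__":
    # Toy workload: each "query" is just an integer id.
    queries = [1, 2, 3]
    print(mean_absolute_error(queries, lambda q: q * 10, lambda q: q * 10 + q))
```

Passing the estimator and the ground truth as callables keeps the sketch independent of any particular histogram structure.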
“…Our compression technique can be viewed as a bottom-up approach for building a histogram, as it proceeds by progressively aggregating pairs of tuples (starting from the original ones), and the final aggregate tuples can be viewed as buckets storing aggregate information on the original tuples merged into them. However, it is worth noting that the above-mentioned histogram-construction techniques (as well as more recent proposals [9,16,15,17,23,25,32]) cannot be easily extended to deal with our setting. In fact, these techniques are guided only by the measure values associated with the points that will be aggregated into buckets, and do not take into account any precedence (temporal) relationship between points. This means that they are not able to construct a histogram from which the structure of the processes can be re-composed with no loss.…”
Section: Related Work | Type: mentioning
confidence: 99%
“…Once again, this method uses a rectangular grid as a starting point thus making it dependent on the initial grid resolution. STHist [31] applies the idea of GenHist to 2D and 3D spatial objects. In the basic algorithm, dense regions are determined by applying a sliding window over each dimension, approximating the frequency distribution with a marginal distribution.…”
Section: Related Work | Type: mentioning
confidence: 99%
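
The last statement describes finding dense regions by sliding a window over each dimension and working from marginal distributions. The sketch below is a simplified illustration of that general idea, not the STHist algorithm itself: it bins each coordinate independently, slides a fixed-width window over each marginal count array, and reports the per-dimension interval with the highest count as a candidate hotspot. All names and parameters (`marginal_counts`, `densest_window`, `candidate_hotspot`, `bins`, `window`) are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def marginal_counts(values: Sequence[float], lo: float, hi: float, bins: int) -> List[int]:
    """Equi-width counts of one coordinate, i.e. a marginal frequency distribution."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp v == hi into the last bin
        counts[idx] += 1
    return counts


def densest_window(counts: Sequence[int], window: int) -> Tuple[int, int]:
    """Return (start_bin, total) of the first window with the largest total count."""
    if not 0 < window <= len(counts):
        raise ValueError("window must be between 1 and len(counts)")
    current = sum(counts[:window])
    best_start, best_total = 0, current
    for start in range(1, len(counts) - window + 1):
        # Slide by one bin: add the entering bin, drop the leaving bin.
        current += counts[start + window - 1] - counts[start - 1]
        if current > best_total:
            best_start, best_total = start, current
    return best_start, best_total


def candidate_hotspot(
    points: Sequence[Sequence[float]], lo: float, hi: float, bins: int, window: int
) -> List[Tuple[float, float]]:
    """Per-dimension (low, high) interval of the densest window in each marginal."""
    width = (hi - lo) / bins
    region = []
    for dim in range(len(points[0])):
        counts = marginal_counts([p[dim] for p in points], lo, hi, bins)
        start, _ = densest_window(counts, window)
        region.append((lo + start * width, lo + (start + window) * width))
    return region


if __name__ == "__main__":
    pts = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (8.0, 3.0), (4.0, 9.0)]
    print(candidate_hotspot(pts, lo=0.0, hi=10.0, bins=10, window=2))
```

Because each dimension is scanned separately, the box formed by the per-dimension intervals is only a candidate: dense marginals do not guarantee a dense joint region, which is why the quoted text calls the marginal distribution an approximation.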