Proceedings of the 14th International Conference on Extending Database Technology 2011
DOI: 10.1145/1951365.1951412

Synopses for probabilistic data over large domains

Abstract: Many real-world applications produce data with uncertainties drawn from measurements over a continuous domain space. Recent research in the area of probabilistic databases has mainly focused on managing and querying discrete data in which the domain is limited to a small number of values (i.e., on the order of 10). When the size of the domain increases, current methods fail due to their nature of explicitly storing each value/probability pair. Such methods are not capable of extending their use to continuous-va…

Cited by 3 publications (2 citation statements)
References 22 publications

“…Single aggregate, however, can be used only for simple applications [4]. Thus, [17], [46] selected a small set of represented data by clustering the uncertain data as the summarization; clustering points reserved important information of the data, however, when they represented the whole datasets, the precision was low and the errors were not bounded; additionally, they were incapable of trivially inducing the distribution of the datasets. Obtaining the distribution of a dataset is a fundamental issue in traditional data management, such as query analysis or query optimization.…”
Section: B Summarization Of Uncertain Data (mentioning)
confidence: 99%
“…Reference [5] summarized uncertain data streams by computing essential aggregates, but their methods did not apply to continuous domain either. To compute a synopses over continuous-valued datasets with uncertainty, [17] presented methods to find a representative set by clustering the datasets. This synopses, however, were capable of neither representing streaming data nor trivially inducing the distribution of the data.…”
Section: Introduction (mentioning)
confidence: 99%