2011
DOI: 10.1145/2000824.2000828

The Monte Carlo database system

Abstract: The application of stochastic models and analysis techniques to large datasets is now commonplace. Unfortunately, in practice this usually means extracting data from a database system into an external tool (such as SAS, R, Arena, or Matlab), and then running the analysis there. This extract-and-model paradigm is typically error-prone, slow, does not support fine-grained modeling, and discourages what-if and sensitivity analyses. In this article we describe MCDB, a database system that permits a wide …

Citations: cited by 35 publications (5 citation statements)
References: 34 publications

“…Unfortunately, generalizing from discrete to continuous distributions usually comes with substantial mathematical overhead. While several systems [2,24,35] handle continuous probability distributions, only recently [21,22], Grohe and Lindner proposed a general framework for rigorously dealing with probabilistic databases over continuous domains. Moreover, they establish basic properties such as the measurability of relational calculus and Datalog queries, which in turn allows for formally specifying the semantics of queries over continuous probabilistic databases.…”
Section: Introduction (mentioning)
confidence: 99%
“…Finally, we mention MCDB [24] and its successor SimSQL [6]. Here, users are able to specify probabilistic models in the shape of random database instances.…”
Section: Introduction (mentioning)
confidence: 99%
“…Cambronero et al. [13] integrate probabilities into a relational database system to support imputation, while Hilprecht et al. [39] use probabilistic circuits to improve query performance. Jampani et al. [42] use probabilistic databases to support random data generation and simulation. Cai et al. [12] provide Gibbs sampling support in the space of database tables to a SQL-like language, enabling Bayesian machine learning workloads such as linear regression or latent Dirichlet allocation.…”
Section: Related Work (mentioning)
confidence: 99%
“…For these models, this usually amounts to a program transformation [73], or e.g., a costly matrix inversion [49]. Likewise, for both exact and … random variables x_n converges to x, then for every continuous function f, f(x_n) converges to f(x).…”
Section: :30 (mentioning)
confidence: 99%
“…In-database techniques are used for exact evaluation of tractable queries in tuple-independent probabilistic databases, e.g., using safe plans [6] as discussed below, and also for approximate evaluation of hard queries, e.g., computing lower and upper bounds on answer probabilities via dissociation of input probabilistic events [14] or running Monte Carlo simulations that aggregate the query answers over several possible worlds sampled from complex probabilistic models [20].…”
Section: In-database Techniques (mentioning)
confidence: 99%