Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
DOI: 10.1145/2463676.2463702

Towards high-throughput Gibbs sampling at scale

Abstract: Factor graphs and Gibbs sampling are a popular combination for Bayesian statistical methods that are used to solve diverse problems including insurance risk models, pricing models, and information extraction. Given a fixed sampling method and a fixed amount of time, an implementation of a sampler that achieves a higher throughput of samples will produce higher-quality results than a lower-throughput sampler. We study how (and whether) traditional data processing choices about materialization, page layout, and buffer…
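As a point of reference for the abstract above, here is a minimal sketch (not from the paper; the variables, factors, and weights are made up) of Gibbs sampling over a small log-linear factor graph. One sweep resamples every variable conditioned on the factors touching it, and throughput is simply sweeps per second.

```python
import math
import random

# Toy log-linear factor graph (hypothetical variables, factors, and weights).
# Each factor is (variable_ids, weight, feature_fn over the listed variables).
variables = {0: 1, 1: 0, 2: 1}   # current 0/1 assignment of each variable
factors = [
    ([0, 1], 1.5, lambda a: 1.0 if a[0] == a[1] else 0.0),   # "agreement" factor
    ([1, 2], 0.8, lambda a: float(a[0] and a[1])),           # "and" factor
    ([2],    2.0, lambda a: float(a[0])),                    # prior on variable 2
]

# Edges of the factor graph: variable id -> factors that touch it.
touching = {v: [f for f in factors if v in f[0]] for v in variables}

def prob_true(v):
    """P(v = 1 | all other variables), computed from the factors touching v."""
    energy = []
    for value in (0, 1):
        e = 0.0
        for vids, w, feat in touching[v]:
            assignment = [value if u == v else variables[u] for u in vids]
            e += w * feat(assignment)
        energy.append(e)
    return math.exp(energy[1]) / (math.exp(energy[0]) + math.exp(energy[1]))

def gibbs_sweep():
    """One sweep resamples every variable once; throughput = sweeps per second."""
    for v in variables:
        variables[v] = 1 if random.random() < prob_true(v) else 0

for _ in range(1000):
    gibbs_sweep()
print(variables)
```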

Cited by 42 publications (36 citation statements)
References 37 publications
“…DimmWitted [55], the statistical inference and learning engine in DeepDive, is built upon our research of how to design a high-performance statistical inference and learning engine on a single machine [29,41,54,55]. DimmWitted models Gibbs sampling as a “column-to-row access” operation: each row corresponds to one factor, each column to one variable, and the non-zero elements in the matrix correspond to edges in the factor graph.…”
Section: System Infrastructure
confidence: 99%
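The “column-to-row access” pattern quoted above can be illustrated with a small sketch; the data layout below is hypothetical and stands in for whatever storage DimmWitted actually uses. Resampling a variable first reads its column to find the incident factors, then reads those rows to evaluate them.

```python
from collections import defaultdict

# Sparse factor-graph matrix kept under two access paths (illustrative only):
# row_index:    factor id   -> variable ids in that factor (a row)
# column_index: variable id -> factor ids it appears in    (a column)
row_index = {0: [0, 1], 1: [1, 2], 2: [2]}
column_index = defaultdict(list)
for fid, vids in row_index.items():
    for vid in vids:
        column_index[vid].append(fid)

def column_to_row_access(vid):
    """To resample variable `vid`: read its column to get the incident factors,
    then read each of those rows to fetch the factor's full variable set."""
    incident_factors = column_index[vid]                        # column access
    return [(fid, row_index[fid]) for fid in incident_factors]  # row accesses

print(column_to_row_access(1))  # -> [(0, [0, 1]), (1, [1, 2])]
```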
“…DeepDive's model of KBC is motivated by recent attempts to use machine learning-based techniques for KBC [3,4,24,38,46,52,56] and by the line of research that aims to improve the quality of a specific component of a KBC system [7,12,15,21,26,27,31–33,35,39,42,47,48,51,53,54]. When designing DeepDive, we used these systems as test cases to justify the generality of our framework.…”
Section: Related Work
confidence: 99%
“…Each tuple (I1, I2, I3, w) represents a weighted ground rule I1 ← I2, I3. I1 is the head; I2 and I3 are the body and are allowed to be NULL for factors of size 1 or 2. This representation can be input to probabilistic inference engines, e.g., [29,56]. Moreover, since it records the causal relationships among facts, it contains the entire lineage and can be queried [52].…”
Section: Factor Graphs
confidence: 99%
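The (I1, I2, I3, w) tuples quoted above suggest a simple flat encoding of weighted ground rules. The sketch below is illustrative only: the field names follow the quoted description, while the example facts and weights are hypothetical.

```python
from typing import NamedTuple, Optional

class GroundRule(NamedTuple):
    """One weighted ground rule I1 <- I2, I3, stored as a flat tuple.
    I2 and/or I3 are None (NULL) for factors of size 1 or 2."""
    i1: str            # head fact
    i2: Optional[str]  # first body fact, or None
    i3: Optional[str]  # second body fact, or None
    w: float           # rule weight

rules = [
    GroundRule("Smokes(alice)", None, None, 1.2),                          # size-1 factor (prior)
    GroundRule("Cancer(alice)", "Smokes(alice)", None, 0.9),               # size-2 factor
    GroundRule("Smokes(bob)", "Smokes(alice)", "Friends(alice,bob)", 0.5), # size-3 factor
]

def factor_size(r: GroundRule) -> int:
    return 1 + sum(x is not None for x in (r.i2, r.i3))

for r in rules:
    print(factor_size(r), r)
```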
“…During grounding, the database optimizes and executes the stored procedures and generates a factor graph in relational format. Existing inference engines, e.g., Gibbs [56], GraphLab [29], can then be used to perform probabilistic inference over the resulting factor graph.…”
Section: Introduction
confidence: 99%
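The grounding-then-inference hand-off described in this quote can be sketched with a toy relational example. The table names, rule, and weight below are hypothetical; the point is only that grounding is a query whose output is a factor graph in relational form, ready to hand to a sampler.

```python
import sqlite3

# Hypothetical schema: ground a rule "Cancer(p) <- Smokes(p)" by joining base
# tables and writing a relational factor table an inference engine could read.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE smokes(person TEXT, var_id INTEGER);
    CREATE TABLE cancer(person TEXT, var_id INTEGER);
    INSERT INTO smokes VALUES ('alice', 0), ('bob', 1);
    INSERT INTO cancer VALUES ('alice', 2), ('bob', 3);
    CREATE TABLE factors(head_var INTEGER, body_var INTEGER, weight REAL);
    -- Grounding: one factor per person linking Smokes(p) to Cancer(p).
    INSERT INTO factors
    SELECT c.var_id, s.var_id, 0.9
    FROM smokes s JOIN cancer c ON s.person = c.person;
""")
factor_graph = conn.execute("SELECT head_var, body_var, weight FROM factors").fetchall()
print(factor_graph)  # [(2, 0, 0.9), (3, 1, 0.9)] -- relational factor graph
```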
“…Essentially, every tuple in the database or result of a query is a random variable (node) in this factor graph. The inference phase takes the factor graph from grounding and performs statistical inference using standard techniques, e.g., Gibbs sampling [42,44]. The output of inference is the marginal probability of every tuple in the database.…”
Section: Introduction
confidence: 99%
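Finally, the marginal probabilities mentioned in this quote are obtained from Gibbs samples as empirical frequencies after a burn-in period. The sketch below assumes a hypothetical sampler interface that yields one full assignment per sweep.

```python
from collections import Counter

def estimate_marginals(sample_stream, num_vars, burn_in=100):
    """Empirical marginals P(X_v = 1) from a stream of Gibbs samples.
    `sample_stream` yields dicts {var_id: 0/1}; the first `burn_in` are discarded."""
    counts = Counter()
    kept = 0
    for i, sample in enumerate(sample_stream):
        if i < burn_in:
            continue
        kept += 1
        for v, value in sample.items():
            counts[v] += value
    return {v: counts[v] / kept for v in range(num_vars)}

# Usage with a toy stream of pre-drawn samples (normally these would come from
# repeated sweeps of a sampler, e.g. the gibbs_sweep sketch earlier on this page):
toy_samples = [{0: 1, 1: 0}, {0: 1, 1: 1}, {0: 0, 1: 1}, {0: 1, 1: 1}]
print(estimate_marginals(iter(toy_samples), num_vars=2, burn_in=0))
# -> {0: 0.75, 1: 0.75}
```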