2019
DOI: 10.18637/jss.v091.i03

sgmcmc: An R Package for Stochastic Gradient Markov Chain Monte Carlo

Abstract: This paper introduces the R package sgmcmc, which can be used for Bayesian inference on problems with large datasets using stochastic gradient Markov chain Monte Carlo (SGMCMC). Traditional Markov chain Monte Carlo (MCMC) methods, such as Metropolis-Hastings, are known to run prohibitively slowly as the dataset size increases. SGMCMC solves this issue by using only a subset of the data at each iteration. SGMCMC requires calculating gradients of the log likelihood and log priors, which can be time consuming and err…

Cited by 10 publications (16 citation statements); references 6 publications.
“…This simple argument suggests that, for the same level of accuracy, we can reduce the computational cost of SGLD by O(N) if we use control variates. This is supported by a number of theoretical results (e.g., Nagapetyan et al. 2017; Brosse, Durmus, and Moulines 2018; Baker et al. 2019a) which show that, if we ignore the preprocessing cost of finding θ̂, the computational cost per effective sample of SGLD with control variates is O(1), rather than the O(N) of SGLD with the simple gradient estimator (4).…”
Section: Estimating the Gradient (mentioning)
confidence: 64%
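
As a concrete illustration of the comparison in the quote above, the following is a minimal R sketch of the simple minibatch gradient estimator referred to as (4): a size-n minibatch is rescaled by N/n so that the estimate is unbiased for the full-data gradient. All function and argument names here (grad_U_i, grad_U_0, and so on) are illustrative assumptions, not the sgmcmc API.

simple_grad_estimate <- function(theta, data, grad_U_i, grad_U_0, n) {
  # grad_U_0: gradient of the log-prior term; grad_U_i: gradient of the
  # i-th log-likelihood term. Both are assumed to be supplied by the user.
  N <- nrow(data)
  idx <- sample.int(N, n)        # draw the minibatch S, with |S| = n
  g <- grad_U_0(theta)
  for (i in idx) {
    g <- g + (N / n) * grad_U_i(theta, data[i, ])  # rescale for unbiasedness
  }
  g
}

The N/n rescaling keeps the estimator unbiased, but its variance grows with N, which is what drives the O(N) cost per effective sample mentioned in the quote.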
“…The intuition behind this idea is that if each u_i(θ) ≈ ∇U_i(θ), then this estimator can have a much smaller variance. Recent works, for example Baker et al. (2019a) and Huggins and Zou (2017) (see Bardenet, Doucet, and Holmes 2017; Bierkens, Fearnhead, and Roberts 2019; Pollock et al. 2020 for similar ideas used in different Monte Carlo procedures), have implemented this control variate technique with each u_i(θ) set as a constant. These approaches propose (i) using SGD to find an approximation to the mode of the distribution we are sampling from, which we denote as θ̂; and (ii) setting u_i(θ) = ∇U_i(θ̂).…”
Section: Estimating the Gradient (mentioning)
confidence: 99%
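
The constant-u_i construction described above translates directly into code. Below is a minimal R sketch, assuming the mode approximation θ̂ (theta_hat) has already been found by SGD and the full-data gradient at θ̂ has been precomputed once; again, all names are illustrative assumptions rather than the package's interface.

cv_grad_estimate <- function(theta, theta_hat, grad_at_hat, data, grad_U_i, n) {
  # grad_at_hat: the full-data gradient at theta_hat (including the prior
  # term), computed once in an O(N) preprocessing step and then reused.
  N <- nrow(data)
  idx <- sample.int(N, n)        # minibatch S
  g <- grad_at_hat
  for (i in idx) {
    g <- g + (N / n) * (grad_U_i(theta, data[i, ]) -
                        grad_U_i(theta_hat, data[i, ]))
  }
  g  # still unbiased; the variance shrinks as theta approaches theta_hat
}

Because the sampler concentrates near theta_hat for large N, the differences grad_U_i(theta, ·) − grad_U_i(theta_hat, ·) are typically small, which is the variance reduction behind the O(1) result quoted earlier.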
“…There is a variety of software available implementing these methods (Tran et al. 2016; Baker et al. 2016). In particular, Baker et al. (2016) implement the control variate methodology we discuss in this article. This paper investigates stochastic gradient Langevin dynamics (SGLD), a popular SGMCMC algorithm that discretises the Langevin diffusion.…”
Section: Introduction (mentioning)
confidence: 99%
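
For reference, the SGLD update that this last quote refers to is an Euler-Maruyama discretisation of the Langevin diffusion with the exact gradient replaced by a stochastic estimate, such as either sketch above. A minimal R version, with assumed names rather than the sgmcmc interface:

sgld_step <- function(theta, h, grad_estimate) {
  # One SGLD iteration with stepsize h: a half-step down the estimated
  # gradient of the potential U(theta) = -log posterior(theta), plus
  # Gaussian noise scaled to match the Langevin diffusion.
  theta - (h / 2) * grad_estimate(theta) + sqrt(h) * rnorm(length(theta))
}

In the sgmcmc package itself, the plain and control-variate versions of this sampler are exposed as paired functions (the paper documents sgld() alongside its control-variate counterpart sgldcv()).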