We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators for Restricted Boltzmann Machines (RBMs), which we call Markov Chain Las Vegas (MCLV). MCLV provides statistical guarantees in exchange for a random running time. It uses a stopping set built from the training data and caps the Markov chain at a maximum of K steps (referred to as MCLV-K). We present an MCLV-K gradient estimator (LVS-K) for RBMs and explore the correspondences and differences between LVS-K and Contrastive Divergence (CD-K). LVS-K significantly outperforms CD-K when training RBMs on the MNIST dataset, indicating that MCLV is a promising direction for learning generative models.
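The core Las Vegas idea above can be illustrated with a toy sketch (this is an illustrative simplification, not the paper's RBM gradient estimator): run a Markov chain starting from a state in a stopping set until the chain returns to that set (a completed "tour") or a cap of K steps is reached. The lazy random walk and the stopping set used here are hypothetical stand-ins.

```python
import random

def lazy_walk_step(state, n_states, rng):
    """One step of a lazy random walk on a cycle of n_states states."""
    move = rng.choice([-1, 0, 1])
    return (state + move) % n_states

def mclv_tour(start, stopping_set, K, n_states, rng):
    """Run one tour from `start`; return (steps_taken, completed).

    completed=True means the chain returned to the stopping set within
    K steps, yielding a full tour; completed=False means the tour was
    truncated at the cap K (the "Las Vegas" random-running-time trade-off).
    """
    state = start
    for step in range(1, K + 1):
        state = lazy_walk_step(state, n_states, rng)
        if state in stopping_set:
            return step, True
    return K, False

rng = random.Random(0)
stopping_set = {0, 1}  # hypothetical stopping set, e.g. built from training data
steps, done = mclv_tour(0, stopping_set, K=50, n_states=10, rng=rng)
```

Statistics accumulated over completed tours can be combined across many tours, while the cap K bounds the worst-case cost of any single tour.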
This work considers the general task of estimating the sum of a bounded function over the edges of a graph that is unknown a priori, where graph vertices and edges are built on-the-fly by an algorithm and the resulting graph is too large to be kept in memory or disk. Prior work proposes Markov Chain Monte Carlo (MCMC) methods that simultaneously sample and generate the graph, eliminating the need for storage. Unfortunately, these existing methods are not scalable to massive real-world graphs. In this paper, we introduce Ripple, an MCMC-based estimator which achieves unprecedented scalability in this task by stratifying the MCMC Markov chain state space with a new technique that we denote ordered sequential stratified Markov regenerations. We show that the Ripple estimator is consistent, highly parallelizable, and scales well. In particular, applying Ripple to the task of estimating connected induced subgraph counts on large graphs, we empirically demonstrate that Ripple is accurate and is able to estimate counts of up to 12-node subgraphs, a task at a scale that has been considered unreachable, not only by prior MCMC-based methods, but also by other sampling approaches. For instance, in this target application, we present results where the Markov chain state space is as large as 10^43, for which Ripple computes estimates in less than 4 hours on average.
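As a minimal sketch of the underlying estimation task (not Ripple itself, and without its stratification or regeneration machinery): a simple random walk on an undirected graph visits each directed edge uniformly at stationarity, so the running mean of a bounded edge function over traversed edges converges to that function's average over the edges. The tiny example graph and edge function below are hypothetical.

```python
import random

# Tiny hypothetical graph as an adjacency list (undirected, connected,
# non-bipartite, so the walk is irreducible and aperiodic).
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}

def f(u, v):
    """Any bounded function on edges; |u - v| is an arbitrary example."""
    return abs(u - v)

def walk_estimate(graph, steps, rng):
    """Estimate the average of f over directed edges via a random walk.

    At stationarity, each traversed directed edge (u, v) has probability
    1/(2m): pi(u) * 1/deg(u) = (deg(u)/2m) * (1/deg(u)), so the running
    mean of f over traversed edges is a consistent estimator.
    """
    state = rng.choice(list(graph))
    total = 0.0
    for _ in range(steps):
        nxt = rng.choice(graph[state])
        total += f(state, nxt)
        state = nxt
    return total / steps

rng = random.Random(1)
est = walk_estimate(graph, steps=20000, rng=rng)
```

Multiplying such an edge-average estimate by the (known or separately estimated) edge count recovers the edge sum; the difficulty Ripple addresses is doing this at scale on graphs that are only materialized on-the-fly.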