SUMMARY

Two different types of representations, such as an image and its manually assigned labels, generally have complex and strong relationships to each other. In this paper, we represent such deep relationships between two different types of visible variables using an energy-based probabilistic model, called a deep relational model (DRM), to improve prediction accuracy. A DRM stacks several layers from one visible layer to another, sandwiching several hidden layers between them. As with restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs), all connections (weights) between two adjacent layers are undirected. During maximum likelihood (ML) -based training, the network attempts to capture the latent complex relationships between the two visible variables with its deep architecture. Unlike deep neural networks (DNNs), 1) the DRM is a fully generative model and hence allows us to generate one visible variable given the other, and 2) its parameters can be optimized in a probabilistic manner. The DRM can also be fine-tuned as a DNN, in the same manner as deep belief net (DBN) or DBM pre-training. This paper presents experiments conducted to evaluate the performance of a DRM in image recognition and generation tasks using the MNIST data set. In the image recognition experiments, we observed that the DRM outperformed DNNs even without fine-tuning. In the image generation experiments, the images generated from the DRM were much more realistic than those from the other generative models.

[11], and a sum-product network (SPN) [12]. These models were mainly introduced to capture high-order abstractions for good representation of the observations, rather than for discriminative goals.
Once high-level abstractions are obtained, we can, for instance, remove noise from the observations or restore missing parts of them.

Most of the existing deep-learning approaches focus on extracting high-order abstractions from a single variable. In this paper, we try to capture such high-order relationships between two different types of variables based on deep learning. For that, we introduce a probabilistic model called a deep relational model (DRM) [13]. A DRM is similar to an RBM and a DBM, each of which is a probabilistic model based on an energy function. The model sandwiches several hidden layers between two visible layers and defines a joint probability for the two visible variables. Every two adjacent layers are connected with undirected weights, which are estimated so as to maximize the likelihood of the two visible variables. Interestingly, since the DRM is a fully generative model, it can be applied not only to recognition tasks but also to generating samples of one variable from the other. For example, given two variables, a hand-written digit image and a one-hot label vector, we can estimate the label by inferring mean-field posteriors given an image (classification task). On the other hand, by inferring posteriors g...
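To make the classification use of a DRM concrete, the following is a minimal sketch of mean-field label inference in a layer chain v - h1 - h2 - y, where v is an image and y a one-hot label. All dimensions, weight matrices, and the random initialization here are hypothetical illustrations (biases are omitted, and the weights would in practice come from ML-based training, not random draws); the sketch only shows the fixed-point updates in which each layer's posterior is refreshed from its two undirected neighbours until convergence.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical layer sizes: MNIST-like image, two hidden layers, 10 labels.
D, H1, H2, K = 784, 200, 200, 10

# Undirected weights between adjacent layers (stand-ins for trained values).
W1 = rng.normal(scale=0.01, size=(D, H1))   # v  - h1
W2 = rng.normal(scale=0.01, size=(H1, H2))  # h1 - h2
W3 = rng.normal(scale=0.01, size=(H2, K))   # h2 - y

def classify(v, n_iters=20):
    """Mean-field posterior q(y) over labels, given a clamped image v."""
    q1 = np.full(H1, 0.5)        # q(h1 = 1)
    q2 = np.full(H2, 0.5)        # q(h2 = 1)
    qy = np.full(K, 1.0 / K)     # q(y), categorical over the one-hot label
    for _ in range(n_iters):
        # Each hidden posterior combines messages from both adjacent layers.
        q1 = sigmoid(v @ W1 + W2 @ q2)
        q2 = sigmoid(q1 @ W2 + W3 @ qy)
        # The label layer is one-hot, so its posterior is a softmax.
        qy = softmax(q2 @ W3)
    return qy

v = rng.random(D)                # stand-in for a normalised input image
posterior = classify(v)
print(posterior.argmax())
```

Generation in the opposite direction would follow the same pattern with the roles swapped: clamp the one-hot label y, run the same alternating updates, and read off the mean-field posterior over the image layer v.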