SUMMARY

Two different types of representations, such as an image and its manually assigned labels, generally have complex and strong relationships to each other. In this paper, we represent such deep relationships between two different types of visible variables using an energy-based probabilistic model, called a deep relational model (DRM), to improve prediction accuracy. A DRM stacks several layers from one visible layer to another, sandwiching several hidden layers between them. As with restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs), all connections (weights) between two adjacent layers are undirected. During maximum likelihood (ML) -based training, the network attempts to capture the latent complex relationships between the two visible variables with its deep architecture. Unlike deep neural networks (DNNs), 1) the DRM is a fully generative model and hence allows us to generate one visible variable given the other, and 2) its parameters can be optimized in a probabilistic manner. The DRM can also be fine-tuned as a DNN, in the same manner as deep belief net (DBN) or DBM pre-training. This paper presents experiments conducted to evaluate the performance of a DRM in image recognition and generation tasks using the MNIST data set. In the image recognition experiments, we observed that the DRM outperformed DNNs even without fine-tuning. In the image generation experiments, the images generated from the DRM were much more realistic than those from the other generative models.

[11], and a sum-product network (SPN) [12]. These models were mainly introduced to capture high-order abstractions for good representation of the observations, rather than for discriminative goals.
Once high-level abstractions are obtained, we can, for instance, remove noise from the observations or restore missing parts of them.

Most of the existing deep-learning approaches focus on extracting high-order abstractions from a single variable. In this paper, we try to capture such high-order relationships between two different types of variables based on deep learning. For that, we introduce a probabilistic model called a deep relational model (DRM) [13]. A DRM is similar to an RBM and a DBM, each of which is a probabilistic model based on an energy function. The model sandwiches several hidden layers between two visible layers and defines a joint probability for the two visible variables. Every two adjacent layers are connected with undirected weights, which are estimated so as to maximize the likelihood of the two visible variables. Interestingly, since the DRM is a fully generative model, it can be applied not only to recognition tasks but also to generating samples of one variable from the other. For example, given two variables, a hand-written digit image and a one-hot label vector, we can estimate the label by inferring mean-field posteriors given an image (classification task). On the other hand, by inferring posteriors g...
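To make the classification use of a DRM concrete, the following is a minimal sketch of mean-field label inference in a layer chain v - h1 - h2 - y, where v is an image and y a one-hot label. All dimensions, weight matrices, and the random initialization here are hypothetical illustrations (biases are omitted, and the weights would in practice come from ML-based training, not random draws); the sketch only shows the fixed-point updates in which each layer's posterior is refreshed from its two undirected neighbours until convergence.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical layer sizes: MNIST-like image, two hidden layers, 10 labels.
D, H1, H2, K = 784, 200, 200, 10

# Undirected weights between adjacent layers (stand-ins for trained values).
W1 = rng.normal(scale=0.01, size=(D, H1))   # v  - h1
W2 = rng.normal(scale=0.01, size=(H1, H2))  # h1 - h2
W3 = rng.normal(scale=0.01, size=(H2, K))   # h2 - y

def classify(v, n_iters=20):
    """Mean-field posterior q(y) over labels, given a clamped image v."""
    q1 = np.full(H1, 0.5)        # q(h1 = 1)
    q2 = np.full(H2, 0.5)        # q(h2 = 1)
    qy = np.full(K, 1.0 / K)     # q(y), categorical over the one-hot label
    for _ in range(n_iters):
        # Each hidden posterior combines messages from both adjacent layers.
        q1 = sigmoid(v @ W1 + W2 @ q2)
        q2 = sigmoid(q1 @ W2 + W3 @ qy)
        # The label layer is one-hot, so its posterior is a softmax.
        qy = softmax(q2 @ W3)
    return qy

v = rng.random(D)                # stand-in for a normalised input image
posterior = classify(v)
print(posterior.argmax())
```

Generation in the opposite direction would follow the same pattern with the roles swapped: clamp the one-hot label y, run the same alternating updates, and read off the mean-field posterior over the image layer v.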