Abstract - We consider the problem of reconstructing a signal from multi-layered (possibly) non-linear measurements. Using non-rigorous but standard methods from statistical physics, we present the Multi-Layer Approximate Message Passing (ML-AMP) algorithm for computing marginal probabilities of the corresponding estimation problem and derive the associated state evolution equations to analyze its performance. We also give the expression of the asymptotic free energy and the minimal information-theoretically achievable reconstruction error. Finally, we present some applications of this measurement model for compressed sensing and perceptron learning with structured matrices/patterns, and for a simple model of estimation of latent variables in an auto-encoder.

In many natural and engineered systems, the interactions between sets of variables in different subsystems involve multiple layers of interdependencies. This is for instance the case in the neural networks developed in deep learning [1], the hierarchical models used in statistical inference [2], and the multiplex networks considered in complex systems [3]. It is therefore fundamental to generalize our theoretical and algorithmic tools to deal with these multi-layer setups. Our goal in this paper is to develop such a generalization of the cavity/replica approach that originated in statistical physics [4] and that has proven quite successful for studying generalized linear estimation with randomly chosen mixing matrices, leading in particular to the computation of the mutual information (or equivalently the free energy) and the minimum achievable mean-squared error for CDMA and compressed sensing [5]-[8]. This methodology is also closely related to the approximate message passing (AMP) algorithm, originally known in physics as the Thouless-Anderson-Palmer (TAP) equations [9]-[14].

In Section I we present a multi-layer generalized linear measurement (ML-GLM) model with random weights at each layer and consider the Bayesian inference of the signal measured by the ML-GLM. We derive AMP for the ML-GLM and, using non-rigorous but standard methods from statistical physics, analyze its behavior by deriving the associated state evolution. We also present the expression for the associated free energy (or mutual information) and the information-theoretically optimal mean squared error (MMSE). We compare the MMSE with the MSE achieved by AMP and describe the associated phase transitions.
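In the single-layer special case, ML-AMP reduces to the familiar AMP iteration for sparse linear estimation. The sketch below illustrates that special case with a soft-threshold denoiser and an i.i.d. Gaussian sensing matrix; the threshold schedule and function names are illustrative choices on our part, not the paper's exact algorithm.

```python
import numpy as np

def soft_threshold(r, theta):
    """Elementwise soft-threshold denoiser eta(r; theta)."""
    return np.sign(r) * np.maximum(np.abs(r) - theta, 0.0)

def amp_sparse(y, A, n_iter=50, alpha=1.5):
    """AMP for y = A x + noise with a sparsity-promoting denoiser.

    A minimal single-layer sketch; ML-AMP generalizes this iteration
    by passing messages through several such layers.
    """
    m, n = A.shape
    x_hat = np.zeros(n)
    z = y.copy()                               # corrected residual
    for _ in range(n_iter):
        tau = np.linalg.norm(z) / np.sqrt(m)   # effective noise level
        r = x_hat + A.T @ z                    # pseudo-data for the denoiser
        x_hat = soft_threshold(r, alpha * tau)
        # Onsager reaction term: average derivative of the denoiser,
        # which for soft thresholding is 1 on the surviving entries.
        eta_prime = np.count_nonzero(x_hat) / m
        z = y - A @ x_hat + z * eta_prime
    return x_hat
```

The Onsager term in the residual update is what distinguishes AMP from plain iterative thresholding and is what makes its dynamics trackable by the state evolution equations mentioned above.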
We examine a class of stochastic deep learning models with a tractable method to compute information-theoretic quantities. Our contributions are three-fold: (i) we show how entropies and mutual informations can be derived from heuristic statistical physics methods, under the assumption that the weight matrices are independent and orthogonally invariant; (ii) we extend particular cases in which this result is known to be rigorously exact by providing a proof for two-layer networks with Gaussian random weights, using the recently introduced adaptive interpolation method; (iii) we propose an experimental framework with generative models of synthetic datasets, on which we train deep neural networks with a weight constraint designed so that the assumption in (i) is verified during learning. We study the behavior of entropies and mutual informations throughout learning and conclude that, in the proposed setting, the relationship between compression and generalization remains elusive.
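To make the weight assumption in (i) concrete, the sketch below samples a two-layer stochastic network whose weight matrices are orthogonally invariant, i.e. W = U S V^T with Haar-distributed orthogonal U and V. The dimensions, activation, and noise levels are arbitrary illustrative choices, not the paper's experimental settings.

```python
import numpy as np
from scipy.stats import ortho_group

rng = np.random.default_rng(0)

def orth_invariant(n_out, n_in, singular_values, rng):
    """Sample W = U diag(s) V^T with Haar-distributed U and V,
    making W orthogonally invariant with a chosen spectrum."""
    U = ortho_group.rvs(n_out, random_state=rng)
    V = ortho_group.rvs(n_in, random_state=rng)
    S = np.zeros((n_out, n_in))
    k = min(n_out, n_in)
    S[:k, :k] = np.diag(singular_values[:k])
    return U @ S @ V.T

# Two stochastic layers: t_l = f(W_l t_{l-1}) + Gaussian noise.
n0, n1, n2 = 200, 150, 100
W1 = orth_invariant(n1, n0, np.ones(n1), rng)
W2 = orth_invariant(n2, n1, np.ones(n2), rng)

x = rng.standard_normal(n0)                      # Gaussian input
t1 = np.tanh(W1 @ x) + 0.1 * rng.standard_normal(n1)
t2 = np.tanh(W2 @ t1) + 0.1 * rng.standard_normal(n2)
```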
We consider the variational free energy approach for compressed sensing. We first show that the naïve mean field approach performs remarkably well when coupled with a noise learning procedure. We also notice that it leads to the same equations as those used for iterative thresholding. We then discuss the Bethe free energy and how it corresponds to the fixed points of the approximate message passing algorithm. In both cases, we numerically test the direct optimization of the free energies as a converging sparse-estimation algorithm.
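The connection to iterative thresholding can be made explicit: the naive mean-field fixed-point equations take the form of an iterative soft-thresholding update. A minimal sketch follows, in which the EM-style noise re-estimation is our stand-in for a noise learning procedure, not the paper's exact update.

```python
import numpy as np

def ista_noise_learning(y, A, lam=0.1, n_iter=200):
    """Iterative soft-thresholding with a crude EM-style noise update.

    Illustrates the kind of fixed-point equations a naive mean-field
    free energy yields; the noise-learning step is illustrative only.
    """
    m, n = A.shape
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(n)
    delta = 1.0                            # learned noise variance
    for _ in range(n_iter):
        r = x + (A.T @ (y - A @ x)) / L    # gradient step on the quadratic term
        theta = lam * delta / L            # threshold scales with noise level
        x = np.sign(r) * np.maximum(np.abs(r) - theta, 0.0)
        delta = np.sum((y - A @ x) ** 2) / m   # EM-style noise re-estimation
    return x, delta
```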
Restricted Boltzmann machines (RBMs) are energy-based neural networks which are commonly used as the building blocks of deep neural architectures. In this work, we derive a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely connected systems with weak interactions, which originates in spin-glass theory. While the TAP approach has been extensively studied for fully-visible binary spin systems, our construction generalizes to latent-variable models, as well as to arbitrarily distributed real-valued spin systems with bounded support. In our numerical experiments, we demonstrate the effective deterministic training of our proposed models and are able to show interesting features of unsupervised learning which could not be directly observed with sampling. Additionally, we demonstrate how to utilize our TAP-based framework to leverage trained RBMs as joint priors in denoising problems.
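For the binary {0,1} case, the second-order TAP self-consistency equations for an RBM can be iterated to a deterministic fixed point of the unit magnetizations. Below is a minimal sketch of that iteration; the damping schedule and initialization are illustrative stabilization choices, not the paper's prescribed ones.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tap_magnetizations(W, a, b, n_iter=100, damping=0.5):
    """Iterate second-order TAP equations for a binary {0,1} RBM.

    W: (n_visible, n_hidden) couplings; a, b: visible / hidden biases.
    Returns deterministic visible and hidden magnetizations.
    """
    mv = sigmoid(a)
    mh = sigmoid(b)
    W2 = W ** 2
    for _ in range(n_iter):
        # Mean-field term plus the TAP (Onsager) correction on each side.
        field_h = b + W.T @ mv - (W2.T @ (mv - mv**2)) * (mh - 0.5)
        mh = damping * mh + (1 - damping) * sigmoid(field_h)
        field_v = a + W @ mh - (W2 @ (mh - mh**2)) * (mv - 0.5)
        mv = damping * mv + (1 - damping) * sigmoid(field_v)
    return mv, mh
```

The resulting fixed-point magnetizations replace Monte Carlo samples in training and evaluation, which is what makes the framework deterministic.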