Variational Learning and Bits-Back Coding: An Information-Theoretic View to Bayesian Learning

Honkela, Antti; Valpola, Harri

doi:10.1109/tnn.2004.828762

Cited by 55 publications

(46 citation statements)

References 36 publications

“…For comparison we use the RBC via type-2 Student mixture model developed by Archambeau et al [5], in which the full Bayesian treatment allows for model selection, via the log evidence bound. As somewhat expected on the basis of known relationships between Bayesian and information-theoretic model selection [23], we see the two methods behave similarly indeed, and in most cases they both pick the true number of clusters. In the error-free case we can observe that the inaccuracy incurred by the speedup is negligible, while as the error level increases the advantage of our model's ability of taking into account these errors becomes apparent despite the speedup.…”

Section: E Assessment Of the Proposed Procedures For Determining The (supporting)

confidence: 75%

“…The optimal number can be automatically determined either by a Bayesian approach, such as in [5], [8], [43], [46] or based on information theory, such as minimum message length (MML) [50], minimum description length [39], Bayesian information criterion [40], Akaike information criterion [1], or by cross validation [19]. Among these methods, the Bayesian approach is currently most popular, and there are well known connections between the Bayesian approach and information theoretic ones [23].…”

Section: G Determining the Number Of Mixture Components (mentioning)

confidence: 99%

A Fast Algorithm for Robust Mixtures in the Presence of Measurement Errors

Sun

Kabán

2010

IEEE Trans. Neural Netw.

Abstract: In experimental and observational sciences, detecting atypical, peculiar data from large sets of measurements has the potential of highlighting candidates of interesting new types of objects that deserve more detailed domain-specific followup study. However, measurement data is nearly never free of measurement errors. These errors can generate false outliers that are not truly interesting. Although many approaches exist for finding outliers, they have no means to tell to what extent the peculiarity is not simply due to measurement errors. To address this issue, we have developed a model-based approach to infer genuine outliers from multivariate data sets when measurement error information is available. This is based on a probabilistic mixture of hierarchical density models, in which parameter estimation is made feasible by a tree-structured variational expectation-maximization algorithm. Here, we further develop an algorithmic enhancement to address the scalability of this approach, in order to make it applicable to large data sets, via a K-dimensional-tree based partitioning of the variational posterior assignments. This creates a non-trivial tradeoff between a more detailed noise model to enhance the detection accuracy, and the coarsened posterior representation to obtain computational speedup. Hence, we conduct extensive experimental validation to study the accuracy/speed tradeoffs achievable in a variety of data conditions. We find that, at low-to-moderate error levels, a speedup factor that is at least linear in the number of data points can be achieved without significantly sacrificing the detection accuracy. The benefits of including measurement error information into the modeling is evident in all situations, and the gain roughly recovers the loss incurred by the speedup procedure in large error conditions. We analyze and discuss in detail the characteristics of our algorithm based on results obtained on appropriately designed synthetic data experiments, and we also demonstrate its working in a real application example.

Index Terms: K-dimensional (KD)-tree, measurement errors, outlier detection, robust mixture modeling, variational expectation-maximization (EM) algorithm.

“…This provides an alternative justification for the variational method. Additionally, the alternative interpretation can provide more intuitive explanations on why some models provide higher marginal likelihoods than others [22]. For the remainder of this paper, the optimization criterion will be the cost function (6) that is to be minimized.…”

Section: B Variational Bayesian Learning (mentioning)

confidence: 99%

Compact Modeling of Data Using Independent Variable Group Analysis

Alhoniemi

Honkela

Lagus

et al. 2007

IEEE Trans. Neural Netw.

Self Cite

In this paper, we introduce a modeling approach called independent variable group analysis (IVGA) which can be used for finding an efficient structural representation for a given data set. The basic idea is to determine such a grouping for the variables of the data set that mutually dependent variables are grouped together whereas mutually independent or weakly dependent variables end up in separate groups. Computation of an IVGA model requires a combinatorial algorithm for grouping of the variables and a modeling algorithm for the groups. In order to be able to compare different groupings, a cost function which reflects the quality of a grouping is also required. Such a cost function can be derived, for example, using the variational Bayesian approach, which is employed in our study. This approach is also shown to be approximately equivalent to minimizing the mutual information between the groups. The modeling task is computationally demanding. We describe an efficient heuristic grouping algorithm for the variables and derive a computationally light nonlinear mixture model for modeling of the dependencies within the groups. Finally, we carry out a set of experiments which indicate that IVGA may turn out to be beneficial in many different applications.

Index Terms: Compact modeling, independent variable group analysis (IVGA), mutual information, variable grouping, variational Bayesian learning.

“…6.1, and a measure of the amount of independent innovation in the hidden nodes, the latter of which can be influenced by introducing the evidence nodes. More detailed discussion is presented in [45]. In addition to restricting the innovations, the incoming weights A of the hidden nodes are initialised to random values by evidence nodes with variance σ² = 10⁻² and life time of 40 iterations, when new nodes are added.…”

Section: Addition Of Hidden Nodes (mentioning)

confidence: 99%

Blind separation of nonlinear mixtures by variational Bayesian learning

Honkela

Valpola

Ilin

et al. 2007

Digital Signal Processing

Blind separation of sources from nonlinear mixtures is a challenging and often ill-posed problem. We present three methods for solving this problem: an improved nonlinear factor analysis (NFA) method using multilayer perceptron (MLP) network to model the nonlinearity, a hierarchical NFA (HNFA) method suitable for larger problems and a post-nonlinear NFA (PNFA) method for more restricted post-nonlinear mixtures. The methods are based on variational Bayesian learning, which provides the needed regularisation and allows for easy handling of missing data. While the basic methods are incapable of recovering the correct rotation of the source space, they can discover the underlying nonlinear manifold and allow reconstruction of the original sources using standard linear independent component analysis (ICA) techniques.
