Stability of Bayesian inference in exponential families

Boratyńska, Agata

doi:10.1016/s0167-7152(97)00060-6

Cited by 18 publications

(8 citation statements)

References 4 publications


“…Further, reinterpreting F * as the log normalizer of an exponential family distribution, we get the Dirichlet distribution, which is precisely the conjugate prior [55] of multinomial distributions used in prior-posterior Bayesian updating estimation procedures. We summarize the chain of duality as follows:…”

Section: B Revisiting the Centroid Of Symmetrized Kullback-Leibler D

mentioning

confidence: 99%

Sided and Symmetrized Bregman Centroids

Nielsen

Nock

2009

IEEE Trans. Inform. Theory


Abstract: We generalize the notions of centroids (and barycenters) to the broad class of information-theoretic distortion measures called Bregman divergences. Bregman divergences form a rich and versatile family of distances that unifies quadratic Euclidean distances with various well-known statistical entropic measures. Since, besides the squared Euclidean distance, Bregman divergences are asymmetric, we consider the left-sided and right-sided centroids and the symmetrized centroids as minimizers of average Bregman distortions. We prove that all three centroids are unique and give closed-form solutions for the sided centroids that are generalized means. Furthermore, we design a provably fast and efficient arbitrarily close approximation algorithm for the symmetrized centroid based on its exact geometric characterization. The geometric approximation algorithm requires only walking on a geodesic linking the two left/right sided centroids. We report on our implementation for computing entropic centers of image histogram clusters and entropic centers of multivariate normal distributions that are useful operations for processing multimedia information and retrieval. These experiments illustrate that our generic methods compare favorably with former limited ad-hoc methods.

Index Terms: Centroid, Kullback-Leibler divergence, Bregman divergence, Bregman power divergence, Burbea-Rao divergence, Csiszár divergence, Legendre duality, Information geometry.
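The closed-form sided centroids mentioned in the abstract can be illustrated with a minimal sketch. It assumes the convention that the right-sided centroid minimizes the average of D_F(p_i || c) and the left-sided centroid minimizes the average of D_F(c || p_i), and it picks the generator F(x) = sum(x log x - x), whose Bregman divergence is the extended Kullback-Leibler divergence on positive vectors. The function names and the sample points are hypothetical and are not taken from the paper.

```python
import math

def kl_ext(p, q):
    """Extended KL divergence between positive vectors.

    This is the Bregman divergence generated by F(x) = sum(x*log(x) - x).
    """
    return sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))

def right_centroid(points):
    """Minimizer of the average D_F(p_i || c): the arithmetic mean."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def left_centroid(points):
    """Minimizer of the average D_F(c || p_i).

    In general this is (grad F)^{-1} applied to the mean of grad F(p_i).
    Here grad F = log, so the result is the coordinate-wise geometric mean.
    """
    n = len(points)
    return [math.exp(sum(math.log(x) for x in col) / n) for col in zip(*points)]

def avg_right(points, c):
    """Average distortion with the candidate centroid in the second argument."""
    return sum(kl_ext(p, c) for p in points) / len(points)

def avg_left(points, c):
    """Average distortion with the candidate centroid in the first argument."""
    return sum(kl_ext(c, p) for p in points) / len(points)

# Hypothetical sample: three positive 2-bin histograms.
points = [[0.2, 0.8], [0.6, 0.4], [0.5, 0.5]]
r = right_centroid(points)
l = left_centroid(points)
```

Each centroid should score no worse than the other on its own objective, which gives a quick sanity check of the two closed forms. The symmetrized centroid has no such closed form in general and is not covered by this sketch.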


“…Boratynska [25] defines a set of priors by specifying bounds for the product $n_0 y_0$. However, since $n_0$ is kept constant, this is equivalent to define bounds for $y_0$, as in the work of Quaeghebeur and De Cooman.…”

Section: Comparison With Other Models For Ignorance

mentioning

confidence: 99%

“…Thus, being $\delta > 0$, if (25) holds and since $w^*$ is free to vary in $\mathbb{R}$, we can always find a value of $w^*$ such that …”

mentioning

confidence: 99%

A model of prior ignorance for inferences in the one-parameter exponential family

Benavoli

Zaffalon

2012

Journal of Statistical Planning and Inference


This paper proposes a model of prior ignorance about a scalar variable based on a set of distributions M. In particular, a set of minimal properties that a set M of distributions should satisfy to be a model of prior ignorance without producing vacuous inferences is defined. In the case the likelihood model corresponds to a one-parameter exponential family of distributions, it is shown that the above minimal properties are equivalent to a special choice of the domains for the parameters of the conjugate exponential prior. This makes it possible to define the largest (that is, the least-committal) set of conjugate priors M that satisfies the above properties. The obtained set M is a model of prior ignorance with respect to the functions (queries) that are commonly used for statistical inferences; it is easy to elicit and, because of conjugacy, tractable; it encompasses frequentist and the so-called objective Bayesian inferences with improper priors. An application of the model to a problem of inference with count data is presented.
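As a rough illustration of working with a set of conjugate priors indexed by bounds on y_0 with n_0 held fixed, the sketch below uses Poisson count data with a Gamma(shape = n_0 y_0, rate = n_0) prior. Under that assumption the posterior mean of the rate is (n_0 y_0 + sum of counts) / (n_0 + n), which is increasing in y_0, so its range over the prior set is attained at the endpoints of the y_0 interval. The function names, the interval, and the counts are hypothetical; this is not the specific model or domain choice proposed in the paper.

```python
def posterior_mean(n0, y0, data):
    """Posterior mean of a Poisson rate under a Gamma(shape=n0*y0, rate=n0) prior.

    n0 acts as a prior sample size and y0 as a prior mean.
    """
    return (n0 * y0 + sum(data)) / (n0 + len(data))

def posterior_mean_bounds(n0, y0_lo, y0_hi, data):
    """Lower and upper posterior means over all priors with y0 in [y0_lo, y0_hi].

    The posterior mean is monotone increasing in y0 for fixed n0,
    so the extremes are reached at the two endpoints.
    """
    return posterior_mean(n0, y0_lo, data), posterior_mean(n0, y0_hi, data)

# Hypothetical count data and prior set.
counts = [3, 5, 4, 6]
lo, hi = posterior_mean_bounds(n0=2.0, y0_lo=1.0, y0_hi=10.0, data=counts)
width = hi - lo  # equals n0 * (y0_hi - y0_lo) / (n0 + n), so it shrinks as n grows
```

The width of the interval depends on the data only through the sample size n, which is one way to see how the influence of the prior set fades as observations accumulate.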


“…Depending on the need to exploit one or the other of these distinguished properties, the Bregman distances or Csiszár divergences are preferred, and both of them are widely applied in important areas of information theory, statistics and computer science, for example in (Ai) information retrieval (see, e.g., Do and Vetterli (2002), Hertz et al. (2004)), (Aii) optimal decision (for general decision see, e.g., Boratynska (1997), Freund et al (1997), Bartlett et al (2006), Vajda and Zvárová (2007), for speech processing see, e.g., Carlson and Clements (1991), Veldhuis and Klabers (2002), for image processing see, e.g., Xu and Osher (2007), Marquina and Osher (2008), Scherzer et al (2008)), and (Aiii) machine learning (see, e.g., Laferty (1999), Banerjee et al (2005), Amari (2007), Teboulle (2007), Nock and Nielsen (2009)).…”

Section: Introduction

mentioning

confidence: 99%

On Bregman Distances and Divergences of Probability Measures

Stummer

Vajda

2012

IEEE Trans. Inform. Theory


The paper introduces scaled Bregman distances of probability distributions which admit non-uniform contributions of observed events. They are introduced in a general form covering not only the distances of discrete and continuous stochastic observations, but also the distances of random processes and signals. It is shown that the scaled Bregman distances extend not only the classical ones studied in the previous literature, but also the information divergence and the related wider class of convex divergences of probability measures. An information processing theorem is established too, but only in the sense of invariance w.r.t. statistically sufficient transformations and not in the sense of universal monotonicity. Pathological situations where coding can increase the classical Bregman distance are illustrated by a concrete example. In addition to the classical areas of application of the Bregman distances and convex divergences such as recognition, classification, learning and evaluation of proximity of various features and signals, the paper mentions a new application in 3D-exploratory data analysis. Explicit expressions for the scaled Bregman distances are obtained in general exponential families, with concrete applications in the binomial, Poisson and Rayleigh families, and in the families of exponential processes such as the Poisson and diffusion processes including the classical examples of the Wiener process and geometric Brownian motion.

