Noise contrastive estimation: Asymptotic properties, formal comparison with MC-MLE

Riou-Durand, Lionel; Chopin, Nicolás

doi:10.1214/18-ejs1485

Cited by 5 publications

(3 citation statements)

References 16 publications


“…where we have used that for ν → ∞, m = νn becomes arbitrarily large too, so that the average over the y_i becomes an expectation with respect to q(y). A more general result was established by Riou-Durand and Chopin (2018). Moreover, they further considered the case of finite ν, but large sample sizes n, and showed that the variance of the noise-contrastive estimator is always smaller than the variance of the Monte Carlo MLE estimator (Geyer, 1994) where the partition function is approximated with a sample average, assuming in both cases that the auxiliary/reference distributions were fixed.…”

Section: Estimating Energy-based Models (mentioning)

confidence: 90%

Statistical applications of contrastive learning

Gutmann, Kleinegesse, Rhodes

2022

Preprint


The likelihood function plays a crucial role in statistical inference and experimental design. However, it is computationally intractable for several important classes of statistical models, including energy-based models and simulator-based models. Contrastive learning is an intuitive and computationally feasible alternative to likelihood-based learning. We here first provide an introduction to contrastive learning and then show how we can use it to derive methods for diverse statistical problems, namely parameter estimation for energy-based models, Bayesian inference for simulator-based models, as well as experimental design.


“…, and taking the limit of ν → ∞ in (38), thus, gives where we have used that for ν → ∞, m = νn becomes arbitrarily large too, so that the average over the y_i becomes an expectation with respect to q(y). A more general result was established by Riou-Durand and Chopin (2018). Moreover, they further considered the case of finite ν, but large sample sizes n, and showed that the variance of the noise-contrastive estimator is always smaller than the variance of the Monte Carlo MLE estimator (Geyer 1994) where the partition function is approximated with a sample average, assuming in both cases that the auxiliary/reference distributions were fixed.…”

Section: Estimating Energy-based Models (mentioning)

confidence: 91%

Statistical applications of contrastive learning

2022


“…With respect to a fixed Q, it remains an open question about what formally are the nature of the challenges posed by a poorly chosen Q, which could be statistical and/or algorithmic. Various previous works have analyzed the asymptotic behavior of NCE and its variants (Gutmann & Hyvärinen, 2012; Riou-Durand et al., 2018; Uehara et al., 2020), but these do not provide guidance on the finite sample behavior of NCE or its common variants. The improvements to NCE in prior works are all borne out by the empirical observations of NCE practitioners, rather than motivated by theory, which is precisely the aim of this work.…”

Section: Related Work (mentioning)

confidence: 99%

Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation

Liu, Rosenfeld, Ravikumar, et al.

2021

Preprint


Noise-contrastive estimation (NCE) is a statistically consistent method for learning unnormalized probabilistic models. It has been empirically observed that the choice of the noise distribution is crucial for NCE's performance. However, such observations have never been made formal or quantitative. In fact, it is not even clear whether the difficulties arising from a poorly chosen noise distribution are statistical or algorithmic in nature. In this work, we formally pinpoint reasons for NCE's poor performance when an inappropriate noise distribution is used. Namely, we prove these challenges arise due to an ill-behaved (more precisely, flat) loss landscape. To address this, we introduce a variant of NCE called eNCE which uses an exponential loss and for which normalized gradient descent addresses the landscape issues provably when the target and noise distributions are in a given exponential family.
