2019
DOI: 10.48550/arxiv.1912.03263
Preprint

Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

Abstract: We propose to reinterpret a standard discriminative classifier of p(y|x) as an energy based model for the joint distribution p(x, y). In this setting, the standard class probabilities can be easily computed as well as unnormalized values of p(x) and p(x|y). Within this framework, standard discriminative architectures may be used and the model can also be trained on unlabeled data. We demonstrate that energy based training of the joint distribution improves calibration, robustness, and out-of-distribution detec…
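
The reinterpretation described in the abstract can be illustrated with a short sketch. It assumes the usual reading of this idea: the classifier's logit for class y is treated as the negative energy of the pair (x, y), so the unnormalized log p(x, y) is the logit itself and the unnormalized log p(x) is the log-sum-exp of the logits. The function names and the example logits are hypothetical and chosen only for illustration; the abstract states only that these quantities can be computed, not how the authors implement them.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) using the max-shift trick.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def class_probabilities(logits):
    # Standard softmax: p(y|x) = exp(f(x)[y]) / sum_y' exp(f(x)[y']).
    lse = logsumexp(logits)
    return [math.exp(v - lse) for v in logits]

def joint_energy(logits, y):
    # Assumed definition: E(x, y) = -f(x)[y], so log p(x, y) = f(x)[y] - log Z.
    return -logits[y]

def marginal_energy(logits):
    # Assumed definition: E(x) = -logsumexp_y f(x)[y], so log p(x) = -E(x) - log Z.
    return -logsumexp(logits)

# Hypothetical logits for one input x over three classes.
logits = [2.0, 0.5, -1.0]

probs = class_probabilities(logits)
log_p_xy_unnorm = [-joint_energy(logits, y) for y in range(len(logits))]
log_p_x_unnorm = -marginal_energy(logits)

# p(y|x) = p(x, y) / p(x): the unknown normalizer Z cancels in the ratio.
recovered = [math.exp(a - log_p_x_unnorm) for a in log_p_xy_unnorm]
```

Because the normalizing constant cancels in the ratio, `recovered` matches the ordinary softmax output `probs`, which is why the class probabilities stay easy to compute while p(x) and p(x|y) are available only up to normalization.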

Cited by 71 publications (150 citation statements)
References 22 publications
“…Energy Based Models. Our work is related to existing work on energy-based models [5,7,9,11,13,23,34,43,46]. Most similar to our work is that of [8], which proposes a framework of utilizing EBMs to compose several object descriptions together.…”
Section: Generated Image
mentioning
confidence: 90%
“…Most recent work in energy-based models (EBM) [22] focused on the application of image generative modeling. Neural networks were trained to assign low energy to real samples [12,15,26,31,38] so that realistic-looking samples can be sampled from the low-energy regions of the EBM's energy landscape. Instead of synthesizing new samples, our goal here is to predict RNA splicing outcomes.…”
Section: Energy-based Models
mentioning
confidence: 99%
“…(Using an uniform discrete distribution as $q$ where all $k$ possible states $(x_i)$ have the same probability $q(x_i) = (1/k)$, we get Eq. (11) and (15), this gives $p_\theta(x_i) = \frac{h(x_i)}{\sum_j h(x_j)} = \frac{\exp(-E_\theta(x_i))}{\sum_j \exp(-E_\theta(x_j))} = \mathrm{Softmax}_i(E) \quad (16)$ □…”
mentioning
confidence: 99%
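
The identity quoted in the statement above can be checked numerically. The sketch below assumes that the softmax is taken over the negated energies, which is how the middle expression reads; whether the cited paper's Softmax notation builds that negation into its definition is not visible from the excerpt. The function names and energy values are hypothetical.

```python
import math

def boltzmann(energies):
    # p(x_i) = exp(-E(x_i)) / sum_j exp(-E(x_j)), shifted by the minimum
    # energy for numerical stability (the shift cancels in the ratio).
    shift = min(energies)
    weights = [math.exp(-(e - shift)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

def softmax(scores):
    # Standard softmax over arbitrary real-valued scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [v / total for v in exps]

# Hypothetical energies for k = 3 discrete states.
energies = [0.0, 1.0, 3.0]

p_boltzmann = boltzmann(energies)
p_softmax = softmax([-e for e in energies])
```

Under these assumptions the two lists agree, and the lowest-energy state receives the highest probability.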
“…Our work draws on recent work in energy based models (EBMs) [12,14,19,21,34,43,47,53]. Our underlying energy optimization procedure to generate samples is reminiscent of Langevin sampling, which is used to sample from EBMs [12,43,53].…”
Section: Related Work
mentioning
confidence: 99%