2019
DOI: 10.48550/arxiv.1911.10688
Preprint

Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator

Cited by 25 publications (22 citation statements)
References 11 publications
“…First, we compare the generalization performance of the proposed method against baselines by training classifiers on CIFAR-100 (Krizhevsky et al, 2009), Tiny-ImageNet (Chrabaszcz et al, 2017), ImageNet (Deng et al, 2009), and the Google commands speech dataset (Warden, 2017). Next, we test the localization performance of classifiers following the evaluation protocol of Qin and Kim (2019). We also measure calibration error (Guo et al, 2017) of classifiers to verify Co-Mixup successfully alleviates the over-confidence issue by Zhang et al (2018).…”
Section: Methods | mentioning
confidence: 99%
“…Setting the bounding box The output of g is map M in range 0 to 1, obtained by the sigmoid activation function. In order to derive a bounding box from this map, we follow the method of [6,7,29]. First, a threshold τ is calculated as…”
Section: Methods II (Siamese Network) | mentioning
confidence: 99%
“…Many algorithms were proposed for the task of WSOL. The Class Activation Map (CAM) explainability method [51] and its variants [29] identify the salient pixels that lead to the classification. A multi-task loss function proposed by [24] takes shape into consideration.…”
Section: Related Work | mentioning
confidence: 99%
“…where Y are labels of target examples and Z = g(X) are features of them extracted by the pre-trained feature extractor g. Based on the theory in [20], Proposition 1 shows that TransRate provides an upper bound to the log-likelihood of the model h* ∘ g. Detailed proofs can be found in Appendix C.…”
Section: Computation-efficient Transferability Estimation | mentioning
confidence: 99%