2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.01145

Generating Accurate Pseudo-Labels in Semi-Supervised Learning and Avoiding Overconfident Predictions via Hermite Polynomial Activations

Abstract: Rectified Linear Units (ReLUs) are among the most widely used activation functions in a broad variety of tasks in vision. Recent theoretical results suggest that, despite their excellent practical performance, in various cases a substitution with basis expansions (e.g., polynomials) can yield significant benefits from both the optimization and generalization perspectives. Unfortunately, the existing results remain limited to networks with a couple of layers, and the practical viability of these results is not yet…
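The substitution the abstract describes, replacing ReLU with a finite Hermite polynomial basis expansion, can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the truncation order, the use of normalized probabilists' Hermite polynomials, and the per-layer coefficient parameterization are assumptions made here for clarity.

```python
import numpy as np
from math import factorial, sqrt

def hermite_basis(x, num_terms=4):
    """Normalized probabilists' Hermite polynomials h_k(x) = He_k(x)/sqrt(k!),
    built via the recurrence He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x)."""
    polys = [np.ones_like(x), x]
    for k in range(1, num_terms - 1):
        polys.append(x * polys[-1] - k * polys[-2])
    return [p / sqrt(factorial(k)) for k, p in enumerate(polys[:num_terms])]

def hermite_activation(x, coeffs):
    """Activation as a weighted sum of Hermite basis functions; in a network,
    `coeffs` would be trainable parameters (one set per layer, say)."""
    return sum(c * b for c, b in zip(coeffs, hermite_basis(x, len(coeffs))))
```

With coefficients concentrated on the linear term the activation reduces to the identity; training the coefficients lets each layer interpolate between smooth polynomial shapes rather than the fixed ReLU kink.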

Cited by 24 publications (14 citation statements)
References 12 publications
“…This is a limitation of pseudo-label methods, and leads to considerable errors in the pseudo labels generated from a model with poor performance. Because these false predictions are treated as ground-truth labels for training, when the error rate in certain categories is too high, false pseudo-labels dominate during subsequent training iterations and contribute to a side effect in the model performance with respect to these categories, which provides overconfident predictions (Lokhande et al, 2020).…”
Section: Results and Analysis
confidence: 99%
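The failure mode quoted above, erroneous pseudo-labels being recycled as ground truth until they dominate training, is commonly mitigated by confidence thresholding. The sketch below is a generic illustration of that idea, not the cited paper's method; the 0.95 default threshold is an assumed value.

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.95):
    """Retain only unlabeled samples whose maximum predicted class
    probability exceeds `threshold`; low-confidence (likely erroneous)
    pseudo-labels are discarded instead of being fed back as ground truth.
    Returns (kept indices, pseudo-labels for those indices)."""
    confidence = probs.max(axis=1)
    keep = np.flatnonzero(confidence >= threshold)
    return keep, probs[keep].argmax(axis=1)
```

A fixed threshold is the simplest guard; schedules that raise the threshold over training rounds, or per-class thresholds, address the category-imbalance effect the quotation describes.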
“…The consistency loss is based on the idea that φ(I_u) ∼ φ(I_u), where I_u is an unlabeled image, and “ ” refers to performing horizontal mirroring. Lokhande et al [15] used self-training for deep image classification. In this case, the original activation functions of φ, a CNN for image classification, must be changed to Hermite polynomials.…”
Section: Related Work
confidence: 99%
“…In this case, the original activation functions of φ, a CNN for image classification, must be changed to Hermite polynomials. Note that these two examples of self-training involve modifications either in the architecture of φ [15] or in its training framework [12]. However, we aim at using a given φ together with its training framework as a black box, so performing SSL only at the data level.…”
Section: Related Work
confidence: 99%