2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.00695
Regularizing Neural Networks via Minimizing Hyperspherical Energy

Cited by 16 publications (21 citation statements)
References 26 publications
“…They proposed several orthogonal weight normalization methods to solve optimization over multiple dependent Stiefel manifolds. MHE-based methods (Liu et al. 2018; Lin et al. 2020) are inspired by the Thomson problem in physics; they define a hyperspherical energy that characterizes diversity on the unit hypersphere and show significant, consistent improvements in supervised learning tasks. Since orthogonal regularization is too limiting (Miyato et al. 2018), Brock, Donahue, and Simonyan (2019) explored several variants designed to relax the constraint.…”
Section: Related Work
confidence: 99%
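The hyperspherical energy mentioned in the excerpt above can be made concrete with a small sketch: project each weight vector onto the unit hypersphere and sum pairwise inverse-distance (Riesz s-)potentials, so that minimizing the energy as a regularizer pushes the vectors apart, as in the Thomson problem. This is a minimal illustration under that definition, not the paper's implementation; the function and variable names are hypothetical.

```python
import numpy as np

def hyperspherical_energy(W, s=1.0):
    """Riesz s-energy of the rows of W on the unit hypersphere.

    Illustrative sketch: W is an (n, d) weight matrix. Each row is
    normalized to a unit vector, and the energy sums 1 / ||w_i - w_j||^s
    over unique pairs i < j (s = 1 recovers the Thomson problem).
    """
    W_hat = W / np.linalg.norm(W, axis=1, keepdims=True)  # project rows to the sphere
    diff = W_hat[:, None, :] - W_hat[None, :, :]          # pairwise difference vectors
    dist = np.linalg.norm(diff, axis=-1)                  # pairwise Euclidean distances
    iu = np.triu_indices(W.shape[0], k=1)                 # unique pairs i < j
    return float(np.sum(dist[iu] ** (-s)))

# Well-spread (orthogonal) directions have lower energy than clustered ones,
# which is what adding this term to the training loss encourages.
spread = np.eye(3)
clustered = np.array([[1.00, 0.00, 0.0],
                      [0.99, 0.01, 0.0],
                      [0.98, 0.02, 0.0]])
assert hyperspherical_energy(spread) < hyperspherical_energy(clustered)
```

In a training loop, this scalar would be scaled by a small coefficient and added to the task loss so that gradient descent jointly fits the data and diversifies the neuron directions.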
“…Unrolling has been shown to be useful in [12, 41, 46, 47, 58] and shares a similar spirit with back-propagation through time in recurrent networks [85] and with meta-learning [20]. The greedy policy is the optimal solution to Eq.…”
Section: Learning a Parameterized Teaching Policy
confidence: 99%
“…Hyperspherical learning. Beyond face recognition, learning a representation on the hypersphere has also proven generally useful in a diverse set of applications, such as few-shot recognition [46], [47], [48], [49], [50], deep metric learning [26], [51], self-supervised learning [52], [53], [54], [55], generative models [56], [57], geometric learning [58], [59], [60], person re-identification [27], [61], [62], [63], speech processing [64], [65], [66], [67], and text processing [68]. It has been widely observed that constraining the embedding space to a hypersphere benefits generalizability.…”
Section: Related Work
confidence: 99%