2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.01014

P2SGrad: Refined Gradients for Optimizing Deep Face Models

Abstract: Cosine-based softmax losses [20,29,27,3] significantly improve the performance of deep face recognition networks. However, these losses always include sensitive hyper-parameters which can make the training process unstable, and it is very tricky to set suitable hyper-parameters for a specific dataset. This paper addresses this challenge by directly designing the gradients for adaptively training deep neural networks. We first investigate and unify previous cosine softmax losses by analyzing their gradients. This u…
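
The hyper-parameter sensitivity the abstract refers to comes from the scale and margin terms used by cosine-based softmax losses. Below is a minimal sketch, assuming a CosFace-style additive cosine margin as one representative of the cited loss family; the class name and the parameters `s` and `m` are illustrative assumptions, not code from the paper.

```python
# Minimal sketch (not the paper's code) of a cosine-based softmax loss of the
# kind the abstract refers to: logits are cosine similarities, rescaled by a
# scale s, with an additive margin m subtracted on the target class.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineMarginLoss(nn.Module):
    def __init__(self, feat_dim, num_classes, s=30.0, m=0.35):
        super().__init__()
        # class-weight vectors, L2-normalized at forward time
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s = s  # scale hyper-parameter (one of the sensitive settings)
        self.m = m  # additive cosine margin (another sensitive setting)

    def forward(self, features, labels):
        # cos θ_{i,j} between normalized feature i and normalized class weight j
        cos_theta = F.linear(F.normalize(features), F.normalize(self.weight))
        # subtract the margin only on the target-class cosine, then scale
        one_hot = F.one_hot(labels, cos_theta.size(1)).float()
        logits = self.s * (cos_theta - self.m * one_hot)
        return F.cross_entropy(logits, labels)
```

Choosing s and m poorly can stall or destabilize training, which is the motivation the abstract gives for designing the gradients directly instead.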

Cited by 35 publications (23 citation statements) | References 32 publications

“…This loss is a mean-square error (MSE) between a network's output and a scalar target value, but the network's output here is a cosine distance cos θ_{j,k} rather than an unconstrained value. The gradient ∂L^{(p2s)}/∂o_j can be shown to be identical to the probability-to-similarity gradient (P2SGrad) [8], and this points to the same optimal direction as the margin-based softmax. We refer to the new loss function as MSE for P2SGrad.…”
Section: New Mean-Square-Error Loss Function with P2SGrad
confidence: 82%
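
Based only on the quoted statement, a minimal sketch of such an MSE-on-cosine loss might look as follows; the function name and the 0/1 one-hot target are assumptions made for illustration, not code from either paper.

```python
# Sketch of the MSE-for-P2SGrad idea described in the quote: the "logit" is the
# cosine similarity itself and the loss is a mean-square error against a 0/1
# target, so the gradient w.r.t. each cosine is (cos θ_{j,k} - target_k),
# which is the P2SGrad direction.
import torch
import torch.nn.functional as F

def p2sgrad_mse_loss(cos_theta: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """cos_theta: (batch, num_classes) cosine similarities in [-1, 1].
    labels: (batch,) integer class indices."""
    targets = F.one_hot(labels, cos_theta.size(1)).float()  # 1 for the true class, 0 otherwise
    # 0.5 * sum of squared errors per sample; the 0.5 factor makes
    # d(loss)/d(cos_theta) = cos_theta - targets exactly.
    return 0.5 * ((cos_theta - targets) ** 2).sum(dim=1).mean()
```

Note that this formulation has no scale or margin hyper-parameter, which is the practical appeal the quoted paper attributes to P2SGrad.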
See 3 more Smart Citations
“…The discriminative power of the margin-based softmax has been demonstrated in a few speech [27] and image processing tasks [3], but its performance is sensitive to the hyper-parameter setting [8].…”
Section: New Mean-Square-Error Loss Function with P2SGrad
confidence: 99%
See 2 more Smart Citations
“…We further incorporated modifications to the above architecture pertaining to the face recognition domain. We set 112 × 112 image resolution for the teacher network whereas different resolutions of 96 × 96, …

IJB-C 1:1 TAR (in %) @ FAR
Method                10^-4    10^-5
Crystal Loss [25]     92.50    87.75
P2SGrad [31]          92.25    87.84
Fixed AdaCos [30]     92.35    87.87
Dynamic AdaCos [30]   92.40    88.03
ArcFace [8]           95.65    93.15
Ours                  96.39    94.20
…”
Section: Training Details
confidence: 99%
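
For readers unfamiliar with the metric reported in the table above, TAR@FAR for 1:1 verification can be computed from similarity scores as sketched below; this is a generic illustration under the usual definition of the metric, not the evaluation code used in any of the cited works.

```python
# Hedged sketch of TAR@FAR: the true-accept rate measured at the similarity
# threshold that yields a fixed false-accept rate (e.g. 1e-4 or 1e-5, matching
# the table columns).
import numpy as np

def tar_at_far(genuine_scores: np.ndarray, impostor_scores: np.ndarray, far: float) -> float:
    # threshold chosen so that a fraction `far` of impostor pairs score above it
    threshold = np.quantile(impostor_scores, 1.0 - far)
    # fraction of genuine pairs accepted at that threshold
    return float(np.mean(genuine_scores >= threshold))

# Example usage: tar_at_far(gen, imp, 1e-4) gives a value comparable in meaning
# to the "10^-4" column of the table (as a fraction rather than a percentage).
```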