A Deeply-Initialized Coarse-to-fine Ensemble of Regression Trees for Face Alignment

Valle, Roberto; Buenaposada, José Miguel; Valdés, Antonio; Baumela, Luis

doi:10.1007/978-3-030-01264-9_36

Cited by 92 publications

(70 citation statements)

References 36 publications

“…We can observe that simple "MobileFAN" performs better than the state-of-the-art SAN [4], but the number of model parameters of "MobileFAN" is 28× smaller than that of SAN (we can see from TABLE 5). Although "MobileFAN + KD" does not outperform DCFE [29], it achieves comparable results to LAB [30] with extra boundary information on 300W Full set and Common subset. Using the knowledge distillation, our two full models are better than their corresponding baselines.…”

Section: Comparison With State-of-the-art Methods (mentioning)

confidence: 94%

“…However, both the two methods rely on the Hourglass, resulting in introducing a large number of parameters. Valle et al [29] used a simple CNN to generate heatmaps of landmark locations for a better initialization to Ensemble of Regression Trees (ERT) regressor.…”

Section: Facial Landmark Detection (mentioning)

confidence: 99%

“…Significant improvements via deep Convolutional Neural Networks (CNNs) have been achieved on facial landmark detection recently [30,4,29], even though it remains a very challenging task when dealing with faces in real-world conditions (e.g., faces with unconstrained large pose variations and heavy occlusions). In order to guarantee promising performance in face alignment benchmarks, the majority of those works are designed to adopt large backbones (e.g., Hourglass [21] and ResNet-50 [10]), carefully designed schemes (e.g., a coarse-to-fine cascade regression framework [28]), or adding extra face structure information (e.g., face boundary information [30]).…”

Section: Introduction (mentioning)

confidence: 99%

MobileFAN: Transferring deep hidden representation for face alignment

Zhao, Liu, Shen, et al. 2020

Pattern Recognition

Facial landmark detection is a crucial prerequisite for many face analysis applications. Deep learning-based methods currently dominate the approach of addressing the facial landmark detection. However, such works generally introduce a large number of parameters, resulting in high memory cost. In this paper, we aim for a lightweight as well as effective solution to facial landmark detection. To this end, we propose an effective lightweight model, namely Mobile Face Alignment Network (MobileFAN), using a simple backbone MobileNetV2 as the encoder and three deconvolutional layers as the decoder. The proposed Mobile-FAN, with only 8% of the model size and lower computational cost, achieves superior or equivalent performance compared with state-of-the-art models. Moreover, by transferring the geometric structural information of a face graph from a large complex model to our proposed MobileFAN through feature-aligned distillation and feature-similarity distillation, the performance of MobileFAN is further improved in effectiveness and efficiency for face alignment. Extensive experiment results on three challenging facial landmark estimation benchmarks including COFW, 300W and WFLW show the superiority of our proposed Mobile-FAN against state-of-the-art methods.

“…All faces are annotated by up to 21 landmarks per image, while the occluded landmarks were not labeled. For fair comparison with other methods we adopt the protocol from [76], which provides revised annotations with 19

[table rows as extracted; column headers not preserved]

[75]: 3.92, 2.68
CCL (CVPR 16) [77]: 2.72, 2.17
TSR (CVPR 17) [41]: 2.17, -
DAC-OSR (CVPR 17) [19]: 2.27, 1.81
DCFE (ECCV 18) [59]: 2.17, -
CPM+SBR (CVPR 18) [15]: 2.14, -
SAN (CVPR 18) [14]: 1.91, 1.85
DSRN (CVPR 18) [46]: 1.86, -
LAB (CVPR 18) [62]: 1.85, 1.62
Wing (CVPR 18) [18]: 1.65, -
RCN + (L+ELT+A) (CVPR 18) [26]…”

Section: Evaluation On AFLW (mentioning)

confidence: 99%

Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression

Wang, Bo, 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

Heatmap regression with a deep network has become one of the mainstream approaches to localize facial landmarks. However, the loss function for heatmap regression is rarely studied. In this paper, we analyze the ideal loss function properties for heatmap regression in face alignment problems. Then we propose a novel loss function, named Adaptive Wing loss, that is able to adapt its shape to different types of ground truth heatmap pixels. This adaptability penalizes loss more on foreground pixels while less on background pixels. To address the imbalance between foreground and background pixels, we also propose Weighted Loss Map, which assigns high weights on foreground and difficult background pixels to help training process focus more on pixels that are crucial to landmark localization. To further improve face alignment accuracy, we introduce boundary prediction and CoordConv with boundary coordinates. Extensive experiments on different benchmarks, including COFW, 300W and WFLW, show our approach outperforms the state-of-the-art by a significant margin on various evaluation metrics. Besides, the Adaptive Wing loss also helps other heatmap regression tasks. Code will be made publicly available at https://github.com/protossw512/AdaptiveWingLoss.

“…The error (NME) is normalized by the face bounding box size.

Method                 AFLW-Full (%)
LBF [20]               4.25
CFSS [32]              3.92
CCL (CVPR16) [33]      2.72
TSR (CVPR17) [13]      2.17
DCFE (ECCV18) [25]     2.17
SBR (CVPR18) [6]       2.14
DSRN (CVPR18) [16]     1.86
Wing (CVPR18) [7]      1.65
HGs                    1.95
HGs + SA               1.62
HGs + SA + GHCU        1.60

GHCU considers the global face shape as constraint, being robust to such challenging factors.…”

Section: Comparison Experiments (mentioning)

confidence: 99%
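
For readers unfamiliar with the metric in the statement above, the following is a minimal sketch of a normalized mean error (NME) computed against the face bounding-box size. It is an illustration only, not the cited authors' evaluation code: the function name `nme_bbox`, the `(x_min, y_min, x_max, y_max)` box layout, and the choice of sqrt(width × height) as the "size" are all assumptions, since the quoted text does not define them.

```python
import numpy as np


def nme_bbox(pred, gt, bbox):
    """Mean point-to-point landmark error, as a percentage of the box size.

    pred, gt : (N, 2) sequences of predicted / ground-truth (x, y) landmarks.
    bbox     : (x_min, y_min, x_max, y_max) of the face box (assumed layout).

    Assumption: the box "size" is sqrt(width * height); other conventions
    (e.g. box width or diagonal) exist and would change the number.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    x_min, y_min, x_max, y_max = bbox
    size = np.sqrt((x_max - x_min) * (y_max - y_min))
    # Euclidean distance between each predicted and ground-truth landmark.
    per_point = np.linalg.norm(pred - gt, axis=1)
    return 100.0 * per_point.mean() / size


# Hypothetical example values, chosen only to show the call.
gt_points = [[10.0, 10.0], [20.0, 20.0]]
pred_points = [[13.0, 14.0], [20.0, 20.0]]
print(nme_bbox(pred_points, gt_points, (0.0, 0.0, 100.0, 100.0)))
```

Because the result depends directly on the normalizer, figures normalized by bounding-box size are not comparable with those normalized by inter-ocular or inter-pupil distance.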

Semantic Alignment: Finding Semantically Consistent Ground-Truth for Facial Landmark Detection

Liu, Zhu, et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Recently, deep learning based facial landmark detection has achieved great success. Despite this, we notice that the semantic ambiguity greatly degrades the detection performance. Specifically, the semantic ambiguity means that some landmarks (e.g. those evenly distributed along the face contour) do not have clear and accurate definition, causing inconsistent annotations by annotators. Accordingly, these inconsistent annotations, which are usually provided by public databases, commonly work as the groundtruth to supervise network training, leading to the degraded accuracy. To our knowledge, little research has investigated this problem. In this paper, we propose a novel probabilistic model which introduces a latent variable, i.e. the 'real' ground-truth which is semantically consistent, to optimize. This framework couples two parts (1) training landmark detection CNN and (2) searching the 'real' groundtruth. These two parts are alternatively optimized: the searched 'real' ground-truth supervises the CNN training; and the trained CNN assists the searching of 'real' groundtruth. In addition, to recover the unconfidently predicted landmarks due to occlusion and low quality, we propose a global heatmap correction unit (GHCU) to correct outliers by considering the global face shape as a constraint. Extensive experiments on both image-based (300W and AFLW) and video-based (300-VW) databases demonstrate that our method effectively improves the landmark detection accuracy and achieves the state of the art performance.
