2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00556

FAB: A Robust Facial Landmark Detection Framework for Motion-Blurred Videos

Abstract: Recently, facial landmark detection algorithms have achieved remarkable performance on static images. However, these algorithms are neither accurate nor stable in motion-blurred videos. The loss of structure information makes it difficult for state-of-the-art facial landmark detection algorithms to yield good results. In this paper, we propose a framework named FAB that takes advantage of structure consistency in the temporal dimension for facial landmark detection in motion-blurred videos. A structure predic…

Cited by 32 publications (26 citation statements)
References 41 publications
“…The proposed method is compared to other state-of-the-art landmark localization methods. From these approaches, coordinate regression methods include SDM [35], TSCN [26], IFA [1], CFSS [40], TCDCN [38], TSTN [14], DSRN [17], ODN [39], STA [30], Sun et al.'s work [27] and GAN [36]. Heatmap regression methods include Newell et al.'s work [18], SAN [9], LAB [32], CNN-CRF [5], LaplaceKL [23], Sun et al.'s work [28], DSNT [19], DARK [37], FHR [30], GHCU [15] [10,11,21] for landmark detection are trained under different conditions with our method so are not included in our comparison.…”
Section: Comparison With State-of-the-art Methods (mentioning)
confidence: 99%
“…High temporal resolution: ECs can provide up to MHz sampling resolution with high speed input motion, and with proportionally low latency. The stream of events from ECs does not suffer from motion blur [54], which is often observed in images of fast head rotations, on the mouth during speech, or due to camera motion, avoiding the need to implement costly de-blurring in face alignment [12]. ECs are therefore suitable in applications where motion provides the most relevant information, as in facial action recognition, voice activity detection and visual speech recognition, that must be robust to face pose variations.…”
Section: Discussion (mentioning)
confidence: 99%
“…At first, the detection of pose change activates the alignment, preventing unnecessary processing when the head doesn’t move. Alignment is then performed by regression cascade of tree ensembles, exploiting their superior computational efficiency [9, 10] with respect to (possibly more accurate) state-of-the-art alignment methods based on deep neural networks (DNNs) [5, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]. In a scenario where energy efficiency is a matter of the utmost importance, a DNN-based alignment pre-processor might eclipse the energy efficiency advantage of ECs.…”
Section: Introduction (mentioning)
confidence: 99%
“…This is because FHR does not consider inter-frame temporal dependency, so it has difficulties addressing the problem of heavy occlusions in motion. FAB [40] aims to handle motion-blurred videos by utilizing eight residual blocks to build an hourglass network for predicting boundary maps. Additionally, two convolutional layers and four residual blocks are used to generate a de-blurred sharp image, which form a pre-activated ResNet-18 as FAB's replaceable facial landmark detection network for landmark detection.…”
Section: Related Work (mentioning)
confidence: 99%