2016 Fourth International Conference on 3D Vision (3DV) 2016
DOI: 10.1109/3dv.2016.56

3D Face Reconstruction by Learning from Synthetic Data

Abstract: Fast and robust three-dimensional reconstruction of facial geometric structure from a single image is a challenging task with numerous applications. Here, we introduce a learning-based approach for reconstructing a three-dimensional face from a single image. Recent face recovery methods rely on accurate localization of key characteristic points. In contrast, the proposed approach is based on a Convolutional-Neural-Network (CNN) which extracts the face geometry directly from its image. Although such deep archite…

Cited by 311 publications (291 citation statements)
References 30 publications
“…One of the main challenges is in how to collect enough training images labelled with their corresponding 3D faces, to feed the network. Richardson et al [118], [119] generate synthetic training data by drawing random samples from the morphable model and rendering the resulting faces. However, a network trained on purely synthetic data may perform poorly when faced with occlusions, unusual lighting, or ethnicities that are not well represented.…”
Section: Training and Supervision (mentioning)
confidence: 99%
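The sampling step described in this statement can be illustrated with a minimal sketch. It assumes a linear morphable model given by a mean shape, a basis of principal components, and per-component standard deviations, with coefficients drawn from a zero-mean Gaussian. The function name `sample_morphable_faces`, the toy dimensions, and the random toy model are hypothetical and not taken from the cited work; the rendering step is omitted.

```python
import numpy as np

def sample_morphable_faces(mean_shape, basis, stddevs, n_samples, rng):
    """Draw random face geometries from a linear morphable model.

    mean_shape: (3V,) flattened xyz coordinates of the mean face
    basis:      (3V, K) principal components (assumed linear model)
    stddevs:    (K,) standard deviation of each component
    """
    # Assumed sampling scheme: independent zero-mean Gaussian coefficients,
    # scaled by the standard deviation of each component.
    coeffs = rng.standard_normal((n_samples, basis.shape[1])) * stddevs
    # Each face is the mean shape plus a linear combination of the basis.
    shapes = mean_shape + coeffs @ basis.T
    return coeffs, shapes.reshape(n_samples, -1, 3)

# Toy stand-in for a real morphable model (random values, illustration only).
rng = np.random.default_rng(0)
num_vertices, num_components = 100, 10
mean_shape = rng.standard_normal(3 * num_vertices)
basis = rng.standard_normal((3 * num_vertices, num_components))
stddevs = np.linspace(1.0, 0.1, num_components)

coeffs, faces = sample_morphable_faces(mean_shape, basis, stddevs, 5, rng)
```

Each sampled face would then be rendered to an image, and the (image, coefficients) pairs would serve as labelled training data.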
“…which can be understood as the combination of the Geometric Mean Squared Error (GMSE) defined in [23] and used for learning the geometry, and the cost defined in [20] and used for learning the camera pose. The best models trained with L_Coarse and L_XQT are obtained after a Bayesian optimization to estimate the learning rate and α and {β, γ}, respectively.…”
Section: Quantitative Evaluation (mentioning)
confidence: 99%
“…Recent works use CNN frameworks to recover detailed 3D models from a single input photograph [30,20,21,25]. To represent the solution space a linear 3DMM is used, which restricts the solution from being very detailed.…”
Section: Related Work (mentioning)
confidence: 99%
“…The regression from the 2D heightmap to the model coefficients is implemented using ResNet-18 [14], a state-of-the-art architecture which has recently shown very good performances in face-related problems [20,25]. The CNN reduces the image to a 256-dimensional vector, after which three fully-connected layers perform the regression towards the coefficient vector w T of the specified dimensions.…”
Section: CNN Encoder (mentioning)
confidence: 99%
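The regression stage in this statement (a 256-dimensional feature vector followed by three fully-connected layers) might look roughly like the sketch below. Only the 256-dimensional input and the count of three layers come from the statement. The hidden width of 128, the output size of 50, the ReLU activations, and the names `init_head` and `regress_coefficients` are assumptions made for illustration, and the ResNet-18 encoder itself is replaced by random features.

```python
import numpy as np

def init_head(rng, in_dim=256, hidden_dim=128, out_dim=50):
    """Create weights for three fully-connected layers (sizes are assumed)."""
    dims = [in_dim, hidden_dim, hidden_dim, out_dim]
    return [
        (rng.standard_normal((d_in, d_out)) * np.sqrt(2.0 / d_in), np.zeros(d_out))
        for d_in, d_out in zip(dims[:-1], dims[1:])
    ]

def regress_coefficients(features, layers):
    """Map encoder features of shape (N, 256) to coefficient vectors."""
    x = features
    for i, (weights, bias) in enumerate(layers):
        x = x @ weights + bias
        # ReLU on the hidden layers only (assumed); the last layer stays
        # linear so the regressed coefficients can take any sign.
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

rng = np.random.default_rng(0)
layers = init_head(rng)
features = rng.standard_normal((4, 256))  # stand-in for encoder output
coefficients = regress_coefficients(features, layers)
```

In practice the weights would be learned jointly with the encoder rather than left at their random initial values.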