Continuous Conditional Neural Fields for Structured Regression

Baltrušaitis, Tadas; Robinson, Peter; Morency, Louis-Philippe

doi:10.1007/978-3-319-10593-2_39

Cited by 75 publications

(63 citation statements)

References 29 publications

“…1 shows an overview of our GazeNet method based on a multimodal convolutional neural network (CNN). We first use state-of-the-art face detection [64] and facial landmark detection [62] methods to locate landmarks in the input image obtained from the calibrated monocular RGB camera. We then fit a generic 3D facial shape model to estimate 3D poses of the detected faces and apply the space normalisation technique proposed in [6] to crop and warp the head pose and eye images to the normalised training space.…”

Section: Facial Landmark Annotation (mentioning)

confidence: 99%

“…We discard all images in which the detector fails to find any face, which happened in about 5% of all cases. Afterwards, we use a continuous conditional neural fields (CCNF) model framework to detect facial landmarks [62]. While previous works assumed accurate head poses, we use a generic mean facial shape model F for the 3D pose estimation to evaluate the whole gaze estimation pipeline in a practical setting.…”

Section: Face Alignment and 3D Head Pose Estimation (mentioning)

confidence: 99%

“…Our method is based on the 16-layer VGGNet architecture [66] that includes 13 convolutional layers, two fully connected layers, and one classification layer with five max pooling layers in between. Following prior work on face [62], [67] and gaze [41], [68] analysis, we use a grey-scale single channel image as input with a resolution of 60 × 36 pixels. We changed the stride of the first and second pooling layer from two to one to reflect the smaller input resolution.…”

Section: GazeNet Architecture (mentioning)

confidence: 99%

MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation

Zhang, Sugano, Fritz, et al. 2019. IEEE Trans. Pattern Anal. Mach. Intell.

417

378

Abstract: Learning-based methods are believed to work well for unconstrained gaze estimation, i.e. gaze estimation from a monocular RGB camera without assumptions regarding user, environment, or camera. However, current gaze datasets were collected under laboratory conditions and methods were not evaluated across multiple datasets. Our work makes three contributions towards addressing these limitations. First, we present the MPIIGaze dataset, which contains 213,659 full face images and corresponding ground-truth gaze positions collected from 15 users during everyday laptop use over several months. An experience sampling approach ensured continuous gaze and head poses and realistic variation in eye appearance and illumination. To facilitate cross-dataset evaluations, 37,667 images were manually annotated with eye corners, mouth corners, and pupil centres. Second, we present an extensive evaluation of state-of-the-art gaze estimation methods on three current datasets, including MPIIGaze. We study key challenges including target gaze range, illumination conditions, and facial appearance variation. We show that image resolution and the use of both eyes affect gaze estimation performance, while head pose and pupil centre information are less informative. Finally, we propose GazeNet, the first deep appearance-based gaze estimation method. GazeNet improves on the state of the art by 22% (from a mean error of 13.9 degrees to 10.8 degrees) for the most challenging cross-dataset evaluation.

“…This can be done by using a shape model to either restrict the search region (see [20]), or by correcting the estimates obtained during the local search. Typical shape models include the Constrained Local Model (CLM) [3], [5], [10], [12], [28], the tree-structured model [18], [41], [44], [46]. Other optimization search methods are also applied to search for the best combination of the multiple local candidates, e.g.…”

Section: Related Work (mentioning)

confidence: 99%

Robust Face Alignment Under Occlusion via Regional Predictive Power Estimation

Yang, Jia, et al. 2015. IEEE Trans. on Image Process.

Face alignment has been well studied in recent years, however, when a face alignment model is applied on facial images with heavy partial occlusion, the performance deteriorates significantly. In this paper, instead of training an occlusion-aware model with visibility annotation, we address this issue via a model adaptation scheme that uses the result of a local regression forest (RF) voting method. In the proposed scheme, the consistency of the votes of the local RF in each of several oversegmented regions is used to determine the reliability of predicting the location of the facial landmarks. The latter is what we call regional predictive power (RPP). Subsequently, we adapt a holistic voting method (cascaded pose regression based on random ferns) by putting weights on the votes of each fern according to the RPP of the regions used in the fern tests. The proposed method shows superior performance over existing face alignment models in the most challenging data sets (COFW and 300-W). Moreover, it can also estimate with high accuracy (72.4% overlap ratio) which image areas belong to the face or nonface objects, on the heavily occluded images of the COFW data set, without explicit occlusion modeling.

“…Due to a relatively small number of parameters, the optimization could be done jointly with Conditional Random Fields (CRFs). Another implementation of CRFs with NNs, a structured regression model using continuous outputs, called Continuous Conditional Neural Fields (CCNFs), has been proposed in [22]. However, these methods fail to account for ordinal information inherent to the intensity levels.…”

Section: Modelling Approaches (mentioning)

confidence: 99%

Neural conditional ordinal random fields for agreement level estimation

Rakicevic, Rudovic, Petridis, et al. 2015. 2015 International Conference on Affective Computing and Intelligent Interaction (ACII)

Abstract: We present a novel approach to automated estimation of agreement intensity levels from facial images. To this end, we employ the MAHNOB Mimicry database of subjects recorded during dyadic interactions, where the facial images are annotated in terms of agreement intensity levels using the Likert scale (strong disagreement, disagreement, neutral, agreement and strong agreement). Dynamic modelling of the agreement levels is accomplished by means of a Conditional Ordinal Random Field model. Specifically, we propose a novel Neural Conditional Ordinal Random Field model that performs non-linear feature extraction from face images using the notion of Neural Networks, while also modelling temporal and ordinal relationships between the agreement levels. We show in our experiments that the proposed approach outperforms existing methods for modelling of sequential data. The preliminary results obtained on five subjects demonstrate that the intensity of agreement can successfully be estimated from facial images (39% F1 score) using the proposed method.
