Head pose estimation methods evaluate the amount of head rotation about two or three axes, with the aim of optimizing the face acquisition process or extracting neutral-pose frames from a video sequence. Most approaches to pose estimation exploit machine-learning techniques that require a training phase on a large number of positive and negative examples. In this paper, a novel pose estimation method that exploits a quad-tree-based representation of facial features is described. The locations of a set of landmarks detected over the face image guide its subdivision into smaller and smaller quadrants, based on the presence or absence of landmarks within each quadrant. The proposed pose descriptor is both effective and efficient, providing accurate yaw, pitch, and roll estimates in near real time, without the need for any training or prior knowledge about the subject. Experiments conducted on both the BIWI Kinect Head Pose Database and the challenging Annotated Facial Landmarks in the Wild (AFLW) dataset show a pose estimation precision exceeding the state of the art among methods that do not involve training or machine learning.
INDEX TERMS Biometrics, face recognition, image analysis.
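The landmark-guided subdivision described above can be sketched as follows. This is a minimal illustration under assumed details (recursion depth, quadrant ordering, and the binary encoding are choices made here for clarity, not the authors' exact specification): the face bounding box is recursively split into four quadrants, and each quadrant contributes one bit recording whether any landmark falls inside it.

```python
def quadtree_descriptor(landmarks, box, depth):
    """Hypothetical quad-tree pose descriptor sketch.

    landmarks: list of (x, y) landmark coordinates
    box:       (x0, y0, x1, y1) region of the face image
    depth:     number of subdivision levels below this node
    Returns a flat list of presence bits, one per visited quadrant.
    """
    x0, y0, x1, y1 = box
    # keep only the landmarks lying inside this quadrant
    inside = [(x, y) for (x, y) in landmarks if x0 <= x < x1 and y0 <= y < y1]
    bits = [1 if inside else 0]  # presence/absence bit for this node
    if depth > 0:
        xm, ym = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        # recurse into the four sub-quadrants (NW, NE, SW, SE)
        for sub in [(x0, y0, xm, ym), (xm, y0, x1, ym),
                    (x0, ym, xm, y1), (xm, ym, x1, y1)]:
            bits += quadtree_descriptor(inside, sub, depth - 1)
    return bits
```

For example, a single landmark near the top-left of a 100x100 face box at depth 1 yields `[1, 1, 0, 0, 0]`: the root bit plus one set bit for the north-west quadrant. The resulting bit string varies with head rotation because rotation redistributes the detected landmarks across quadrants, which is the intuition the descriptor relies on.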
Despite the success achieved in face detection and recognition over the last ten years of research, the analysis of facial attributes remains a trending topic. Setting full-face recognition aside, exploring the potential of soft biometric traits, i.e., individual facial traits such as the nose, the mouth, and the hair, is still considered a fruitful field of investigation. The ability to infer the identity of an occluded face, whether voluntarily occluded by sunglasses or accidentally occluded by environmental factors, can be useful in a wide range of operational settings where user collaboration cannot be assumed. This is especially true in forensic scenarios, in which it is not unusual to have partial face photos or partial fingerprints. In this paper, an unsupervised clustering approach is described. It consists of a neural network model for facial attribute recognition based on transfer learning, whose goal is to group faces according to common facial features. Moreover, we use the features collected in each cluster to provide a compact and comprehensive description of the faces belonging to it, and we employ deep learning as a means for task prediction on partially visible faces.
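The grouping step of such a pipeline can be sketched as below. This is a hedged, simplified illustration, not the authors' pipeline: in the paper, face embeddings come from a pretrained network via transfer learning, whereas here small hand-made vectors stand in for those embeddings, and a plain k-means loop (one of the standard choices for unsupervised grouping) assigns each face to a cluster of visually similar ones.

```python
import numpy as np

def kmeans(features, k, iters=50, seed=0):
    """Toy k-means over face embeddings (stand-in for the real pipeline).

    features: (n, d) array of per-face feature vectors
    k:        number of clusters of "similar" faces
    Returns (labels, centers): a cluster index per face, and the centroids.
    """
    rng = np.random.default_rng(seed)
    # initialise centroids from k randomly chosen embeddings
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # assign each face to its nearest centroid (Euclidean distance)
        d = np.linalg.norm(features[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned faces
        for j in range(k):
            if (labels == j).any():
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers
```

With real embeddings in place of the toy vectors, each resulting cluster collects faces sharing common attributes, and the cluster centroid serves as the compact description of its members mentioned above.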