Facial alignment involves finding a set of landmark points on an image with a known semantic meaning. However, this semantic meaning is often lost in 2D approaches, where landmarks are either moved to visible boundaries or ignored as the pose of the face changes. To extract consistent alignment points across large poses, the 3D structure of the face must be considered in the alignment step; however, extracting 3D structure from a single 2D image usually requires alignment in the first place. We present a novel approach that simultaneously extracts the 3D shape of the face and a semantically consistent 2D alignment through a 3D Spatial Transformer Network (3DSTN), which models both the camera projection matrix and the warping parameters of a 3D model. By utilizing a generic 3D model and a Thin Plate Spline (TPS) warping function, we are able to generate subject-specific 3D shapes without the need for a large 3D shape basis. In addition, the proposed network can be trained end to end on entirely synthetic data from the 300W-LP dataset. Unlike other 3D methods, our approach requires only one pass through the network, resulting in faster-than-real-time alignment. Evaluations on the Annotated Facial Landmarks in the Wild (AFLW) and AFLW2000-3D datasets show that our method achieves state-of-the-art performance among 3D approaches to alignment.
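The abstract above relies on a Thin Plate Spline warping function. As a rough illustration of the idea (not the paper's 3D network), the sketch below applies a 2D TPS warp given control points, precomputed per-dimension weights, and an affine part; the function names and the interface are hypothetical.

```python
import math

def tps_kernel(r):
    # TPS radial basis U(r) = r^2 * log(r), defined as 0 at r = 0
    return 0.0 if r == 0.0 else r * r * math.log(r)

def tps_warp(point, controls, weights, affine):
    # Warp a 2D point: affine part plus a kernel-weighted sum over control points.
    # affine[d] = (a0, ax, ay) for output dimension d; weights[d][i] pairs with controls[i].
    x, y = point
    out = []
    for d in range(2):
        a0, ax, ay = affine[d]
        val = a0 + ax * x + ay * y
        for (cx, cy), w in zip(controls, weights[d]):
            val += w * tps_kernel(math.hypot(x - cx, y - cy))
        out.append(val)
    return tuple(out)
```

With all kernel weights set to zero and an identity affine part, the warp reduces to the identity map, which is a handy sanity check; the paper instead learns such parameters inside the 3DSTN to deform a generic 3D face model.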
In this paper, we have proposed a robust, acceleration-based, pace-independent gait recognition framework using Android smartphones. In extensive experiments using cyclostationarity and continuous wavelet transform (CWT) spectrogram analysis on our gait acceleration database, which contains both normal- and fast-paced data, the proposed algorithm outperforms the state of the art by a large margin. Specifically, for normal-to-normal pace matching we achieve a 99.4% verification rate (VR) at 0.1% false accept rate (FAR); for fast-to-fast matching, 96.8% VR at 0.1% FAR; and for the challenging normal-to-fast matching, we still achieve 61.1% VR at 0.1% FAR. These findings lay the foundation for accurate, pace-independent gait recognition on mobile devices.
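Cyclostationarity analysis of gait acceleration typically starts by estimating the gait cycle length. A minimal sketch of that step, assuming a simple autocorrelation-peak criterion (the paper's actual pipeline is more elaborate), could look like this; the function names are illustrative.

```python
def autocorr(signal, lag):
    # Unnormalized autocorrelation of a 1D signal at a given lag
    n = len(signal)
    mean = sum(signal) / n
    return sum((signal[i] - mean) * (signal[i + lag] - mean)
               for i in range(n - lag))

def estimate_gait_period(signal, min_lag, max_lag):
    # Pick the lag with the strongest autocorrelation peak: a crude
    # cyclostationarity-based estimate of the gait cycle length in samples
    return max(range(min_lag, max_lag + 1), key=lambda k: autocorr(signal, k))
```

Segmenting the acceleration stream into cycles of this estimated length is what makes matching across walking paces tractable, since cycles can then be resampled to a common length before spectrogram comparison.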
Robust face detection is one of the most important preprocessing steps supporting facial expression analysis, facial landmarking, face recognition, pose estimation, the building of 3D facial models, etc. Although this topic has been intensely studied for decades, it remains challenging due to the many variations of face images in real-world scenarios. In this paper, we present a novel approach named Multiple Scale Faster Region-based Convolutional Neural Network (MS-FRCNN) to robustly detect human facial regions in images collected under various challenging conditions, e.g. large occlusions, extremely low resolutions, facial expressions, strong illumination variations, etc. The proposed approach is benchmarked on two challenging face detection databases, i.e. the WIDER FACE database and the Face Detection Dataset and Benchmark (FDDB), and compared against other recent face detection methods, e.g. Two-stage CNN, Multi-scale Cascade CNN, Faceness, Aggregate Channel Features, HeadHunter, Multi-view Face Detection, Cascade CNN, etc. The experimental results show that our proposed approach consistently achieves results highly competitive with the state of the art.
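Multi-scale detectors of this kind commonly combine feature vectors drawn from several network layers, normalizing each scale before concatenation so that no single scale dominates by magnitude. The sketch below illustrates only that normalize-and-concatenate idea in plain Python; it is not the MS-FRCNN implementation, and the function names are hypothetical.

```python
import math

def l2_normalize(features):
    # Scale one scale's feature vector to unit L2 norm
    norm = math.sqrt(sum(f * f for f in features))
    return [f / norm for f in features] if norm > 0 else list(features)

def fuse_multiscale(feature_maps):
    # Normalize each scale's features independently, then concatenate
    # into a single fused descriptor for the downstream classifier
    fused = []
    for feats in feature_maps:
        fused.extend(l2_normalize(feats))
    return fused
```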
In this paper we explore how a Radio Frequency Impedance Interrogation (RFII) signal may be used as a biometric feature, which could allow the identification of subjects in operational and potentially hostile environments. Features extracted from the continuous and discrete wavelet decompositions of the signal are investigated for biometric identification. In the former case, the most discriminative features in the wavelet space are selected using a Fisher ratio metric, and comparisons in the wavelet space are made using the Euclidean distance. In the latter case, the signal is decomposed at various levels using different wavelet bases, in order to extract both low-frequency and high-frequency components; comparisons at each decomposition level are performed using the same distance measure. The dataset consists of four subjects, each with a 15-minute RFII recording; the data samples for our experiments, each corresponding to a single heartbeat duration, were extracted from these recordings. We achieve identification rates of up to 99% using the CWT approach and up to 100% using the DWT approach. While the small size of the dataset limits the interpretation of these results, further work with larger datasets is expected to yield better algorithms for subject identification.
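The Fisher ratio selection mentioned above scores each feature by how well it separates classes. A minimal sketch, assuming the common two-class form (mean difference squared over the sum of per-class variances; variants with pooled variance also exist), with hypothetical helper names:

```python
def fisher_ratio(class_a, class_b):
    # Fisher ratio for one feature: (mean_a - mean_b)^2 / (var_a + var_b)
    def stats(xs):
        m = sum(xs) / len(xs)
        v = sum((x - m) ** 2 for x in xs) / len(xs)
        return m, v
    ma, va = stats(class_a)
    mb, vb = stats(class_b)
    return (ma - mb) ** 2 / (va + vb)

def top_k_features(samples_a, samples_b, k):
    # Rank features (columns of the sample matrices) by Fisher ratio
    # and return the indices of the k most discriminative ones
    n = len(samples_a[0])
    scores = [fisher_ratio([s[i] for s in samples_a],
                           [s[i] for s in samples_b]) for i in range(n)]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
```

Applied per wavelet coefficient, this keeps only the coefficients whose distributions differ most between subjects, which is what makes the subsequent Euclidean-distance comparison effective in a reduced feature space.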