In this paper we present the results of the Unconstrained Ear Recognition Challenge (UERC), a group benchmarking effort centered around the problem of person recognition from ear images captured in uncontrolled conditions. The goal of the challenge was to assess the performance of existing ear recognition techniques on a challenging large-scale dataset and to identify open problems that need to be addressed in the future. Five groups from three continents participated in the challenge and contributed six ear recognition techniques for the evaluation, while multiple baselines were made available by the UERC organizers. A comprehensive analysis was conducted with all participating approaches, addressing essential research questions pertaining to the sensitivity of the technology to head rotation, flipping, gallery size, large-scale recognition, and other factors. The top performer of the UERC was found to deliver robust performance on a smaller part of the dataset (with 180 subjects) regardless of image characteristics, but still exhibited a significant performance drop when the entire dataset comprising 3,704 subjects was used for testing.
Here, the authors extensively investigate the unconstrained ear recognition problem. They first show the importance of domain adaptation when deep convolutional neural network (CNN) models are used for ear recognition. To enable domain adaptation, the authors collected a new ear dataset from the Multi-PIE face dataset, which they named the Multi-PIE ear dataset. They analyse in depth the effect of ear image quality, for example illumination and aspect ratio, on classification performance. Finally, they address the problem of dataset bias in the ear recognition field. Experiments on the UERC dataset show that domain adaptation leads to a significant performance improvement: for example, when the VGG-16 model is used with domain adaptation, an absolute increase of around 10% is achieved. Combining different deep CNN models further improves the accuracy by 4%. In the experiments conducted to examine dataset bias, given an ear image, the authors were able to identify the dataset it came from with 99.71% accuracy, which indicates a strong bias among ear recognition datasets.
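The dataset-bias experiment described above trains a classifier to predict which dataset an ear image came from; high accuracy means the datasets carry distinctive acquisition signatures. The following is a minimal illustrative sketch of that idea, not the authors' actual setup: it substitutes synthetic feature vectors with dataset-specific statistics for real CNN features, and a nearest-centroid rule for the CNN classifier.

```python
import numpy as np

# Synthetic stand-in for features from two ear "datasets" whose acquisition
# conditions differ (the offset in the mean mimics dataset-specific
# illumination/resolution statistics). All names and values are illustrative.
rng = np.random.default_rng(0)
feats_a = rng.normal(loc=0.0, scale=1.0, size=(200, 64))  # "dataset A"
feats_b = rng.normal(loc=1.5, scale=1.0, size=(200, 64))  # "dataset B"

X = np.vstack([feats_a, feats_b])
y = np.array([0] * 200 + [1] * 200)

# Random half/half train/test split.
idx = rng.permutation(len(X))
train, test = idx[:200], idx[200:]

# Nearest-centroid "dataset classifier": assign each test feature to the
# closer of the two per-dataset training centroids.
centroids = np.stack([X[train][y[train] == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)

accuracy = (pred == y[test]).mean()
print(f"dataset-identification accuracy: {accuracy:.2%}")
```

Even this trivial classifier separates the two sources almost perfectly once their statistics differ, which is the essence of the bias finding: if datasets were unbiased, dataset identity would not be predictable from the image at all.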
In this paper, we present a detailed analysis of extracting soft biometric traits, namely age and gender, from ear images. Although there have been a few previous studies on gender classification using ear images, to the best of our knowledge, this is the first work on age classification from ear images. In the study, we utilize both geometric and appearance-based features for ear representation. The geometric features are based on eight anthropometric landmarks and consist of 14 distance measurements and two area calculations. The appearance-based methods employ deep convolutional neural networks for representation and classification. The well-known convolutional neural network models AlexNet, VGG-16, GoogLeNet, and SqueezeNet were adopted for the study. They were fine-tuned on a large-scale ear dataset built from the profile and close-to-profile face images in the Multi-PIE face dataset; in this way, we performed domain adaptation. The updated models were then fine-tuned once more on the small-scale target ear dataset, which contains only around 270 training images. According to the experimental results, the appearance-based methods are superior to the methods based on geometric features. We achieved 94% accuracy for gender classification, whereas 52% accuracy was obtained for age classification. These results indicate that ear images provide useful cues for age and gender classification; however, further work is required for age estimation.
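The geometric descriptor described above (eight anthropometric landmarks, 14 distances, two areas) can be sketched as follows. Note the specific landmark pairs and area regions below are illustrative assumptions, not the authors' exact anthropometric definitions, which the abstract does not enumerate.

```python
import numpy as np

def polygon_area(points):
    """Shoelace formula for the area of a 2-D polygon given as an (N, 2) array."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

def geometric_features(landmarks):
    """Build a 16-D descriptor (14 distances + 2 areas) from eight (x, y) landmarks.

    The landmark pairs and area regions are hypothetical placeholders for the
    anthropometric measurements used in the paper.
    """
    assert landmarks.shape == (8, 2)
    # 14 illustrative pairwise distances (out of the 28 possible pairs).
    pairs = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7),
             (1, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7), (6, 7)]
    dists = [np.linalg.norm(landmarks[i] - landmarks[j]) for i, j in pairs]
    # Two area terms: the full landmark outline and a sub-polygon of it.
    areas = [polygon_area(landmarks), polygon_area(landmarks[:4])]
    return np.array(dists + areas)

# Usage on dummy landmarks: eight points on the unit circle (a regular octagon).
theta = np.linspace(0, 2 * np.pi, 8, endpoint=False)
lm = np.stack([np.cos(theta), np.sin(theta)], axis=1)
feat = geometric_features(lm)
print(feat.shape)  # (16,)
```

Such a fixed-length vector can then be fed to any standard classifier; the paper's finding is that learned CNN features outperform this kind of hand-crafted descriptor.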
In this paper, we present multimodal deep neural network frameworks for age and gender classification that take as input a profile face image as well as an ear image. Our main objective is to enhance the accuracy of soft biometric trait extraction from profile face images by additionally utilizing a promising biometric modality: ear appearance. For this purpose, we provide end-to-end multimodal deep learning frameworks and explore different multimodal strategies employing data-, feature-, and score-level fusion. To increase the representation and discrimination capability of the deep neural networks, we benefit from domain adaptation and employ center loss in addition to softmax loss. We conducted extensive experiments on the UND-F, UND-J2, and FERET datasets. The experimental results indicate that profile face images contain a rich source of information for age and gender classification. We found that the presented multimodal system achieves very high age and gender classification accuracies. Moreover, we attained superior results compared to state-of-the-art age and gender classification methods based on profile face or ear images.
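Of the fusion strategies mentioned above, score-level fusion is the simplest to illustrate: each modality's network produces class scores independently, and the scores are combined before the final decision. The sketch below shows one common form, a weighted average of per-modality softmax scores; the function names, logits, and equal weights are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fuse_scores(face_logits, ear_logits, w_face=0.5, w_ear=0.5):
    """Score-level fusion: weighted average of per-modality softmax scores.

    Assumes each modality's network already outputs class logits; the weights
    here are illustrative (equal weighting).
    """
    scores = w_face * softmax(face_logits) + w_ear * softmax(ear_logits)
    return scores.argmax(axis=-1), scores

# Toy gender logits (classes: 0 and 1) for a single sample.
face_logits = np.array([[2.0, 1.0]])  # face model mildly favors class 0
ear_logits = np.array([[0.2, 1.8]])   # ear model strongly favors class 1
pred, scores = fuse_scores(face_logits, ear_logits)
print(pred, scores)
```

Here the ear model's stronger confidence outweighs the face model's milder preference, so the fused decision follows the ear modality; this complementarity is exactly what motivates combining the two modalities in the paper.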