This paper presents an approach to improve the performance of a face recognition system by fusing appearance-based and shape-based features, using Principal Component Analysis (PCA) and a Hidden Markov Model (HMM). Whereas traditional face recognition systems are very sensitive to variations in face parameters, the proposed feature-fusion-based system is found to be stable and performs well, improving the robustness and naturalness of human-computer interaction. The Active Appearance Model and the Shape Model are used to extract appearance-based and shape-based features from the facial images. The two feature types are then fused into a combined feature vector, and PCA is used to reduce its dimensionality. An HMM is used for learning and classification. The experimental results report appearance-based, shape-based, and combined appearance-and-shape-based outputs, and show the superiority of the proposed face recognition system.
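The fusion and dimensionality-reduction steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes fusion is plain concatenation of the two feature vectors, computes PCA through a singular value decomposition, and uses random arrays in place of real appearance and shape features. The function name `fuse_and_reduce` and all array sizes are hypothetical.

```python
import numpy as np

def fuse_and_reduce(appearance, shape, n_components):
    """Concatenate two per-sample feature sets, then project onto the top principal components."""
    # Assumption: feature fusion is simple concatenation along the feature axis.
    fused = np.hstack([appearance, shape])           # (n_samples, d_appearance + d_shape)
    mean = fused.mean(axis=0)
    centered = fused - mean
    # SVD of the centered data: rows of vt are principal directions,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    reduced = centered @ components.T                # (n_samples, n_components)
    return reduced, components, mean

# Placeholder data standing in for real appearance and shape features.
rng = np.random.default_rng(0)
appearance = rng.normal(size=(20, 30))
shape = rng.normal(size=(20, 10))
reduced, components, mean = fuse_and_reduce(appearance, shape, n_components=5)
```

The reduced vectors would then be passed to the classifier; the abstract does not say how they are turned into HMM observations, so that step is left out.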
It is well known that adding visual speech information to audio utterances enhances the performance of noise-robust speaker identification. This paper evaluates the performance of a noise-robust audio-visual speaker identification system that uses likelihood-ratio-based score fusion in a challenging environment. Whereas the traditional HMM-based audio-visual speaker identification system is very sensitive to speech parameter variation, the proposed likelihood-ratio-based score fusion method is found to be stable and performs well, improving the robustness and naturalness of human-computer interaction. We investigate the proposed audio-visual speaker identification system under typical office environment conditions. To do this, we examine two approaches that use speech utterances together with visual features to improve speaker identification performance in acoustically and visually challenging environments. The first seeks to eliminate noise from the acoustic and visual features using speech and facial image pre-processing techniques. The second combines the speech and facial features, which are used by multiple Discrete Hidden Markov Model classifiers, through likelihood-ratio-based score fusion. The results show that the proposed system significantly improves audio-visual speaker identification performance under challenging office environment conditions.
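One way to picture likelihood-ratio-based score fusion is sketched below. The abstract does not give the exact formulation, so everything here is an assumption: each modality's classifier is taken to return a log-likelihood per enrolled speaker, the ratio compares each speaker against the average likelihood of the competing speakers, and the two modalities are combined with a fixed weight. The names `log_likelihood_ratios`, `fuse_scores`, and `audio_weight`, and the example scores, are hypothetical.

```python
import math

def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def log_likelihood_ratios(log_likelihoods):
    """Per speaker: log-likelihood minus the log of the competitors' mean likelihood."""
    ratios = {}
    for speaker, ll in log_likelihoods.items():
        competitors = [v for s, v in log_likelihoods.items() if s != speaker]
        ratios[speaker] = ll - log_mean_exp(competitors)
    return ratios

def fuse_scores(audio_ll, visual_ll, audio_weight=0.5):
    """Weighted sum of per-modality log-likelihood ratios (assumed fusion rule)."""
    audio_llr = log_likelihood_ratios(audio_ll)
    visual_llr = log_likelihood_ratios(visual_ll)
    fused = {
        s: audio_weight * audio_llr[s] + (1.0 - audio_weight) * visual_llr[s]
        for s in audio_ll
    }
    return max(fused, key=fused.get), fused

# Made-up log-likelihoods: noisy audio slightly prefers spk2, video clearly prefers spk1.
audio_ll = {"spk1": -120.0, "spk2": -118.0, "spk3": -131.0}
visual_ll = {"spk1": -80.0, "spk2": -95.0, "spk3": -90.0}
best, fused = fuse_scores(audio_ll, visual_ll)
```

With these made-up scores, the audio stream alone would select `spk2`, while the fused score selects `spk1`, because the visual evidence is much more decisive.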
The contribution of this paper is a novel approach to evaluating the performance of a noise-robust audio-visual speaker identification system in a challenging environment. Whereas the traditional HMM-based audio-visual speaker identification system is very sensitive to speech parameter variation, the proposed hybrid feature-and-decision-fusion-based audio-visual speaker identification system is found to be stable and performs well, improving the robustness and naturalness of human-computer interaction. Linear Prediction Cepstral Coefficients and Mel Frequency Cepstral Coefficients are used to extract the audio features, and the Active Appearance Model and Active Shape Model are used to extract the appearance-based and shape-based features from the facial image. Principal Component Analysis is used to reduce the dimensionality of the large feature vector, and a vector normalization algorithm is used to normalize it. Features and decisions are fused at two different levels, and finally the outputs of four different classifiers are combined in parallel to produce the identification result. The performance of all these uni-modal and multi-modal systems has been evaluated and compared on the VALID audio-visual multi-modal database, which contains both vocal and visual biometric modalities.
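The final step, combining four classifier outputs in parallel, could look like the sketch below. The abstract does not state the combination rule or what the four classifiers are, so this example assumes a sum rule over min-max normalized scores; the function names, identity labels, and score values are all hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one classifier's scores to [0, 1] so classifiers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def combine_parallel(classifier_scores):
    """Sum rule (an assumption): add normalized scores across classifiers, pick the best."""
    totals = {}
    for scores in classifier_scores:
        for identity, value in min_max_normalize(scores).items():
            totals[identity] = totals.get(identity, 0.0) + value
    return max(totals, key=totals.get), totals

# Made-up outputs of four classifiers over three enrolled identities.
outputs = [
    {"A": -110.0, "B": -105.0, "C": -130.0},
    {"A": -98.0, "B": -101.0, "C": -120.0},
    {"A": -75.0, "B": -90.0, "C": -88.0},
    {"A": -60.0, "B": -72.0, "C": -66.0},
]
identity, totals = combine_parallel(outputs)
```

Normalizing before summing keeps a classifier with a wide score range from dominating the decision; other rules (product, majority vote, weighted sum) would fit the same interface.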
This paper proposes an improved strategy for automated text-dependent speaker identification in noisy environments. The identification process combines the Hidden Markov Model technique with cepstral-based features. A Wiener filter is used to remove background noise from the source utterance. Several speech pre-processing techniques, namely start-end point detection, pre-emphasis filtering, frame blocking, and windowing, are used to process the speech utterances. RCC, MFCC, ΔMFCC, ΔΔMFCC, LPC, and LPCC are used to extract the features. After parameterization of the speech, a Discrete Hidden Markov Model is used for learning and identification. Features are extracted with these different techniques in order to optimize identification performance, which differs in almost every case. The highest speaker identification rates achieved by the closed-set text-dependent speaker identification system are 93% for the noiseless environment and 69.27% for the noisy environment.
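Three of the pre-processing steps named in this abstract (pre-emphasis filtering, frame blocking, and windowing) can be sketched in a few lines. This is a generic textbook version, not the authors' code: the pre-emphasis coefficient 0.97, the Hamming window, the frame length of 256 samples, and the hop of 128 samples are common defaults assumed here, and the test tone stands in for a real utterance.

```python
import math

def pre_emphasize(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def frame_blocks(signal, frame_len, hop):
    """Split the signal into overlapping frames; trailing samples that do not fill a frame are dropped."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]

def hamming(frame_len):
    """Hamming window coefficients (an assumed window choice)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]

def preprocess(signal, frame_len=256, hop=128, alpha=0.97):
    """Pre-emphasize, block into frames, and apply the window to each frame."""
    emphasized = pre_emphasize(signal, alpha)
    window = hamming(frame_len)
    return [[s * w for s, w in zip(frame, window)]
            for frame in frame_blocks(emphasized, frame_len, hop)]

# A 440 Hz tone sampled at 8 kHz stands in for a speech utterance.
signal = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1024)]
frames = preprocess(signal)
```

Each windowed frame would then go to a feature extractor such as MFCC or LPCC; start-end point detection and Wiener filtering are omitted from this sketch.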