This article discusses classification algorithms for identifying a person by voice using machine learning methods. Mel-frequency cepstral coefficients (MFCC) were used in the speech preprocessing stage. To solve the problem, a comparative analysis of five classification algorithms was carried out. In the first experiment, the support vector machine (0.90) and the multilayer perceptron (0.83) showed the best results. In the second experiment, a multilayer perceptron combined with the Robust scaler method reached an accuracy of 0.93 for person identification. Therefore, a multilayer perceptron can be used to solve this problem, provided the specifics of the speech signal are taken into account.
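The abstract does not say how the Robust scaler is computed. A minimal sketch, assuming the common definition (subtract the per-feature median and divide by the interquartile range, which is also the default behavior of scikit-learn's `RobustScaler`), is shown below. The function name `robust_scale` and the toy feature vectors are hypothetical and are not taken from the paper.

```python
from statistics import median, quantiles


def robust_scale(rows):
    """Scale each feature column as (x - median) / IQR.

    `rows` is a list of equal-length feature vectors, e.g. one
    MFCC vector per utterance (an assumption for this sketch).
    """
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        med = median(col)
        # Quartiles with linear interpolation between data points.
        q1, _, q3 = quantiles(col, n=4, method="inclusive")
        iqr = (q3 - q1) or 1.0  # avoid dividing by zero for constant features
        scaled_columns.append([(x - med) / iqr for x in col])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical 2-dimensional feature vectors; the last sample has an
# outlier in the first feature.
features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [100.0, 50.0]]
scaled = robust_scale(features)
for row in scaled:
    print(row)
```

Because the median and the interquartile range are barely affected by the outlier, the four ordinary samples keep a comparable spread after scaling, which is the usual motivation for choosing this scaler over mean and standard deviation.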
In this paper, we investigate two neural architectures for gender detection and speaker identification tasks, utilizing Mel-frequency cepstral coefficient (MFCC) features, which do not cover voice-related characteristics. One of our goals is to compare different neural architectures, multilayer perceptrons (MLPs) and convolutional neural networks (CNNs), for both tasks under various settings and to learn the gender- and speaker-specific features automatically. The experimental results reveal that models using z-score and Gramian matrix transformations obtain better results than models that use only max-min normalization of the MFCCs. In terms of training time, the MLP requires more training epochs to converge than the CNN. Other experimental results show that MLPs outperform CNNs on both tasks in terms of generalization error.
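The three transformations named in the abstract can be sketched as follows. The abstract does not define its "Gramian matrix transformation", so the plain Gram matrix of pairwise dot products between frames used here is an assumption, as are the function names and the toy values.

```python
from statistics import fmean, pstdev


def min_max(values):
    """Max-min normalization: map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against a constant coefficient
    return [(v - lo) / span for v in values]


def z_score(values):
    """Z-score normalization: zero mean, unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values) or 1.0  # guard against a constant coefficient
    return [(v - mu) / sigma for v in values]


def gram_matrix(frames):
    """Assumed Gramian transformation: G[i][j] is the dot product of frames i and j."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in frames] for fi in frames]


# Hypothetical values of one MFCC coefficient across eight frames.
coeff = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(min_max(coeff))
print(z_score(coeff))

# Hypothetical two-frame, two-coefficient example.
print(gram_matrix([[1.0, 2.0], [3.0, 4.0]]))
```

Max-min normalization depends only on the two extreme values, while the z-score uses the whole distribution of each coefficient. The Gram matrix turns a variable-shaped sequence of frames into a square, symmetric, image-like input, which is one plausible reason to pair it with a CNN.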
In the area of voice recognition, many methods have been proposed over time. Automatic speaker recognition technology has reached a good level of performance, but still needs to be improved. Signature verification (SV) is one of the most common methods of identity verification in the banking sector, where, for security reasons, an accurate method for automatic signature verification (ASV) is very important. ASV is usually performed by comparing a test signature with one or more enrollment signatures from the person whose identity is claimed, and it can be done in two ways: online and offline. In this study, a new i-vector-based method is proposed for online SV. In the proposed method, a fixed-length vector, called an i-vector, is extracted from each signature, and this vector is then used to create a template. Several methods, such as nuisance attribute projection and within-class covariance normalization, are also investigated to reduce the intra-class variation in the i-vector space. At the evaluation and decision-making stage, a two-class support vector machine is also proposed. In this article, a new low-dimensional space, depending on the dynamics and the channel, is determined using a simple factor analysis; a vector in this space is known as an i-vector. I-vectors have proven to be the most efficient features for text-independent speaker verification in recent studies.
This article describes methods for building a continuous speech recognition system for the Kazakh language. Compared with other languages, research on Kazakh speech recognition began relatively recently, after the country gained independence, and Kazakh remains a low-resource language. A large amount of data is required to build a reliable system and to evaluate it accurately. A database has been created for the Kazakh language, consisting of speech signals and their corresponding transcriptions. The continuous speech corpus comprises recordings from 200 speakers of different genders and ages, together with a pronunciation vocabulary for the language. Traditional models and deep neural networks have been used to train the system. As a result, a word error rate (WER) of 30.01% has been obtained.
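For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and the reference transcription, divided by the number of reference words. The sketch below follows that standard definition. The function name and the example sentences are hypothetical, and the article's own scoring tool is not specified.

```python
def word_error_rate(reference, hypothesis):
    """Return the word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcription must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(
                prev[j] + 1,         # deletion of a reference word
                cur[j - 1] + 1,      # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = cur
    return prev[-1] / len(ref)


# Hypothetical transcriptions: one substitution ("sat" -> "sit")
# and one deletion (the second "the") against six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {wer:.2%}")
```

Under this definition, a WER of 30.01% means that roughly three word-level errors occur for every ten words of reference text.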