This paper proposes a method for automatic, speaker-independent recognition of human psycho-emotional states from the speech signal, based on deep learning, to address the tasks of aviation profiling. To this end, an algorithm was developed that classifies seven psycho-emotional states: anger, joy, fear, surprise, disgust, sadness, and the neutral state. The algorithm uses Mel-frequency cepstral coefficients and Mel spectrograms as informative features of speech audio recordings. These features are used to train two deep convolutional neural networks on the generated dataset. Testing the developed classifier on a held-out validation dataset gave a multiclass accuracy (fraction of correct answers) of 0.93. The proposed solution may find use in building human-machine interfaces, in medicine, in marketing, and in air transportation.
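Both feature types named in this abstract rest on the Mel frequency scale. As a rough illustration only, the sketch below converts between hertz and mels and places filter centers evenly on the mel axis. The abstract does not say which mel formula, filter count, or frequency range the authors used: the 2595/700 formula is one common convention, and the function names and parameters are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (common 2595 * log10(1 + f/700) convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of n_filters filters equally spaced on the mel scale.

    The filter count and band edges are assumptions for illustration; the
    abstract does not specify them.
    """
    lo = hz_to_mel(f_min_hz)
    hi = hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Print ten illustrative filter centers between 0 Hz and 8 kHz.
    for f in mel_center_frequencies(10, 0.0, 8000.0):
        print(round(f, 1))
```

Because the spacing is uniform in mels, the centers crowd together at low frequencies and spread out at high ones. This is why Mel-based features resolve the lower part of the speech spectrum more finely.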
This article addresses the problem of developing an effective method for automatically classifying the emotions of aviation personnel (the speaker) by voice. To this end, a speaker-independent algorithm is created that performs multiclass classification of seven human emotional states (joy, fear, anger, sadness, disgust, surprise, and the neutral state) on the basis of a set of 48 informative features. These features are formed from the digital recording of the speech signal by calculating Mel-frequency cepstral coefficients and the fundamental frequency (pitch) for individual frames of the recording. The informativeness of the Mel-frequency cepstral coefficients is increased, and their dimensionality reduced, by processing them with a deep convolutional neural network. The classifier model is implemented as a logistic regression trained on these informative features using recordings of emotionally colored English speech samples. After training, the recognition accuracy (fraction of correct answers) on the test sample is 0.96. The proposed solution can be used to improve human-machine interfaces, as well as in air transportation, medicine, marketing, and other fields.
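The final classification stage described here, a logistic regression over 48 features with seven output classes scored by accuracy, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights and biases are placeholders rather than trained values, and `predict` and `accuracy` are hypothetical helper names.

```python
import math

# The seven classes listed in the abstract.
EMOTIONS = ["joy", "fear", "anger", "sadness", "disgust", "surprise", "neutral"]
N_FEATURES = 48  # size of the feature set stated in the abstract


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(features, weights, biases):
    """Multinomial logistic regression: return (label, class probabilities).

    weights is a 7 x 48 matrix and biases a length-7 vector; both are
    hypothetical placeholders, since trained values are not in the abstract.
    """
    if len(features) != N_FEATURES:
        raise ValueError("expected %d features" % N_FEATURES)
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs


def accuracy(y_true, y_pred):
    """Fraction of correct answers over all samples (multiclass accuracy)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

With this definition, a reported accuracy of 0.96 means that 96 of every 100 test recordings received the correct emotion label, regardless of class.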
The study is relevant because no sovereign country agrees to allow uncontrolled commercial activity by foreign airlines on its territory. Hence the need for clear international legal regulation of the rights to carry out such commercial activities. In this context, the article aims to analyze the main forms and methods of commercial activity. The leading approach to studying this problem is the descriptive method, which made it possible to reveal the particulars of the terms of commercial agreements and of proposed air fares. The materials of the paper are of practical significance for university teachers of economic and legal specializations.
This article considers the problem of developing an effective method for the automatic classification of the emotions of aviation personnel (the speaker) by voice. To this end, the task of creating a speaker-independent algorithm is solved; the algorithm performs multiclass classification of seven human emotional states (joy, fear, anger, sadness, disgust, surprise, and the neutral state) on the basis of a set of 48 informative features. These features are formed from the digital recording of the speech signal by calculating Mel-frequency cepstral coefficients and the fundamental frequency for individual frames of the audio recording. The informativeness of the Mel-frequency cepstral coefficients is increased and their dimensionality reduced by processing them with a deep convolutional neural network. The classifier model is implemented using logistic regression, which was trained on these informative features using recordings of emotionally colored English speech samples. After training, the fraction of correct recognition answers on the test sample is accuracy = 0.96. The solution proposed in the work can be used to improve human-machine interfaces, as well as in air transportation, medicine, marketing, and other fields.