This paper proposes a deep-learning-based method for the automatic, speaker-independent recognition of human psycho-emotional states from the speech signal, aimed at the tasks of aviation profiling. To this end, an algorithm was developed to classify seven psycho-emotional states: anger, joy, fear, surprise, disgust, sadness, and a neutral state. The algorithm uses Mel-frequency cepstral coefficients (MFCCs) and Mel spectrograms, extracted from audio recordings of the speech signal, as informative features. These features are used to train two deep convolutional neural networks on the generated dataset. Testing the developed classifier on a held-out validation dataset yielded a multiclass accuracy (fraction of correct answers) of 0.93. The proposed solution can find application in the creation of human-machine interfaces, in medicine, in marketing, and in air transportation.
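The feature-extraction step named in the abstract (MFCCs and Mel spectrograms computed from speech recordings) can be sketched as follows using the librosa library; the sample rate, number of coefficients, number of Mel bands, and file name are illustrative assumptions, since the paper's concrete settings are not given here.

```python
# Minimal sketch of the feature extraction described in the abstract.
# All parameter values and the audio path are assumptions, not values
# taken from the paper.
import librosa
import numpy as np

def extract_features(path, sr=16000, n_mfcc=13, n_mels=128):
    """Load a speech recording and compute the two informative features:
    MFCCs and a log-scaled Mel spectrogram."""
    y, sr = librosa.load(path, sr=sr)

    # MFCCs, shape (n_mfcc, n_frames): input for the first CNN.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

    # Mel spectrogram converted to decibels, shape (n_mels, n_frames):
    # input for the second CNN.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)

    return mfcc, mel_db

# Hypothetical usage with an example recording.
mfcc, mel_db = extract_features("speech_sample.wav")
print(mfcc.shape, mel_db.shape)
```

In this sketch each feature matrix would be fed, as a 2-D input, to its own convolutional network, matching the two-network design the abstract describes; the network architectures themselves are not specified in the abstract and are therefore not sketched here.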