2021
DOI: 10.3390/s21051888

On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition

Abstract: Many speech emotion recognition systems have been designed using different features and classification methods. Still, there is a lack of knowledge and reasoning regarding the underlying speech characteristics and processing, i.e., how basic characteristics, methods, and settings affect the accuracy, and to what extent. This study aims to extend the physical perspective on speech emotion recognition by analyzing basic speech characteristics and modeling methods, e.g., time characteristics (segmentation, window type…

Cited by 18 publications (9 citation statements)
References 41 publications
“…A set of acoustic features needs to be determined for every SER application. Although many sets have been proposed and many studies agree on using specific domains, namely energy, pitch, prosody and cepstrum [51], the cross-linguistic nature of this study and the need for generalization called for a wide, non-standard set of features to then be reduced. The feature set of choice comes from the INTERSPEECH 2013 library [52], embedded in the feature extraction tool OpenSMILE (by Audeering) [53].…”
Section: Methods (mentioning)
confidence: 99%
“…Our proposed method may also result in errors owing to various noise environments. To overcome this problem, we aimed to reduce the number of features in the dataset by creating new features from existing features [69]. Since overfitting was one of the main issues for training different models during the competition, enriching the training data by adding data samples from different resources could be a possible solution for improving the results.…”
Section: Limitations (mentioning)
confidence: 99%
“…Literature [22] puts forward that the semantic inclination of each sentence in the text can be obtained by weight first calculation method, emotional education is discussed on the combination of emotional words in college physical education, and the concept of the headword is put forward to calculate the inclination of words, which lays a foundation for more complicated emotional analysis of the text. In literature [23] through the big data analysis method, physical education teachers in colleges and universities should make full use of the advantages of disciplines in emotional education, pay attention to active and healthy emotional communication with students, win the respect and cooperation of students, and obtain the best educational effect. Literature [24] shows that the grades of praise and disapproval in college physical education can be divided into three categories (positive emotion, negative emotion, and neutral emotion) by star rating index, and the polarity classification of emotional education in comment text is completed by using the experimental algorithm using three classification methods, among which the method of the support-vector machine gets higher accuracy.…”
Section: Related Work (mentioning)
confidence: 99%