2016 International Conference on Communication and Signal Processing (ICCSP)
DOI: 10.1109/iccsp.2016.7754275
Speech emotion recognition based on minimal voice quality features

Cited by 19 publications (10 citation statements)
References 3 publications
“…These algorithms belong to supervised learning methods and have achieved some success, but their recognition efficiency is not high. Therefore, how to improve feature extraction efficiency, obtain latent features, and enhance the recognition rate is still a research area that needs further study [24].…”
Section: Speech Feature Extractor
confidence: 99%
“…For f0 in the range [200, 1000] Hz, k in the range [0, 0.4] and f_mod in the range [5, 10] Hz it was necessary to create a more complex model than the previous one. Shimmer depends on the modulation frequency, so a new transformation is necessary (the first one was the transformation from the time domain to the frequency domain in the spectrogram).…”
Section: Shimmer Approximation with f0, k and f_mod Variable
confidence: 99%
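The excerpt above relates shimmer to an amplitude-modulated tone parameterised by f0, k and f_mod. A minimal sketch of that relationship, assuming a synthetic AM signal and the standard local-shimmer definition (mean absolute difference of consecutive cycle peak amplitudes, relative to the mean amplitude); the signal, parameter values and helper names here are illustrative, not taken from the cited paper:

```python
import numpy as np

def shimmer_local(peak_amps):
    """Local shimmer: mean absolute difference of consecutive
    cycle peak amplitudes, divided by the mean amplitude."""
    a = np.asarray(peak_amps, dtype=float)
    return np.mean(np.abs(np.diff(a))) / np.mean(a)

# Synthetic amplitude-modulated tone built from the three variables
# the excerpt names: f0 (fundamental), k (modulation depth), f_mod.
fs = 16000
f0, k, f_mod = 200.0, 0.2, 8.0
t = np.arange(int(fs * 0.5)) / fs
x = (1.0 + k * np.sin(2 * np.pi * f_mod * t)) * np.sin(2 * np.pi * f0 * t)

# One peak amplitude per cycle (period = fs / f0 samples).
period = int(round(fs / f0))
amps = [np.max(np.abs(x[i:i + period]))
        for i in range(0, len(x) - period, period)]
print(round(shimmer_local(amps), 4))
```

Because consecutive cycle peaks differ by roughly k times the per-period phase advance of the modulator, the measured shimmer grows with both k and f_mod, which is consistent with the excerpt's point that shimmer depends on the modulation frequency.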
“…The shimmer value is associated with the voice quality [2][3][4][5][6][7], state of mind [8][9][10][11][12][13], age [14] and gender [15] of people. Many research works use shimmer (among other measures) for goals ranging from pathology detection [6,16,17] to improving human-machine interfaces through estimation of the intentionality of a spoken phrase [19].…”
Section: Introduction
confidence: 99%
“…Among them, the MFCC is the most commonly used spectral feature for SER because its characteristics resemble the frequency response of the human ear. Voice quality features strongly influence the emotional state expressed in speech; they mainly include breathiness, brightness and formants [29]. Acoustic features are typically extracted frame by frame and enable emotion recognition through simple statistical analysis.…”
Section: Introduction
confidence: 99%
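The excerpt above names the MFCC as the most common spectral feature for SER. A minimal single-frame MFCC sketch in NumPy (power spectrum, triangular mel filterbank, log, then DCT-II); the filterbank size and the 13-coefficient cut-off are conventional defaults, not values from the cited work:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, fs, n_mels=26, n_ceps=13):
    """MFCC of one windowed frame: power spectrum -> mel
    filterbank -> log -> DCT-II (first n_ceps coefficients)."""
    n_fft = len(frame)
    spec = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    # Triangular filters equally spaced on the mel scale.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0),
                                   hz_to_mel(fs / 2.0), n_mels + 2))
    fbank = np.zeros((n_mels, len(freqs)))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        up = (freqs - lo) / (mid - lo)
        down = (hi - freqs) / (hi - mid)
        fbank[i] = np.clip(np.minimum(up, down), 0.0, None)
    log_mel = np.log(fbank @ spec + 1e-10)
    # DCT-II built as an explicit matrix (avoids a scipy dependency).
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return dct @ log_mel

fs = 16000
t = np.arange(512) / fs
ceps = mfcc_frame(np.sin(2 * np.pi * 440 * t), fs)
print(ceps.shape)
```

In practice this is applied to overlapping frames of a full utterance, and the per-frame coefficients are summarised with simple statistics (mean, variance), matching the excerpt's description of frame-wise extraction followed by statistical analysis.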