2013
DOI: 10.1016/j.csl.2012.01.007

Speaker state recognition using an HMM-based feature extraction method

Cited by 25 publications (10 citation statements). References 15 publications.
“…A work on Speaker state recognition using an HMM-based feature extraction method [4] was done by R. Gajšek, F. Mihelič, and S. Dobrišek. This system uses acoustic features for recognizing various paralinguistic phenomena.…”
Section: Related Work
confidence: 99%
“…Still, most of the aforementioned phoneme-level emotion classification techniques used forced alignment or manual annotation to extract the phoneme borders. Only some methods faced real-life conditions by using ASR engines to generate the phoneme alignment [16]. Current ASR techniques cannot provide phoneme alignments on affective speech samples as accurate as those from manual annotation or forced alignment.…”
Section: Introduction
confidence: 99%
“…According to Batliner et al. [10], words can be seen as the smallest possible chunk for analysis. A comparatively small number of classification techniques are based on phonetic pattern modeling within emotion classification [11], [12], [13], [14], [15], [16], [17]. Still, most of the aforementioned phoneme-level emotion classification techniques used forced alignment or manual annotation to extract the phoneme borders.…”
Section: Introduction
confidence: 99%
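The phoneme-level techniques quoted above all presuppose phoneme borders obtained from manual annotation, forced alignment, or an ASR engine. As a minimal sketch of the downstream step, the following Python snippet slices a frame-level feature matrix by such borders; the frame shift, phoneme labels, and data are illustrative assumptions, not taken from any of the cited papers.

import numpy as np

# Assumed frame shift of the feature extractor (10 ms is a common choice).
FRAME_SHIFT = 0.010  # seconds per feature frame

def segment_by_phonemes(features, alignment):
    """Split a (num_frames, dim) feature matrix into per-phoneme chunks.

    alignment: list of (phoneme, start_sec, end_sec) tuples, e.g. the
    output of a forced aligner or an ASR engine.
    """
    segments = []
    for phoneme, start, end in alignment:
        lo = int(round(start / FRAME_SHIFT))   # first frame of the phoneme
        hi = int(round(end / FRAME_SHIFT))     # frame just past its end
        segments.append((phoneme, features[lo:hi]))
    return segments

# Toy usage: 100 frames of 13-dimensional features (e.g. MFCCs) and a
# made-up alignment for one short word.
feats = np.random.randn(100, 13)
align = [("h", 0.00, 0.08), ("eh", 0.08, 0.25),
         ("l", 0.25, 0.40), ("ow", 0.40, 0.70)]
for ph, seg in segment_by_phonemes(feats, align):
    print(ph, seg.shape)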
“…During recognition, the likelihood of generating the tested sequence of observations by a given HMM is calculated, and the emotion with the highest likelihood is selected. Monophone-based HMMs were also proposed for modeling frame-level acoustic features by Gajšek et al. (2013), who achieved classification improvements in both emotion recognition and alcohol detection compared with other state-of-the-art methods. Usually HMMs with Gaussian outputs are used.…”
Section: 4
confidence: 99%
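The maximum-likelihood selection described in this statement can be sketched with the hmmlearn library: one Gaussian-output HMM is trained per emotion class, and a test utterance is assigned to the class whose model scores it highest. The state count, feature dimensionality, and toy data below are assumptions for illustration, not the configuration used by Gajšek et al.

import numpy as np
from hmmlearn import hmm  # pip install hmmlearn

rng = np.random.RandomState(0)

def train_class_hmm(sequences, n_states=3):
    """Fit one Gaussian-output HMM on all sequences of a single class."""
    X = np.vstack(sequences)               # stacked frame-level features
    lengths = [len(s) for s in sequences]  # per-utterance frame counts
    model = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag",
                            n_iter=25, random_state=0)
    model.fit(X, lengths)
    return model

def classify(models, utterance):
    """Pick the class whose HMM gives the highest log-likelihood."""
    return max(models, key=lambda label: models[label].score(utterance))

# Toy data: 13-dim feature frames (e.g. MFCCs) for two emotion classes.
train = {
    "neutral": [rng.randn(60, 13) for _ in range(5)],
    "angry":   [rng.randn(60, 13) + 1.5 for _ in range(5)],
}
models = {label: train_class_hmm(seqs) for label, seqs in train.items()}
test_utt = rng.randn(80, 13) + 1.5         # resembles the "angry" class
print(classify(models, test_utt))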