8th ISCA Workshop on Speech and Language Technology in Education (SLaTE 2019)
DOI: 10.21437/SLaTE.2019-9
Acoustic correlates of speech intelligibility: the usability of the eGeMAPS feature set for atypical speech

Abstract: Although speech intelligibility has been studied in different fields such as speech pathology, language learning, psycholinguistics, and speech synthesis, it is still unclear which concrete speech features most impact intelligibility. Commonly used subjective measures of speech intelligibility based on labour-intensive human ratings are time-consuming and expensive, so objective procedures based on automatically calculated features are needed. In this paper, we investigate possible correlations between a set o…
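The correlation analysis the abstract describes can be sketched with the openSMILE toolkit's Python wrapper, which ships the eGeMAPS feature set the paper evaluates. This is a minimal illustration, not the authors' code: the file names, the ratings, and the use of the eGeMAPSv02 revision are assumptions.

```python
# Minimal sketch (not the authors' code): extract eGeMAPS functionals with
# openSMILE's Python wrapper and rank-correlate each feature with human
# intelligibility ratings. File names and ratings are hypothetical.
import opensmile
import pandas as pd
from scipy.stats import spearmanr

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,      # 88 functionals
    feature_level=opensmile.FeatureLevel.Functionals,
)

wav_files = ["speaker01.wav", "speaker02.wav", "speaker03.wav"]  # placeholders
ratings = [4.2, 2.8, 3.5]  # hypothetical human intelligibility scores

# One row of 88 functionals per recording
features = pd.concat([smile.process_file(f) for f in wav_files])

# Rank correlation between each acoustic feature and the human ratings
for name in features.columns:
    rho, p = spearmanr(features[name], ratings)
    print(f"{name}: rho={rho:+.2f} (p={p:.3f})")
```

Spearman's rank correlation is used here because intelligibility ratings are ordinal; the paper's exact statistical procedure is not visible in the truncated abstract above.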

Cited by 16 publications (7 citation statements). References 25 publications.

“…With respect to the classification of speakers as either dysarthric or non-dysarthric, the results suggest that the intelligibility measures assigned by human raters and the probabilities computed through the objective procedure based on acoustic-phonetic features are partly complementary to each other, as also found by Bunton et al [22]. These results are also in line with previous findings that acoustic-phonetic features have correlations to speaker types or to speech intelligibility [17,22,24], to a certain extent.…”
Section: Discussion (supporting)
confidence: 89%
“…Bunton et al [22] found that a restricted intensity range tended to be associated with reduced speech intelligibility in amyotrophic lateral sclerosis speakers with moderate intelligibility. Xue et al [24] investigated the usability of the eGeMAPS feature set, which contains the three mentioned features, for predicting speech intelligibility at phoneme level. Their results indicated that this feature set is potentially usable and revealed important differences between dysarthric speech and non-dysarthric speech.…”
Section: Introduction (mentioning)
confidence: 99%
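The group difference Bunton et al. report (a restricted intensity range co-occurring with reduced intelligibility) suggests a simple non-parametric group comparison. The sketch below is purely illustrative: the feature values are invented and the choice of test is an assumption, not taken from either cited study.

```python
# Hypothetical sketch of comparing one acoustic feature between dysarthric
# and non-dysarthric speakers. The per-speaker values are invented for
# illustration; they are not data from the cited studies.
import numpy as np
from scipy.stats import mannwhitneyu

dysarthric = np.array([6.1, 5.4, 7.0, 5.9, 6.3])   # e.g. intensity range, dB
control    = np.array([9.8, 10.5, 8.9, 11.2, 9.4])  # invented values

stat, p = mannwhitneyu(dysarthric, control, alternative="two-sided")
print(f"U={stat:.1f}, p={p:.4f}")
```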
“…The openSMILE toolkit performs the extraction of acoustic parameters that describe the paralinguistic characteristics of the speech signal. Based on the previous success that these acoustic parameters were able to assess the personality [28], detect a speech-related disease [29] and identify the gender and age [30] of a person, we deployed the acoustic parameter sets defined in eGeMAPS and ComParE to facilitate the classification of the physical load based on speech signals [31,32].…”
Section: Discussion (mentioning)
confidence: 99%
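A hedged sketch of the pipeline this excerpt outlines: functionals from an openSMILE parameter set (ComParE 2016 here; eGeMAPS works the same way) fed to an off-the-shelf classifier. The recordings, labels, classifier, and cross-validation setup are placeholders, not the cited study's configuration.

```python
# Sketch of openSMILE functionals feeding a standard classifier.
# Recordings, labels, and the SVM/CV setup are illustrative assumptions.
import opensmile
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.Functionals,
)

wavs = [f"rec_{i:02d}.wav" for i in range(6)]  # hypothetical recordings
labels = [0, 0, 0, 1, 1, 1]                    # 0 = no load, 1 = physical load

X = pd.concat([smile.process_file(w) for w in wavs])
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
print("CV accuracy:", cross_val_score(clf, X, labels, cv=3).mean())
```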
“…In particular, prosody features include fundamental frequency (F0), intensity measures, and voicing probabilities, as these have been widely linked to emotions (Banse and Scherer, 1996). Next, the so-called extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS) (Eyben et al., 2016), which has been widely used in many recent emotion recognition challenges (e.g., Valstar, 2016; Ringeval et al., 2019; Xue et al., 2019), is also explored and contains a set of 88 acoustic parameters relating to pitch, loudness, unvoiced segments, temporal dynamics, and cepstral features. Lastly, modulation spectral features are explored as they capture second-order periodicities in the speech signal and have been shown to convey emotional information (Wu et al., 2011; Avila et al., 2021).…”
Section: Automatic Speech Recognition (mentioning)
confidence: 99%
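The prosody features named in this excerpt (F0, intensity measures, voicing probabilities) can be approximated with librosa's pYIN tracker; this is an assumed substitute, since the excerpt does not say which extractor was used. The file name is a placeholder, and frame RMS stands in for a proper intensity measure.

```python
# Sketch of the three prosody features named above, via librosa's pYIN.
# "speech.wav" is a placeholder; RMS is only a proxy for intensity.
import librosa
import numpy as np

y, sr = librosa.load("speech.wav", sr=16000)

f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)
rms = librosa.feature.rms(y=y)[0]  # frame-level energy

print("mean F0 (voiced frames):", np.nanmean(f0))
print("RMS range:", rms.max() - rms.min())
print("mean voicing probability:", voiced_prob.mean())
```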