2016
DOI: 10.1121/1.4964505

Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain

Abstract: A speech intelligibility prediction model is proposed that combines the auditory processing front end of the multi-resolution speech-based envelope power spectrum model [mr-sEPSM; Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134(1), 436–446] with a correlation back end inspired by the short-time objective intelligibility measure [STOI; Taal, Hendriks, Heusdens, and Jensen (2011). IEEE Trans. Audio Speech Lang. Process. 19(7), 2125–2136]. This “hybrid” model, named sEPSMcorr, is shown to account for th…

Cited by 49 publications (64 citation statements)
References 35 publications
“…A more detailed description, including all equations and the theoretical motivation for each step, is provided in Sec. II in Relaño-Iborra et al (2016).…”
Section: Sepsm Corr 2 Model Description
mentioning
confidence: 99%
“…Modulation-based models, on the other hand, date back to the speech transmission index (STI; Steeneken and Houtgast, 1980) and also include the more recent modulation filterbank models, for example, the multi-resolution speech-based envelope power spectrum model (mr-sEPSM; Jørgensen et al, 2013). In addition, it is useful to distinguish between models whose decision stage (back end) evaluates energetic differences between the input signals (e.g., ESII and mr-sEPSM) and those that are based on signal correlations, such as the short-time objective intelligibility measure (STOI; Taal et al, 2011) and the correlation-based version of the mr-sEPSM (sEPSM corr ; Relaño-Iborra et al, 2016).…”
Section: Introduction
mentioning
confidence: 99%
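The distinction drawn in the statement above, between back ends that compare energy and back ends that compare signal correlation, can be illustrated with a minimal sketch. It assumes the clean and degraded temporal envelopes are already available as equal-length lists of samples. The names `pearson`, `short_time_correlation`, and `frame_len` are hypothetical, and the sketch is a generic short-time Pearson correlation, not the published STOI or sEPSMcorr back end.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # assumption: treat an undefined correlation as "no similarity"
    return cov / math.sqrt(var_x * var_y)


def short_time_correlation(clean_env, degraded_env, frame_len):
    """Average the per-frame correlation over non-overlapping frames.

    Hypothetical helper: real models add band decomposition, normalization,
    and a mapping from the metric to intelligibility, all omitted here.
    """
    if len(clean_env) != len(degraded_env):
        raise ValueError("envelopes must have the same length")
    scores = []
    for start in range(0, len(clean_env) - frame_len + 1, frame_len):
        stop = start + frame_len
        scores.append(pearson(clean_env[start:stop], degraded_env[start:stop]))
    if not scores:
        raise ValueError("frame_len is longer than the envelopes")
    return sum(scores) / len(scores)
```

Unlike an energy-based comparison, this score ignores a constant gain applied to the degraded envelope and responds only to how closely its temporal pattern follows the clean one.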
“…The speech-based envelope power spectrum model forms the basis of three intelligibility metrics: sEPSM [24], mr-sEPSM [25], and sEPSM corr [26]. All of the sEPSM metrics use the Hilbert transform and a gammatone filterbank to extract temporal envelopes for different frequency bands.…”
Section: Speech-based Envelope Power Spectrum Model With Short-tim
mentioning
confidence: 99%
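The envelope-extraction step mentioned in the statement above can be sketched for a single band signal. This is a minimal illustration under stated assumptions: the gammatone filterbank is omitted, the function names `dft` and `hilbert_envelope` are hypothetical, and a naive O(N²) DFT is used only to keep the example dependency-free. In practice a library routine such as `scipy.signal.hilbert` would normally be used.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); adequate for short demo signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def hilbert_envelope(x):
    """Temporal envelope as the magnitude of the analytic signal.

    The analytic signal is built by zeroing negative-frequency bins and
    doubling positive-frequency bins; DC (and Nyquist, for even N) are kept.
    """
    n = len(x)
    if n == 0:
        return []
    spectrum = dft(x)
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        for k in range(1, n // 2):
            weights[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    analytic = dft([s * w for s, w in zip(spectrum, weights)], inverse=True)
    return [abs(z) for z in analytic]
```

For a pure cosine with an integer number of cycles in the analysis window, the envelope returned is flat and equal to the cosine's amplitude, which is a quick sanity check for the weighting scheme.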
“…Intrusive intelligibility metrics require knowledge of the clean speech and either the communication channel or degraded speech, whereas non-intrusive intelligibility metrics require only the degraded speech. In this paper we develop a new intrusive intelligibility metric based on information theory [2]. Existing intrusive intelligibility metrics include the speech intelligibility index (SII) [3], the speech transmission index (STI) [4], the coherence SII (CSII) [5], the extended SII (ESII) [6], the normalized covariance measure (NCM) [7], [8], the hearing-aid speech perception index (HASPI) [9], the short-time objective intelligibility measure (STOI) [10], the extended STOI (ESTOI) [11], the speech-based envelope power spectrum model (sEPSM) [12]-[14], and the glimpse proportion metric (GP) [15]-[17]. As a group, the above algorithms have been successful at predicting speech intelligibility in a wide-range of conditions including additive noise, filtering, reverberation, and non-linear enhancement.…”
mentioning
confidence: 99%