2014
DOI: 10.1121/1.4869088

Evaluation of the importance of time-frequency contributions to speech intelligibility in noise

Abstract: Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is…

Cited by 8 publications (8 citation statements)
References 34 publications
“…Based on the assumption that false positives and false negatives act independently, Li and Loizou (2008) argue that false positives are substantially more detrimental to intelligibility than false negatives. Yu et al (2014) support this claim by showing that uniformly random false positive errors are more harmful to intelligibility than uniformly random false negative errors when the input signal-to-noise ratio (SNR) of the mixture is less than or equal to 0 dB.…”
Section: Introduction
confidence: 72%
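The error manipulation described in this statement can be illustrated with a minimal sketch. It assumes a binary mask stored as a list of rows of 0/1 values (1 = target-dominated T-F unit); the function name `inject_mask_errors` and its parameters are hypothetical and are not taken from Yu et al (2014) or Li and Loizou (2008).

```python
import random


def inject_mask_errors(ideal_mask, fp_rate, fn_rate, seed=0):
    """Flip units of a binary mask uniformly at random (illustrative only).

    ideal_mask: list of rows, each a list of 0/1 values.
    fp_rate:    probability that a 0 unit is flipped to 1 (false positive).
    fn_rate:    probability that a 1 unit is flipped to 0 (false negative).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    noisy_mask = []
    for row in ideal_mask:
        new_row = []
        for unit in row:
            if unit == 0 and rng.random() < fp_rate:
                new_row.append(1)  # noise-dominated unit wrongly retained
            elif unit == 1 and rng.random() < fn_rate:
                new_row.append(0)  # target-dominated unit wrongly discarded
            else:
                new_row.append(unit)
        noisy_mask.append(new_row)
    return noisy_mask
```

Setting one rate to zero while varying the other isolates a single error type, which is the kind of comparison the statement describes.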
“…5 and STOI scores in Table I afford an opportunity to assess the accuracy of intelligibility prediction. STOI has been shown to be more accurate than many alternative metrics, such as the classic speech intelligibility index, and has become a standard objective speech intelligibility metric (Taal et al, 2011; Yu et al, 2014) for NH listeners. For the IEEE corpus with the same male speaker employed here, Taal et al (2011) provide parameter values of a logistic transfer function that maps STOI scores to percent-correct numbers.…”
Section: Discussion
confidence: 99%
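As a rough illustration of the logistic transfer function mentioned in this statement, the sketch below assumes the common form 100 / (1 + exp(a·d + b)), where d is a STOI score. The parameter values used here are placeholders chosen for illustration; they are not the fitted values reported by Taal et al (2011).

```python
import math


def stoi_to_percent_correct(stoi_score, a, b):
    """Map an objective score to percent correct with a logistic function.

    Assumed form: 100 / (1 + exp(a * d + b)). With a < 0 the predicted
    percent correct increases with the score.
    """
    return 100.0 / (1.0 + math.exp(a * stoi_score + b))


# Placeholder parameters (hypothetical, not corpus-fitted values).
A_PLACEHOLDER = -10.0
B_PLACEHOLDER = 5.0

midpoint = stoi_to_percent_correct(0.5, A_PLACEHOLDER, B_PLACEHOLDER)
```

In practice the two parameters would be fit per corpus against listening-test data.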
“…Since a higher H-FA score does not necessarily produce a higher intelligibility score, H-FA may also be unfit as a cost function for algorithm design. Yu et al (2014) tried to address some of the limitations of H-FA when they proposed the loudness-weighted H-FA, a mask-based metric that takes into account the relative importance of each error. However, the importance weights in the metric were fit to masks that employ an alternate definition of the IBM (i.e., the "target binary mask"; Kjems et al, 2009) and that use an FFT-based frequency decomposition.…”
Section: Discussion
confidence: 99%
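For readers unfamiliar with H-FA, the following minimal sketch computes the plain (unweighted) hit rate minus false-alarm rate between an estimated mask and an ideal mask. It assumes both masks are lists of rows of 0/1 values with identical shape; the function name is hypothetical, and the loudness weighting proposed by Yu et al (2014) is not reproduced here.

```python
def hit_minus_false_alarm(estimated_mask, ideal_mask):
    """Return H-FA: hit rate minus false-alarm rate, in the range [-1, 1].

    Hit:         ideal unit is 1 and estimated unit is 1.
    False alarm: ideal unit is 0 and estimated unit is 1.
    """
    hits = false_alarms = target_units = noise_units = 0
    for est_row, ideal_row in zip(estimated_mask, ideal_mask):
        for est, ideal in zip(est_row, ideal_row):
            if ideal == 1:
                target_units += 1
                hits += int(est == 1)
            else:
                noise_units += 1
                false_alarms += int(est == 1)
    # Guard against masks that contain only one class of unit.
    hit_rate = hits / target_units if target_units else 0.0
    fa_rate = false_alarms / noise_units if noise_units else 0.0
    return hit_rate - fa_rate
```

Because every unit counts equally in this version, two masks with the same H-FA can differ in which units are wrong, which is the limitation the statement points to.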
“…Given that STOI evaluates the reconstructed signals as a whole rather than each classification decision independently, it holds promise for being able to predict outcomes for masks with different error distributions because it can take into account the perceptual relevance of the errors. Several other metrics have been proposed to assess sound source separation algorithms, such as the loudness-weighted H-FA (Yu et al, 2014), the IBM ratio (Hummersone et al, 2011), and the intelligibility metric based on an auditory preprocessing model (Christiansen et al, 2010). However, these metrics have gained limited traction due to either lacking generalizability or accessibility.…”
Section: Introduction
confidence: 99%