2018
DOI: 10.1016/j.csl.2017.10.004

Predicting speech intelligibility with deep neural networks

Cited by 72 publications (56 citation statements)
References 33 publications
“…However, a number of works have attempted to partly or fully predict intelligibility using data-driven methods. One approach in this direction has been to use an Automatic Speech Recognition (ASR) system to transcribe degraded sentences, using the error rate as a measure of intelligibility [48], [49]. Another data-driven approach has been to non-intrusively estimate the output of an intrusive SIP algorithm [50], [51], [52], [53].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Despite the breakthroughs of neural networks in so many areas, to date, only a handful of neural network-based models have been proposed [9,10,11]. To the best of our knowledge, even the most recent methods to predict MOS present serious limitations.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…To the best of our knowledge, even the most recent methods to predict MOS present serious limitations. First, most of them are developed to measure intelligibility [10], which is just one aspect of audio quality [3]. Second, these neural network-based models are trained on a limited number of conditions, usually with no interaction between different impairments, which is quite unrealistic and rarely happens in everyday scenarios.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In [3], the speech intelligibility is predicted using convolutional neural network which is trained with measured intelligibility scores that humans listen and evaluate. The work in [4] presented the method of speech intelligibility prediction by using automatic speech recognition (ASR) system based deep neural networks. Recently, the non-intrusive speech intelligibility estimation method based on a recurrent neural network (RNN) with a mel-frequency cepstrum coefficient (MFCC) vector was proposed [5].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
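
Several of the statements above describe using an ASR system to transcribe degraded sentences and taking the error rate as a proxy for intelligibility. As a rough illustration only, the sketch below computes a word error rate (WER) between a reference sentence and a transcript using word-level edit distance. It is not the method of the cited paper or of any citing work; the function name `word_error_rate` and the example sentences are hypothetical, and the ASR step itself is assumed to have already produced the transcript.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration: `hypothesis` stands in for the
    transcript an ASR system would return for a degraded utterance.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the words of ref seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# A lower WER on a degraded signal would be read as higher intelligibility.
clean = "the cat sat on the mat"
degraded_transcript = "the bat sat on mat"
wer = word_error_rate(clean, degraded_transcript)
```

Under this reading, WER rises as the transcript diverges from the reference, so it would serve as an inverse intelligibility score; how well that tracks human listening scores depends on the ASR system and is not something this sketch addresses.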