2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016
DOI: 10.1109/icassp.2016.7472934

DNN-based enhancement of noisy and reverberant speech

Abstract: In the real world, speech is usually distorted by both reverberation and background noise. In such conditions, speech intelligibility is degraded substantially, especially for hearing-impaired (HI) listeners. As a consequence, it is essential to enhance speech in noisy and reverberant environments. Recently, deep neural networks have been introduced to learn a spectral mapping to enhance corrupted speech, and have shown significant improvements in objective metrics and automatic speech recognition scores. However…

Cited by 69 publications (33 citation statements)
References 23 publications
“…In this work, we use masking based source separation paradigm (e.g., used in [4,25,26]) where a DNN is used to predict a time-frequency mask corresponding to a target speaker. The ESTOI computation however is done for estimated and reference source spectra.…”
Section: Implicit Time-frequency Masking (mentioning)
confidence: 99%
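
The masking paradigm named in the statement above can be illustrated with a minimal sketch: a mask with values in [0, 1] is multiplied element-wise with the mixture's magnitude spectrogram to estimate the target speaker's spectrum. The function name `apply_tf_mask` and the toy numbers are assumptions made for illustration; in the cited work the mask would be predicted by a DNN, which is not modeled here.

```python
def apply_tf_mask(mixture_mag, mask):
    """Multiply a magnitude spectrogram by a time-frequency mask, bin by bin.

    Both arguments are lists of rows (one row per frequency bin, one column
    per time frame). This is a hypothetical helper, not code from the paper.
    """
    if len(mixture_mag) != len(mask):
        raise ValueError("spectrogram and mask need the same number of rows")
    estimate = []
    for mag_row, mask_row in zip(mixture_mag, mask):
        if len(mag_row) != len(mask_row):
            raise ValueError("spectrogram and mask need the same number of frames")
        estimate.append([m * g for m, g in zip(mag_row, mask_row)])
    return estimate


# Toy example: 2 frequency bins x 2 time frames (values are made up).
mixture = [[2.0, 4.0], [1.0, 3.0]]
mask = [[1.0, 0.5], [0.0, 1.0]]  # stand-in for a DNN-predicted mask
estimate = apply_tf_mask(mixture, mask)
```

A mask value of 1 keeps a bin unchanged, 0 removes it, and intermediate values attenuate it.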
“…The activation functions for the T-F mask φ_g, variance φ_σ, and hidden units φ_h were the sigmoid function, exponential function, and rectified linear unit (ReLU), respectively. The context window size was Q = 5, and the variance regularization parameter in (15) was C_σ = 10 −41 . The Adam method [47] was used as a gradient method.…”
Section: Methods (mentioning)
confidence: 99%
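
As a rough illustration of why those three activations suit their roles: the sigmoid bounds a mask to (0, 1), the exponential keeps a variance strictly positive, and ReLU is a common choice for hidden units. The sketch below only evaluates the functions on scalars; the names `sigmoid`, `relu`, and `output_heads` are assumptions for illustration and do not reproduce the cited network.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x):
    """Rectified linear unit, used here for hidden activations."""
    return max(0.0, x)


def output_heads(mask_preact, variance_preact):
    """Map two pre-activations to a mask in (0, 1) and a positive variance.

    Hypothetical helper: the real model produces these per T-F bin.
    """
    return sigmoid(mask_preact), math.exp(variance_preact)


mask_value, variance_value = output_heads(0.0, 0.0)
hidden_value = relu(-1.5)
```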
“…Secondly, a sampling method needs to be defined for eqns. (9) and (11), that incorporates both clean and noisy inputs. To do this, a pair of corresponding clean and noisy spectra sequences, z, are sampled, from which G generates a fake sample x_t.…”
Section: Discriminator Architecture (mentioning)
confidence: 99%