Interspeech 2019
DOI: 10.21437/interspeech.2019-1850
Convolutional Neural Network-Based Speech Enhancement for Cochlear Implant Recipients

Abstract: Attempts to develop speech enhancement algorithms with improved speech intelligibility for cochlear implant (CI) users have met with limited success. To improve speech enhancement methods for CI users, we propose to perform speech enhancement in a cochlear filter-bank feature space, a feature-set specifically designed for CI users based on CI auditory stimuli. We leverage a convolutional neural network (CNN) to extract both stationary and non-stationary components of environmental acoustics and speech. We prop…
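As a rough illustration of the kind of pipeline the abstract outlines, the sketch below maps noisy cochlear filter-bank features to a multiplicative enhancement mask with a small CNN. The 22-channel filter bank, the PyTorch layer sizes, and all names here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' implementation): a small CNN that maps
# noisy cochlear filter-bank features to an enhancement mask. The 22-channel
# filter bank and layer sizes are assumptions for illustration.
import torch
import torch.nn as nn

class FilterBankEnhancer(nn.Module):
    def __init__(self, n_channels: int = 22):
        super().__init__()
        # 2-D convolutions over (time, filter-bank channel) patches capture
        # both stationary and non-stationary local structure.
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(5, 3), padding=(2, 1)),
            nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=(5, 3), padding=(2, 1)),
            nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),
            nn.Sigmoid(),          # per-bin gain in [0, 1]
        )

    def forward(self, noisy_fbank: torch.Tensor) -> torch.Tensor:
        # noisy_fbank: (batch, frames, channels) log filter-bank energies
        x = noisy_fbank.unsqueeze(1)           # -> (batch, 1, frames, channels)
        mask = self.net(x).squeeze(1)          # -> (batch, frames, channels)
        return mask * noisy_fbank              # masked (enhanced) features

# Example: enhance a batch of 100-frame, 22-channel feature maps.
model = FilterBankEnhancer()
enhanced = model(torch.randn(4, 100, 22))
```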

Cited by 34 publications (27 citation statements)
References 34 publications
“…Lately, there has been increasing interest in nonlinear models, specifically, Deep Neural Networks (DNNs) [21,22,23,24]. In Deep Clustering (DPCL) [25,26], first, the time-frequency bins of the mixtures are mapped into an embedding space; then, a clustering algorithm is performed in the embedding space; finally, a binary mask is generated based on each cluster to reconstruct speech of each speaker.…”
Section: Introduction
confidence: 99%
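A minimal sketch of the DPCL masking procedure this quote describes: given per-bin embeddings from a trained network (assumed precomputed here), cluster them and turn each cluster into a binary mask over the mixture. scikit-learn's k-means stands in for the clustering step; all shapes and names are illustrative assumptions.

```python
# Minimal sketch of the Deep Clustering (DPCL) masking step described above.
# Assumes embeddings have already been produced by a trained network; the
# shapes and the use of scikit-learn k-means are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def dpcl_masks(embeddings: np.ndarray, mixture_tf: np.ndarray, n_speakers: int = 2):
    """embeddings: (T*F, D) embeddings, one per time-frequency bin.
    mixture_tf: (T, F) complex mixture spectrogram."""
    T, F = mixture_tf.shape
    labels = KMeans(n_clusters=n_speakers, n_init=10).fit_predict(embeddings)
    sources = []
    for k in range(n_speakers):
        binary_mask = (labels == k).reshape(T, F)   # one binary mask per cluster
        sources.append(binary_mask * mixture_tf)    # masked spectrogram of speaker k
    return sources

# Example with random embeddings for a 100-frame, 257-bin mixture.
emb = np.random.randn(100 * 257, 40)
mix = np.random.randn(100, 257) + 1j * np.random.randn(100, 257)
separated = dpcl_masks(emb, mix)
```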
“…In addition to unsupervised SE approaches, numerous machine-learning-based algorithms have been used in the single-channel SE field. For these approaches, a denoising model is often prepared in a data-driven manner without imposing strong statistical assumptions on the clean speech and noise signals, and the noisy speech signal is processed by the denoising model to extract the clean speech signal [43][44][45][46][47][48][49]. Notable examples include non-negative matrix factorization [50], compressive sensing [51], and sparse coding [52].…”
Section: Improving the Intelligibility of Speech for Simulated Electr…
confidence: 99%
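As one concrete example of the data-driven pattern described above, the sketch below uses non-negative matrix factorization with speech and noise bases assumed to be learned offline from clean training data; the function names, sizes, and Wiener-style recombination are illustrative assumptions, not any cited paper's method.

```python
# Minimal NMF-based denoising sketch for the approach described above.
# W_speech / W_noise are bases assumed to be learned offline; only the
# activations are estimated on the noisy input (bases stay fixed).
import numpy as np
from sklearn.decomposition import non_negative_factorization

def nmf_denoise(noisy_mag, W_speech, W_noise):
    """noisy_mag: (F, T) magnitude spectrogram; W_speech/W_noise: (F, K) bases."""
    bases = np.concatenate([W_speech, W_noise], axis=1)          # (F, 2K)
    n_comp = bases.shape[1]
    T = noisy_mag.shape[1]
    # Estimate activations only; the learned bases stay fixed (update_H=False).
    acts, _, _ = non_negative_factorization(
        noisy_mag.T, W=np.random.rand(T, n_comp), H=bases.T,
        n_components=n_comp, init="custom", update_H=False,
    )
    k = W_speech.shape[1]
    speech_est = (acts[:, :k] @ W_speech.T).T                    # (F, T)
    noise_est = (acts[:, k:] @ W_noise.T).T
    # Wiener-style gain built from the two reconstructions.
    return noisy_mag * speech_est / (speech_est + noise_est + 1e-8)

# Example with random bases and a random 257-bin, 100-frame spectrogram.
F, K = 257, 20
clean = nmf_denoise(np.random.rand(F, 100),
                    np.random.rand(F, K), np.random.rand(F, K))
```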
“…Generally, the networks are designed based on either the classical building blocks for DNN such as feed-forward network (FNN) [12][13][14], convolutional neural network (CNN) [15][16][17], recurrent neural network (RNN) [18][19][20], or concatenation of these building blocks such as UNet [21][22][23] and convolutional recurrent neural network (CRNN) [24,25]. Although the architectures are designed to effectively model the different time-frequency dependencies of speech and noise, there always lacks an explicit criterion for the model design, which makes it hard to interpret and optimize the intermediate representations, and also makes the performance highly rely on the diversity of the training data.…”
Section: Introduction
confidence: 99%
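The sketch below combines two of the building blocks mentioned in the quote above, a convolutional front end followed by a recurrent layer (a minimal CRNN), to predict a per-bin mask. Layer sizes and names are illustrative assumptions rather than a specific published architecture.

```python
# Minimal CRNN sketch: convolutional layers model local time-frequency
# structure, a GRU models longer-term temporal context, and a linear layer
# predicts a per-bin mask. All sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class TinyCRNN(nn.Module):
    def __init__(self, n_freq: int = 257, hidden: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 8, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.rnn = nn.GRU(8 * n_freq, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_freq)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, frames, freq) magnitude spectrogram
        b, t, f = spec.shape
        x = self.conv(spec.unsqueeze(1))             # (b, 8, t, f): local T-F patterns
        x = x.permute(0, 2, 1, 3).reshape(b, t, -1)  # flatten channels x freq per frame
        x, _ = self.rnn(x)                           # temporal modelling across frames
        mask = torch.sigmoid(self.out(x))            # per-bin mask in [0, 1]
        return mask * spec

# Example: enhance a batch of 100-frame, 257-bin magnitude spectrograms.
enhanced = TinyCRNN()(torch.randn(2, 100, 257).abs())
```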