2021
DOI: 10.1121/10.0006565

Deep learning based speaker separation and dereverberation can generalize across different languages to improve intelligibility

Abstract: The practical efficacy of deep learning based speaker separation and/or dereverberation hinges on its ability to generalize to conditions not employed during neural network training. The current study was designed to assess the ability to generalize across extremely different training versus test environments. Training and testing were performed using different languages having no known common ancestry and correspondingly large linguistic differences: English for training and Mandarin for testing. Additional ge…

Cited by 9 publications (3 citation statements)
References: 37 publications
“…Non-spatial (single microphone) noise reduction algorithms employed in hearing aids have so far not been able to provide improvements in speech intelligibility 6, 10–15. A few recent studies have shown that deep learning-based denoising 16 18 or separation of multiple competing speakers 19, 20 can provide improvements in speech intelligibility for cochlear implant users 21 and hearing aid users with severe-profound 22 hearing loss under fixed signal-to-noise (SNR) conditions 16 18. For the majority of hearing aid users, with less severe hearing loss 22, it is more challenging to provide intelligibility improvements through denoising.…”
Section: Introduction (mentioning)
confidence: 99%
“…This is an important consideration for deep learning algorithms, which can overfit to training or within-domain data (Pandey and Wang, 2020; Rehr and Gerkmann, 2021). These findings are promising, though further work is required to assess whether different languages (Healy et al., 2021), microphone configurations and/or spatial arrangements between the listener and target speaker would also produce similar results. An advantage of deep-learning approaches is that, by choosing the variability of acoustic scenarios included in the training data, the algorithms can be fine-tuned to specific scenarios or made robust to a wide range of scenarios, albeit with some expected drop in performance over scenario-specific training.…”
Section: Discussion (mentioning)
confidence: 88%
“…However, further exploration of different languages, in particular tonal languages, remains to be investigated. Related research showed that such generalization can be achieved by deep learning algorithms (Healy et al., 2021c).…”
Section: Discussion (mentioning)
confidence: 99%