2023
DOI: 10.3390/s23031380

A Survey on Low-Latency DNN-Based Speech Enhancement

Abstract: This paper presents recent advances in low-latency, single-channel, deep neural network-based speech enhancement systems. The sources of latency and their acceptable values in different applications are described. This is followed by an analysis of the constraints imposed on neural network architectures. Specifically, the causal units used in deep neural networks are presented and discussed in the context of their properties, such as the number of parameters, the receptive field, and computational complexity. …

Cited by 7 publications (5 citation statements)
References 118 publications
“…11,12 These approaches, however, are computationally expensive and not yet well-suited for low-power real-time applications. 13…”
Section: Introduction (mentioning)
confidence: 99%
“…11,12 These approaches, however, are computationally expensive and not yet well-suited for low-power real-time applications. 13 We recently developed a biologically oriented sound segregation algorithm 14 (BOSSA), which is designed to separate competing sounds based on differences in spatial location. Taking its inspiration from binaural auditory system, this algorithm requires only two input signals, and does not sacrifice spatial cues.…”
Section: Introduction (mentioning)
confidence: 99%
“…Noise reduction can also be regarded as increasing speech intelligibility [2,3]. There has been extensive recent interest in the emerging area of machine learning-based speech enhancement techniques [4][5][6][7]. A particularly important application of speech intelligibility is its use in signal processors for cochlear implants (CIs) used by hearing-impaired listeners [8][9][10].…”
Section: Introduction (mentioning)
confidence: 99%
“…The low-computational cost is a property desired in all systems, but most crucial for on-device deployment on smartphones or hearing aids. These translate into algorithmic latency and hardware latency that both add up to the overall processing latency [10,11]. The implementation aspects of speech enhancement are important parts of current research.…”
Section: Introduction (mentioning)
confidence: 99%
“…The frame is predicted based on current and previous inputs and overlap-added with previous predictions for final enhanced speech signal generation. The recent challenges that focus on real-time speech enhancement, such as Interspeech 2020 Deep Noise Suppression (DNS) challenge [16] and Clarity [17], put tight requirements on the maximum processing time and allowed look-ahead limited to respectively 40 ms and 5 ms. A comprehensive survey on recent advancements in low-latency speech enhancement can be found in [11].…”
Section: Introduction (mentioning)
confidence: 99%