2022
DOI: 10.1121/10.0011396
Low-latency monaural speech enhancement with deep filter-bank equalizer

Abstract: It is highly desirable that speech enhancement algorithms can achieve good performance while keeping low latency for many applications, such as digital hearing aids, mobile phones, acoustically transparent hearing devices, and public address systems. To improve the performance of traditional low-latency speech enhancement algorithms, a deep filter-bank equalizer (FBE) framework was proposed that integrated a deep learning-based subband noise reduction network with a deep learning-based shortened digital filter…
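The hybrid strategy the abstract describes — estimate noise-reduction gains per frequency band, then apply them through a short time-domain filter — can be sketched minimally as follows. This is an illustrative frequency-sampling construction, not the paper's actual FBE network; the function name and parameters are assumptions for illustration.

```python
import numpy as np

def gains_to_short_filter(gains, filter_len=16):
    """Turn per-band magnitude gains into a short, causal FIR filter via
    frequency sampling: inverse-FFT the zero-phase gain response, shift it
    to make it causal, then truncate with a window. Filtering in the time
    domain with a short filter keeps the processing delay low, while the
    gains themselves can still be estimated in the frequency domain."""
    n_fft = 2 * (len(gains) - 1)
    h = np.fft.irfft(gains, n_fft)               # zero-phase impulse response
    h = np.roll(h, filter_len // 2)              # shift: causal, ~filter_len/2 samples delay
    h = h[:filter_len] * np.hanning(filter_len)  # truncate and taper
    return h
```

The delay of such a filter is roughly `filter_len / 2` samples, so at a 16 kHz sampling rate a 16-tap filter adds only about 0.5 ms, which is the kind of trade-off that motivates the hybrid design.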

Cited by 8 publications (6 citation statements)
References 30 publications
“…For hearing aids it is generally believed that the delay should be as small as possible, and some hearing aids have delays as small as 0.5 ms. It is still challenging to reduce the delay to below 4 ms without affecting performance, although some researchers have tried to solve this problem (Vary, 2006; Schröter et al., 2022; Zheng et al., 2022). Tammen & Doclo (2021) proposed a deep multiframe approach to reduce the delay for hearing aids, and this approach was further extended for binaural noise reduction (Tammen & Doclo, 2022).…”
Section: Conclusion and Future Prospects
confidence: 99%
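The delay figures quoted here (0.5 ms vs. a 4 ms target) follow directly from frame-based processing: a block algorithm must buffer at least one full frame before it can emit output, plus any look-ahead. A small helper makes the arithmetic explicit (the function name is illustrative, not from any cited work):

```python
def algorithmic_latency_ms(frame_len, hop, fs, lookahead_frames=0):
    """Minimum algorithmic delay of a frame-based enhancer: one full frame
    must be buffered before output, plus hop samples for each future frame
    the model is allowed to see (quasi-causal look-ahead)."""
    return 1000.0 * (frame_len + lookahead_frames * hop) / fs

# A common 32 ms frame at 16 kHz already far exceeds the ~4 ms hearing-device target:
print(algorithmic_latency_ms(512, 256, 16000))  # 32.0
print(algorithmic_latency_ms(64, 32, 16000))    # 4.0
```

This is why low-latency designs shrink the frame (hurting frequency resolution) or move the final filtering into the time domain, as in the hybrid approaches cited above.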
“…For the latter, the short-term complex spectrum of the clean speech is estimated, the spectrum is converted back to a time-domain signal, and this process is repeated for a series of overlapping frames (time segments) to reconstruct the complete time-domain signal, using the overlap-add method (Allen, 1977; Boll, 1979; Ephraim & Malah, 1984; Griffin & Lim, 1984; Loizou, 2013; Wang & Chen, 2018). There are some hybrid methods, in which the appropriate gain for each of several frequency sub-bands is estimated in a first stage, and a time-domain enhancement filter is designed in a second stage to partially remove the noise and reverberation (Vary, 2006; Löllmann & Vary, 2007; Zheng et al., 2022). Among these methods, frequency-domain methods have been the most extensively studied, for the following reasons.…”
Section: Introduction
confidence: 99%
“…where S(t) is the clean, undistorted signal, H(t) is the room impulse response used to model reverberation, X(t) is the signal with reverberation, N(t) is additive noise, and * is the convolution operation. After splitting the signal into overlapping time windows and applying the fast Fourier transform, the resulting signal representation can be written as follows: [4], adaptive filtering [5]. However, the largest number of works [6][7][8][9][10][11][12][13] has been concerned with the development of noise-masking methods.…”
Section: Problem Statement
unclassified
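The equations themselves were lost in extraction from this excerpt, but the symbol definitions given (clean signal, room impulse response, additive noise, convolution) match the standard reverberant-plus-noise model, which in the excerpt's notation would read:

```latex
% Time-domain model: reverberation as convolution, plus additive noise
X(t) = S(t) * H(t) + N(t)
% After windowing and an FFT, the conventional STFT-domain approximation
% (k = frequency bin, l = frame index) is
X(k, l) \approx S(k, l)\, H(k, l) + N(k, l)
```

This reconstruction is the conventional form consistent with the definitions in the excerpt, not necessarily the cited paper's exact notation.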
“…Machine learning-based speech enhancement has recently shown great potential to improve speech intelligibility in the presence of multi-talker background noise, both for normal-hearing listeners (Bentsen et al., 2018; Healy et al., 2021b; Zheng et al., 2022; Shankar et al., 2020; Graetzer and Hopkins, 2022) and for hearing aid and cochlear implant users (Goehring et al., 2019; Keshavarzi et al., 2018; Healy et al., 2017), demonstrating high efficacy and viability (Healy et al., 2023). Moreover, deep learning has advanced the state of the art in various audio processing tasks, including speech enhancement (Nas et al., 2021; Thoidis et al., 2020), speech recognition (Nassif et al., 2019), audio tagging (Vrysis et al., 2020; 2021), speaker diarisation (Tsipas et al., 2020), and speaker verification (Thoidis et al., 2023).…”
Section: Introduction
confidence: 99%
“…Previous studies have explored various approaches, including both non-causal methods (Lai et al., 2018; Healy et al., 2015) and causal methods (Goehring et al., 2016; Healy et al., 2023), as well as techniques that incorporate a short time window into the future (Bramslow et al., 2018), which are termed quasi-causal. Some studies have focused on experimenting with smaller networks that are suitable for real-time implementation in current systems (Zheng et al., 2022; Healy et al., 2021b; Bentsen et al., 2018), while others have utilized larger networks to showcase the potential of these methods (Healy et al., 2023; 2021b). Although the latter approach poses limitations to integrating deep learning algorithms into hearing technology, the continuous advancement in processing capabilities of modern computing systems is gradually enabling the deployment of larger and more complex machine learning methods in real-world settings.…”
Section: Introduction
confidence: 99%
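The causal / quasi-causal distinction drawn in this excerpt comes down to which frames a model is allowed to see when processing frame t. A minimal sketch of the input-stacking step makes it concrete (the function and parameter names are illustrative, not from any cited system):

```python
import numpy as np

def causal_context(frames, past=4, future=0):
    """Stack each spectral frame with `past` previous frames and `future`
    upcoming frames as model input. future=0 yields a causal model that
    can run in real time; future>0 yields a quasi-causal model whose
    look-ahead adds future * hop samples of extra latency. Edges are
    zero-padded so every frame gets a context of the same size."""
    T, F = frames.shape
    padded = np.pad(frames, ((past, future), (0, 0)))  # zero-pad in time
    return np.stack([padded[t:t + past + future + 1].ravel()
                     for t in range(T)])
```

With `future=0`, the context for the first frame contains only zeros and the frame itself — the output at time t never depends on frames after t, which is the property that makes real-time deployment possible.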