Deep Adaptation Control for Acoustic Echo Cancellation

Ivry, Amir; Cohen, Israel; Berdugo, Baruch

doi:10.1109/icassp43922.2022.9746557

Cited by 12 publications

(1 citation statement)

References 16 publications

“…In particular, the approximation of the interference PSD of the Kalman filter model, capturing near-end speech and noise, by non-negative dictionaries [19] and Deep Neural Networks (DNNs) [20] has shown significant performance improvements relative to traditional, i.e., non-trainable, PSD estimators. Besides the support of traditional step-size estimators by trainable PSD models, machine learning has also been used to directly approximate optimum step-sizes of a time-domain Normalized Least-Mean-Squares (NLMS) algorithm [21] or Short-Time Fourier Transform (STFT)-domain recursive least squares algorithm [22]. Yet, despite significant performance improvements, it remains unclear whether machine learning-supported approximation of target step-sizes, e.g., a Kalman filter step-size with oracle statistics, is optimum w.r.t.…”

Section: Introduction (mentioning)

confidence: 99%

End-To-End Deep Learning-Based Adaptation Control for Frequency-Domain Adaptive System Identification

Haubner; Brendel; Kellermann

2022

ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The attenuation of acoustic loudspeaker echoes remains one of the open challenges in achieving pleasant full-duplex hands-free speech communication. In many modern signal enhancement interfaces, this problem is addressed by a linear acoustic echo canceler which subtracts a loudspeaker echo estimate from the recorded microphone signal. To obtain precise echo estimates, the parameters of the echo canceler, i.e., the filter coefficients, need to be estimated quickly and precisely from the observed loudspeaker and microphone signals. For this, a sophisticated adaptation control is required to deal with high-power double-talk and to rapidly track the time-varying acoustic environments that portable devices often face. In this paper, we address this problem by end-to-end deep learning. In particular, we propose to infer the step-size for a least mean squares frequency-domain adaptive filter update by a Deep Neural Network (DNN). Two different step-size inference approaches are investigated: broadband approaches, which use a single DNN to jointly infer step-sizes for all frequency bands, and narrowband methods, which exploit individual DNNs per frequency band. The discussion of the benefits and disadvantages of both approaches leads to a novel hybrid approach which shows improved echo cancellation while requiring only small DNN architectures. Furthermore, we investigate the effect of different loss functions, signal feature vectors, and DNN output layer architectures on the echo cancellation performance, from which we obtain valuable insights into the general design and functionality of DNN-based adaptation control algorithms.
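To make the role of the step-size concrete, here is a minimal sketch of a linear echo canceler whose step-size comes from a pluggable function. It is not the method of the abstract above: it uses a plain time-domain NLMS update rather than a frequency-domain adaptive filter, and the hook `step_size_fn` simply returns a constant where a trained DNN would supply a value. The names `nlms_echo_canceller`, `step_size_fn`, `constant_step`, and the toy echo path `h` are all assumptions made for illustration.

```python
import numpy as np


def nlms_echo_canceller(x, y, num_taps, step_size_fn, eps=1e-8):
    """Time-domain NLMS sketch.

    x: loudspeaker (far-end) signal, y: microphone signal.
    step_size_fn: hypothetical hook mapping (input buffer, error) to a step-size.
    Returns the error signal (echo-reduced output) and the final filter estimate.
    """
    w = np.zeros(num_taps)      # echo path estimate
    buf = np.zeros(num_taps)    # most recent loudspeaker samples, newest first
    e = np.zeros(len(x))
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e[n] = y[n] - w @ buf               # subtract the echo estimate
        mu = step_size_fn(buf, e[n])        # a DNN could be queried here instead
        w = w + mu * e[n] * buf / (buf @ buf + eps)  # normalized LMS update
    return e, w


def constant_step(buf, err, mu=0.5):
    """Placeholder adaptation control: a fixed step-size (illustrative assumption)."""
    return mu


# Toy scenario: white-noise loudspeaker signal, short noise-free echo path.
rng = np.random.default_rng(0)
h = np.array([0.6, -0.3, 0.1, 0.05])        # assumed echo path for the demo
x = rng.standard_normal(4000)
y = np.convolve(x, h)[: len(x)]             # microphone picks up only the echo
e, w = nlms_echo_canceller(x, y, num_taps=len(h), step_size_fn=constant_step)
```

In this noise-free toy case a fixed step-size is enough for `w` to approach `h`. With near-end speech or noise added to `y`, a fixed value trades convergence speed against robustness during double-talk, which is the gap that adaptation control is meant to close.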
