2020
DOI: 10.48550/arxiv.2001.10601
Preprint

Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement

Cited by 3 publications (7 citation statements)
References 24 publications
“…The same model parameters were used for both the challenge 16-kHz evaluation and our own 48-kHz VCTK evaluation, demonstrating the capability to operate on speech with different bandwidths. The quality also exceeds that of the baseline [29] algorithm.…”
Section: Experiments and Results (mentioning)
confidence: 92%
“…The n-th echo has a delay of nτ + jitter and a gain of ρ^n λ. N and ρ are chosen so that when the total delay reaches RT60, we have ρ^N ≤ 1e−3. λ, τ and RT60 are sampled uniformly respectively over [0, 0.3], [10, 30] ms, [0.3, 1.3] sec.…”
Section: Implementation Details (mentioning)
confidence: 99%
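
The echo model quoted in the Implementation Details statement can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the citing authors' code: the function names are hypothetical, the jitter distribution (uniform within ±10% of τ) is a guess because the statement does not specify it, and N is taken as the smallest integer with Nτ ≥ RT60, with ρ then set so that ρ^N equals 1e−3.

```python
import math
import random


def sample_echo_params(rng):
    """Draw lambda, tau (seconds) and RT60 (seconds) from the quoted ranges."""
    lam = rng.uniform(0.0, 0.3)        # initial gain, [0, 0.3]
    tau = rng.uniform(0.010, 0.030)    # echo spacing, [10, 30] ms
    rt60 = rng.uniform(0.3, 1.3)       # reverberation time, [0.3, 1.3] s
    return lam, tau, rt60


def make_echoes(lam, tau, rt60, rng, jitter_frac=0.1):
    """Return a list of (delay_seconds, gain) pairs, one per echo.

    Assumption: jitter is uniform in [-jitter_frac * tau, +jitter_frac * tau].
    """
    n_echoes = math.ceil(rt60 / tau)       # N: total delay N * tau reaches RT60
    rho = 1e-3 ** (1.0 / n_echoes)         # chosen so that rho ** N == 1e-3
    echoes = []
    for n in range(1, n_echoes + 1):
        jitter = rng.uniform(-jitter_frac, jitter_frac) * tau
        echoes.append((n * tau + jitter, lam * rho ** n))
    return echoes


def add_reverb(signal, echoes, sample_rate):
    """Add delayed, attenuated copies of `signal` (a list of floats) to itself."""
    offsets = [(int(round(delay * sample_rate)), gain) for delay, gain in echoes]
    tail = max(off for off, _ in offsets)  # extra samples needed for the last echo
    out = list(signal) + [0.0] * tail
    for off, gain in offsets:
        for i, x in enumerate(signal):
            out[off + i] += gain * x
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    lam, tau, rt60 = sample_echo_params(rng)
    echoes = make_echoes(lam, tau, rt60, rng)
    wet = add_reverb([1.0, 0.5, 0.25], echoes, sample_rate=16000)
    print(len(echoes), len(wet))
```

Rounding each delay to a whole sample keeps the sketch free of dependencies; a production augmentation pipeline would more likely build an impulse response and convolve it with the signal.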
“…While considering causal methods, the authors in [46] propose a convolutional recurrent network at the spectral level for real-time speech enhancement, while Xia, Yangyang, et al [30] suggest to remove the convolutional layers and apply a weighted loss function to further improve results in the real-time setup. Recently, the authors in [23] provide impressive results for both causal and non-causal models using a minimum mean-square error noise power spectral density tracker, which employs a temporal convolutional network (TCN) a priori SNR estimator.…”
Section: Related Work (mentioning)
confidence: 99%
“…[15]. Recent innovations have introduced transformer-based models such as DPTNet [16], enhancing context-aware modeling, alongside breakthroughs in noise suppression with NSNet [17] and NSNet2 [18]. These DNN solutions have showcased a marked improvement over their traditional counterparts, particularly in environments dominated by low-SNR non-stationary noises.…”
Section: Introduction (mentioning)
confidence: 99%
“…Building on the foundation laid by NSNet [17,18], this study shifts focus towards innovating training methodologies for RNN-based speech enhancement models suited for MCUs. Our research investigates training methodologies, aiming to optimize model performance from the start rather than adjusting post-training.…”
Section: Introduction (mentioning)
confidence: 99%