ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9413575

Bandwidth Extension is All You Need

Abstract: Speech generation and enhancement have seen recent breakthroughs in quality thanks to deep learning. These methods typically operate at a limited sampling rate of 16-22kHz due to computational complexity and available datasets. This limitation imposes a gap between the output of such methods and that of high-fidelity (≥44kHz) real-world audio applications. This paper proposes a new bandwidth extension (BWE) method that expands 8-16kHz speech signals to 48kHz. The method is based on a feed-forward WaveNet archi…

Cited by 36 publications (18 citation statements)
References 29 publications
“…Although only a few studies have applied GAN models to bandwidth extension of music signals [10], [39], many recent works have applied them for speech [14], [13], [40]. Eskimez et al [40] proposed one of the earliest works using an adversarial approach for speech super-resolution.…”
Section: B. GANs for Audio Bandwidth Extension (mentioning)
confidence: 99%
“…Moreover, time-domain discriminators help keep the quasi-periodic structure of raw audio, generating a more naturally sounding harmonic series. We also experimented incorporating STFT-based discriminators, following the methodology suggested by Su et al [13], but we noticed a considerable decline in the generated audio quality. The STFT discriminators force the generator to inpaint harmonic-like shapes in the spectrogram without taking into account their coherence with the lower part of the spectrum, thus generating metallic and unrealistic sounds.…”
Section: B. Training Objective (mentioning)
confidence: 99%