ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8682367

Learning Bandwidth Expansion Using Perceptually-motivated Loss

citations
Cited by 14 publications
(13 citation statements)
references
References 16 publications
“…It introduces artifacts specific to SE (speech enhancement), and may also be misaligned. [27]: consists of MOS tests to verify quality of 3 different bandwidth expansion algorithms.…”
Section: Correlation To MOS (mentioning)
confidence: 99%
“…WaveNet [23] and its variants for BWE [24,25] use dilated convolutions to enable large receptive field while preserving the original resolution. Feng et al [6] used FFTNet [26] which resembles the classical FFT process. Ling et al [27] proposed a hierarchical RNN to utilize the waveform structures.…”
Section: Related Work (mentioning)
confidence: 99%
“…The first claim is not obvious because traditional bandwidth extension (BWE) research has focused on lifting narrow-band signals to 16kHz (from 4-8kHz), primarily for telephony. As far as we are aware, the only previous work that extends to as high as 44kHz (with moderate success) is that of Feng et al [6].…”
Section: Introduction (mentioning)
confidence: 99%
“…We consider an exhaustive set of 10 different datasets for this evaluation. These datasets span over a variety of well-known speech problems; (1) Speech Synthesis (VoCo [65] and FFTnet [68]), (2) Speech Enhancement (Dereverberation [66], Noizeus [71], HiFi-GAN [67]), (3) Voice Conversion (VCC-2018 [70]), (4) Speech Source Separation (PEASS [69]), (5) Telephony Degradations [72], (6) Bandwidth Extension (BWE [73]), and (7) General Degradation's (Simulated [6]). Please refer to supplementary material for details about these datasets.…”
Section: Subjective Evaluations (mentioning)
confidence: 99%
“…Name Simulated [6] FFTnet [68] BWE [73] HiFi-GAN [ Tables 1 and 2 shows correlations with MOS on all 8 datasets. Correlations for full-reference SQA methods (PESQ and CDPAM), and non-intrusive DNSMOS are also shown.…”
Section: Subjective Evaluations (mentioning)
confidence: 99%