2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2019
DOI: 10.1109/apsipaasc47483.2019.9023132

Speech Loss Compensation by Generative Adversarial Networks

Cited by 18 publications (5 citation statements)
References 21 publications
“…In particular, we assume that information about packet losses is not available as is customary in typical PCL settings. A notable exception to this is audio inpainting, which can be applied without knowing the position of the missing audio (blind audio inpainting) [16,20], but these works do not use data augmentation and speech recognition for evaluation. Despite the difficulties in making comparisons, we have found that our results are in line with a few recently published works.…”
Section: Word Error (mentioning)
confidence: 99%
“…Deep learning techniques have proved to be able to extract higher level features from less processed data and to be capable of modeling, predicting, and generalizing better than classical (statistical) techniques. PLC has not been an exception, and several approaches have been proposed in the last few years for deep learning PLC based on regression either in waveform or time-frequency domains [13][14][15][16][17], based on autoencoders [18,19] and based on Generative Adversarial Networks (GANs) [20][21][22][23].…”
Section: Introduction (mentioning)
confidence: 99%
“…One study created training samples by adding noise with different signal-to-noise (SNR) levels on clean speech signals [41]. Another work on speech bandwidth enhancement included input training samples that are created using low-pass filters with different cut-off frequencies [42]. In all aforementioned examples, the samples that illustrate multiple conditions are perceptually different.…”
Section: A Motivation (mentioning)
confidence: 99%
“…The performance of the different networks trained to fill in speech gaps is compared with a state of the art SEGAN baseline [28] trained on our data in Table 3. T-UNet outperforms the baseline, while TF-UNet does not.…”
Section: Gaps In Speech (mentioning)
confidence: 99%