10th ISCA Workshop on Speech Synthesis (SSW 10) 2019
DOI: 10.21437/ssw.2019-16

Novel Inception-GAN for Whispered-to-Normal Speech Conversion

Abstract: Recently, Convolutional Neural Network (CNN)-based Generative Adversarial Networks (GANs) have been used for the Whisper-to-Normal Speech (WHSP2SPCH) conversion task. These CNN-based GANs are difficult to train because of their computational complexity. The goal of the generator in a GAN is to map the features of whispered speech to those of normal speech efficiently. To improve performance, we need either to tune the cost functions by changing their associated hyperparameters or to make the gener…

Cited by 9 publications (8 citation statements)
References 25 publications
“…One of these, namely Lian et al (2020), was excluded from the survey due to the lack of availability of a full text version. Additionally, following the same criteria, 3 other papers were collected through Web of Science and/or Google Scholar, namely Niranjan et al (2020), Gao et al (2021), and Patel et al (2019). It is also worth noting that, although beyond the scope of this survey, the prior research in whisper-to-normal speech conversion is largely covered by the background sections within the selected papers.…”
Section: Paper Selection (mentioning)
confidence: 99%
“…In Patel et al (2019), the authors adapted the Inception modules from Szegedy et al (2015), proposing an Inception-GAN architecture, aimed at reducing computational complexity. In contrast with a typical CNN based GAN, also implemented as baseline for comparison purposes, the generator and discriminator parts of the GAN are made of stacks of 4 and 3 inception modules, respectively, and only 1 convolutional layer each.…”
Section: Inception-GAN vs CNN-GAN (mentioning)
confidence: 99%
“…Moreover, parallel data requires time alignment as pre-processing. In addition, traditional method uses 2-step sequential method for WHSP2SPCH conversion [25], [26]. For CycleGAN based conversion, in first step, one CycleGAN is trained for cepstral feature mapping of whisper to normal speech, and in second step, another CycleGAN is trained for F0 prediction, which heavily relies on previously trained CycleGAN [25].…”
Section: Introduction (mentioning)
confidence: 99%