2022
DOI: 10.1007/s00034-022-01998-5

Non-parallel Voice Conversion Based on Perceptual Star Generative Adversarial Network

Cited by 4 publications (4 citation statements)
References 42 publications
“…Non-parallel VC techniques are even more challenging because they do not need parallel data for training. Some successful non-parallel VC methods include variational autoencoder (VAE) [21,27], generative adversarial network (GAN) [22] and its variants such as CycleGAN [17] and StarGAN [18]. Although these methods have focused on transforming a non-parallel corpus into a quasi-parallel corpus and then on learning a conversion function (which is not so straightforward), they can lead to a degradation of speech quality.…”
Section: Related Work
confidence: 99%
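The StarGAN-style objective this statement alludes to combines an adversarial term with cycle-consistency and identity terms. A minimal sketch using toy numpy stand-ins for the generator and discriminator (all functions, weights, and loss coefficients below are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(features, speaker_code):
    # Toy "conversion": shift features by a speaker-dependent offset.
    return features + 0.1 * speaker_code

def discriminator(features):
    # Toy realness score in (0, 1).
    return 1.0 / (1.0 + np.exp(-features.mean()))

# Mel-cepstral features of one source utterance; scalar speaker codes for brevity.
src = rng.standard_normal((36, 128))   # 36 coefficients x 128 frames
spk_src, spk_tgt = 0.0, 1.0

fake = generator(src, spk_tgt)         # source -> target conversion
cycled = generator(fake, spk_src)      # target -> source (cycle)

# The three loss terms a StarGAN-style VC model combines:
adv_loss = -np.log(discriminator(fake))                  # fool the discriminator
cyc_loss = np.abs(cycled - src).mean()                   # cycle-consistency (L1)
id_loss = np.abs(generator(src, spk_src) - src).mean()   # identity mapping

total = adv_loss + 10.0 * cyc_loss + 5.0 * id_loss       # weights are illustrative
```

Because no parallel target utterance exists, the cycle and identity terms stand in for a direct reconstruction loss; this is what lets training proceed on a non-parallel corpus.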
“…Successful techniques have been developed, such as those in [12][13][14][15]. For example, approaches including CycleGAN-VC [16,17], StarGAN-VC [18,19] and VAW-GAN [20,21] have employed generative adversarial networks (GANs) [22] to improve both speech quality and similarity to the target speaker, particularly when a large amount of speech data is available. Other approaches, introduced in [13,15], use Seq2Seq models and aim to separate linguistic features from speaker identity components.…”
Section: Introduction
confidence: 99%
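The disentanglement idea in the last sentence (encode linguistic content separately from speaker identity, then recombine at decoding time) can be sketched with toy linear layers; every name, weight, and dimension here is illustrative and does not reproduce the cited Seq2Seq models:

```python
import numpy as np

# Toy dimensions: 80-dim mel frames, 16-dim content code, 8-dim speaker code.
rng = np.random.default_rng(1)
W_enc = rng.standard_normal((80, 16)) * 0.1       # content-encoder weights
W_dec = rng.standard_normal((16 + 8, 80)) * 0.1   # decoder weights
speaker_table = rng.standard_normal((4, 8))       # learned speaker embeddings

def convert(mel_frames, target_speaker_id):
    """Encode linguistic content, then decode with the target speaker code."""
    content = mel_frames @ W_enc                        # (T, 16) content codes
    spk = speaker_table[target_speaker_id]              # (8,) speaker embedding
    spk_tiled = np.tile(spk, (mel_frames.shape[0], 1))  # repeat over frames
    return np.concatenate([content, spk_tiled], axis=1) @ W_dec  # (T, 80)

utt = rng.standard_normal((120, 80))   # 120 frames from a source speaker
out_a = convert(utt, target_speaker_id=2)
out_b = convert(utt, target_speaker_id=3)
# Same content code, different speaker code -> different output spectra.
```

Swapping only the speaker embedding converts the voice while the content code, shared across both calls, carries the linguistic information.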