2021
DOI: 10.1142/s0218126621501887

High-Quality Many-to-Many Voice Conversion Using Transitive Star Generative Adversarial Networks with Adaptive Instance Normalization

Abstract: This paper proposes a novel high-quality nonparallel many-to-many voice conversion method based on transitive star generative adversarial networks with adaptive instance normalization (Trans-StarGAN-VC with AdaIN). First, we improve the structure of the generator with TransNets to make full use of hierarchical features associated with speech naturalness. In TransNets, many shortcut connections share hierarchical features between the encoding and decoding parts to capture sufficient linguistic and semantic information, …
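
The paper itself does not provide code; as a rough sketch of the AdaIN operation named in the title (assuming PyTorch-style tensors of shape batch x channels x time, with the function name and shapes chosen for illustration rather than taken from the paper):

import torch

def adain(content, style, eps=1e-5):
    # Instance statistics are computed per sample and per channel,
    # over the remaining time dimension of (batch, channels, time) features.
    c_mean = content.mean(dim=2, keepdim=True)
    c_std = content.std(dim=2, keepdim=True) + eps
    s_mean = style.mean(dim=2, keepdim=True)
    s_std = style.std(dim=2, keepdim=True) + eps
    # Normalize the content features, then rescale and shift them with the
    # style statistics (the standard AdaIN formulation of Huang & Belongie).
    return s_std * (content - c_mean) / c_std + s_mean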

Cited by 3 publications (2 citation statements)
References 30 publications

“…Because GANs have good learning ability and the ability to simulate data distributions, they have attracted wide attention in the field of machine learning. They show excellent performance in image generation [12], image translation [13], image enhancement [14], speech generation [15] and speech conversion [16]. GANs are composed of two neural networks: a generator and a discriminator.…”
Section: Introduction (mentioning)
confidence: 99%
“…2. It strengthens the use of pixel information around the image [31]. The advantages of IN are as follows: all elements of a single sample and a single channel are considered when calculating the normalization statistics.…”
(mentioning)
confidence: 99%
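
The statement above concerns where instance normalization (IN) computes its statistics; a minimal check of that claim (assuming PyTorch, with the tensor shape chosen purely for illustration) is:

import torch

# Illustrative only: instance normalization computes mean/variance over all
# elements of a single sample and a single channel, i.e. over the spatial
# dimensions (H, W) of an (N, C, H, W) tensor.
x = torch.randn(2, 3, 8, 8)                       # N=2 samples, C=3 channels
mean = x.mean(dim=(2, 3), keepdim=True)
var = x.var(dim=(2, 3), unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + 1e-5)

inorm = torch.nn.InstanceNorm2d(3, affine=False, eps=1e-5)
print(torch.allclose(manual, inorm(x), atol=1e-6))  # True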