2018
DOI: 10.29007/s4t1

sprocket: Open-Source Voice Conversion Software

Abstract: Statistical voice conversion (VC) is a technique to convert specific non- or paralinguistic information while keeping linguistic information unchanged, and speaker conversion has been studied as a typical application of VC for a few decades. To better understand various VC techniques using a freely available common dataset, the Voice Conversion Challenge (VCC) was launched in 2016 and the 2nd challenge was held in 2018. As one of the baseline systems for VCC 2018, we developed open-source VC software called "sp…

Cited by 37 publications (38 citation statements)
References: 22 publications
“…The results of using mel-cepstral distortion to evaluate the spectral conversion module are given in the objective evaluation results. An internal subjective evaluation was conducted to assess the performance of the NU VC system with the provided baseline system, i.e., "sprocket" [27], where the results are given in the internal subjective evaluation section. Finally, the last three sections describe the official results of the subjective evaluation in VCC 2018.…”
Section: Experimental Conditions (mentioning)
confidence: 99%
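
The statement above names mel-cepstral distortion (MCD) as the objective measure but does not define it. As a rough illustration only, the sketch below uses the commonly cited form, MCD = (10 / ln 10) · sqrt(2 · Σ_d (c_d − c'_d)²), averaged over frames. The function name, the choice to skip the 0th (energy) coefficient, and the assumption that the two sequences are already time-aligned are illustrative and are not taken from the cited paper or from sprocket.

```python
import math


def mel_cepstral_distortion(ref, conv):
    """Mean MCD in dB over frames that are assumed to be time-aligned.

    ref, conv: equal-length lists of mel-cepstrum frames (lists of floats).
    The 0th coefficient (energy) is skipped, a common but not universal choice.
    """
    if not ref or len(ref) != len(conv):
        raise ValueError("sequences must be non-empty and have the same length")
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for r, c in zip(ref, conv):
        # Squared Euclidean distance over coefficients 1..D for this frame
        sq = sum((a - b) ** 2 for a, b in zip(r[1:], c[1:]))
        total += scale * math.sqrt(sq)
    return total / len(ref)
```

Lower values mean the converted spectrum is closer to the reference; identical sequences give 0 dB.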
“…The GMM is trained using joint vectors of X_i and Y_i in the parallel data set, which have been automatically aligned to each other by dynamic time warping (DTW) [4]. The detailed steps can be found in References [2,25]. DIFFGMM is a differential Gaussian mixture model.…”
Section: GMM-based VC (mentioning)
confidence: 99%
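
To make the alignment step in the statement above concrete, here is a minimal sketch of building joint source/target vectors with textbook DTW. It assumes a Euclidean frame distance and the basic three-way step pattern; the function names `dtw_path` and `joint_vectors` are hypothetical, and this is not the procedure used in the cited papers or in sprocket, whose details are in the references the statement cites.

```python
import math


def dtw_path(src, tgt):
    """Return a minimum-cost DTW alignment as a list of (i, j) frame index pairs."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j]: best accumulated distance aligning src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(src[i - 1], tgt[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the last frame pair to the first
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def joint_vectors(src, tgt):
    """Concatenate each aligned source frame X_i with its target frame Y_i."""
    return [list(src[i]) + list(tgt[j]) for i, j in dtw_path(src, tgt)]
```

The resulting joint vectors are the kind of input on which a joint-density GMM would then be fitted; the fitting itself is outside this sketch.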
“…For a more detailed conversion process of VC and DIFFGMM and the training process of parallel VC based on GMM, please refer to Figures 1 and 2 in Kobayashi and Toda [25].…”
Section: GMM-based VC (mentioning)
confidence: 99%
“…In the internal subjective evaluation, two preference tests (naturalness and speaker similarity) were conducted to compare the performance of the NU VC system with that of the baseline system, i.e., sprocket [27]. All 16 speaker pair models for the four source and four target speakers were used in the evaluation.…”
Section: Internal Subjective Evaluation (mentioning)
confidence: 99%