2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2011
DOI: 10.1109/icassp.2011.5947404

An optimization algorithm of independent mean and variance parameter tying structures for HMM-based speech synthesis

Abstract: This paper proposes a technique for constructing independent parameter tying structures of mean and variance in HMM-based speech synthesis. Conventionally, mean and variance parameters are assumed to have the same tying structure. However, it has been reported that a clustering technique of mean vectors while tying all variance matrices improves the quality of synthesized speech. This indicates that mean and variance parameters should have different optimal tying structures. In the proposed technique, the decis…

citations
Cited by 2 publications
(2 citation statements)
references
References 9 publications
(7 reference statements)
“…The MOS for natural speech was about 4.7, so none of the systems achieved similar naturalness to the human voice. In addition, speech synthesized by Japanese methods developed within the past two years was evaluated [10], [11]. In these tests, none of the systems achieved similar naturalness to the human voice.…”
Section: Naturalness Of Conventional (citation type: mentioning)
confidence: 99%
“…This method has several advantages as it is easy to use for voice conversion, has good performance with small speech databases, and does not require a high-performance Central Processing Unit (CPU) or large memory. However, the naturalness of the speech synthesized using this method is not so high (Zen, 2008; Takaki, 2011; Nose, 2013).…”
Section: Introduction (citation type: mentioning)
confidence: 99%