2010
DOI: 10.3844/jcssp.2010.341.349

Comparative Evaluation of Phone Duration Models for Greek Emotional Speech

Abstract: Problem statement: This study addresses phone duration modeling for Greek emotional speech synthesis. Approach: Various well-established machine learning techniques are applied to an emotional speech database covering five archetypal emotions. The phone duration prediction models are built on phonetic, morphosyntactic and prosodic features that can be extracted from text alone. We employ model and regression trees, linear regression, lazy learning algorithms an…

citations
Cited by 5 publications
(2 citation statements)
references
References 21 publications
“…The biggest decrease of RMSE in percentages was obtained for the full phoneme set, whereas the percentage of RMSE decrease for consonants is the smallest.
20.30 0.79
Greek [9] 26.40 0.54
Greek [10] 27.20 0.63
Greek vowels [11] 26.04 -
Greek consonants [11] 29.13 -
Lithuanian vowels [12] 18.30 0.80
Lithuanian consonants [12] 16.70 0.75
Serbo-Croatian [13] 15.85 0.91
Korean [14] 22.00 0.82
Turkish [15] 20.04 0.78
Hindi [16] 27.14 0.75
Telugu [16] 22.86 0.80…”
Section: Results (mentioning)
confidence: 99%
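
The citation statement above compares models by RMSE and by the percentage decrease in RMSE. As a minimal sketch of how those two quantities are commonly computed, the following uses hypothetical duration values in milliseconds; the numbers, the constant baseline, and the function names are illustrative assumptions and are not taken from the cited papers.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predicted and observed durations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_decrease_percent(baseline_rmse, model_rmse):
    """Relative RMSE reduction of a model over a baseline, in percent."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical phone durations in milliseconds (illustrative only).
observed = [80.0, 120.0, 60.0, 100.0]
baseline = [90.0, 90.0, 90.0, 90.0]   # assumed baseline: one constant duration
model = [85.0, 110.0, 70.0, 95.0]     # assumed output of a trained model

baseline_rmse = rmse(baseline, observed)
model_rmse = rmse(model, observed)
decrease = rmse_decrease_percent(baseline_rmse, model_rmse)
print(f"baseline RMSE: {baseline_rmse:.2f} ms")
print(f"model RMSE:    {model_rmse:.2f} ms")
print(f"RMSE decrease: {decrease:.1f}%")
```

A positive percentage means the model predicts durations more accurately than the baseline; the choice of baseline is an assumption here and varies between studies.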
“…These models require a large corpus of spoken language, because the modeling is done by applying a machine learning algorithm to large corpora. Various machine learning approaches have been applied to phone duration modeling, such as artificial neural networks [3,4], decision trees [5][6][7][8][9][10], Bayesian models [11], and instance-based algorithms [12].…”
Section: Introduction (mentioning)
confidence: 99%