Prosodic Phrases and Semantic Accents in Speech Corpus for Czech TTS Synthesis

Romportl, Jan

doi:10.1007/978-3-540-87391-4_63

Cited by 8 publications

(3 citation statements)

References 4 publications

“…Emotional TTS synthesis will be evaluated separately from the whole dialogue system by listening tests. These listening tests will follow the scheme we have developed for prosodic phrase and semantic accent annotation, including statistical modeling of the results using the maximum likelihood approach [8]. The overall naturalness of audiovisual experience resulting from the TTS and avatar activity can be measured only indirectly by intersubjective assessments of testing users from among seniors.…”

Section: Discussion (mentioning)

confidence: 99%

Audiovisual interface for Czech spoken dialogue system

Ircing

Romportl

Loose

2010

IEEE 10th INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS

Self Cite

Abstract: Our paper introduces implementation details of an application that serves as an audiovisual interface to an automatic dialogue system. It comprises a state-of-the-art large-vocabulary continuous speech recognition engine and a TTS system coupled with an embodied avatar that can, to some extent, convey a range of emotions to the user. The interface was originally designed for a dialogue system that allows elderly users to reminisce about their photographs. However, the modular architecture of the whole system and the flexibility of the messages used for communication between the modules facilitate a seamless transition of the application to any dialogue domain.

“…It is an asymptotically consistent, asymptotically normal and asymptotically efficient estimate. This approach was also successfully used in other works regarding speech synthesis research, see [46].…”

Section: Objective Annotation (mentioning)

confidence: 97%

Dialogue act based expressive speech synthesis in limited domain for the Czech language

Grůber

Matoušek

Hanzlíček

et al. 2020

IJCAI

This paper deals with expressive speech synthesis in a dialogue. Dialogue acts (discrete expressive categories) are used to describe expressivity. The aim of the work is to create a procedure for developing expressive speech synthesis for a dialogue system in a limited domain. The domain is limited here to dialogues between a human and a computer on the topic of reminiscing about personal photographs. To incorporate expressivity into synthetic speech, the current algorithms used for neutral speech synthesis are modified. An expressive speech corpus is recorded and annotated using a predefined set of dialogue acts, and its acoustic analysis is performed. Unit selection and HMM-based methods are used to synthesize expressive speech, and an evaluation using listening tests is presented. The listeners assess two basic aspects of synthetic expressive speech for isolated utterances: speech quality and expressivity perception. The evaluation is also performed for utterances in a dialogue to assess the appropriateness of synthetic expressive speech. It can be concluded that synthetic expressive speech is rated positively even though its quality is worse than that of neutral speech synthesis. However, synthetic expressive speech is able to transmit expressivity to listeners and to improve the naturalness of the synthetic speech. Summary: A method for expressive speech synthesis in Czech has been developed.

“…It is an asymptotically consistent, asymptotically normal and asymptotically efficient estimate. We have also successfully used this approach in recent works regarding speech synthesis research, see [8].…”

Section: Objective Annotation (mentioning)

confidence: 99%

Listening-Test-Based Annotation of Communicative Functions for Expressive Speech Synthesis

Grůber

Matoušek

2010

Text, Speech and Dialogue

Abstract. This paper focuses on the evaluation of a listening test that was carried out in order to objectively annotate expressive speech recordings and to further develop a limited-domain expressive speech synthesis system. There are two main issues to face in this task. The first is that expressivity in speech has to be defined in some way. The second is that the perception of expressive speech is subjective. However, for the purposes of expressive speech synthesis using unit selection algorithms, the expressive speech corpus has to be objectively and unambiguously annotated. First, a classification of expressivity was determined using communicative functions. These are supposed to describe the type of expressivity and/or the speaker's attitude. Further, to achieve objectivity at a significant level, a listening test with a relatively high number of listeners was carried out. The listeners were asked to mark sentences in the corpus using communicative functions. The aim of the test was to acquire a sufficient number of subjective annotations of the expressive recordings so that an "objective" annotation could be created. There are several methods for obtaining an objective evaluation from many subjective ones; two of them are presented.
