Interspeech 2016
DOI: 10.21437/interspeech.2016-875

Sound Pattern Matching for Automatic Prosodic Event Detection

Abstract: Prosody in speech is manifested by variations of loudness, exaggeration of pitch, and specific phonetic variations of prosodic segments. For example, stressed and unstressed syllables differ in place or manner of articulation, vowels in unstressed syllables may have a more central articulation, and vowel reduction may occur when a vowel changes from a stressed to an unstressed position. In this paper, we characterize the sound patterns using phonological posteriors to capture the phonetic…
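
The abstract is truncated, but its core idea is that frame-level phonological posteriors can serve as discrete sound patterns. Below is a minimal sketch of that representation, assuming an illustrative class inventory and a simple 0.5 threshold (both are assumptions for this sketch, not details taken from the paper):

```python
import numpy as np

# Illustrative phonological class inventory (an assumption for this sketch;
# the paper's actual inventory may differ).
PHON_CLASSES = ["vocalic", "consonantal", "high", "back", "low", "anterior",
                "coronal", "round", "tense", "voice", "continuant", "nasal"]

def binarize_posteriors(posteriors, threshold=0.5):
    """Turn frame-level phonological posteriors (T x K, values in [0, 1])
    into binary sound patterns by thresholding each class probability."""
    return (np.asarray(posteriors) >= threshold).astype(int)

# Two frames of synthetic posteriors, yielding one binary pattern per frame.
posteriors = np.random.rand(2, len(PHON_CLASSES))
print(binarize_posteriors(posteriors))
```

Binarizing in this way turns each frame into a bit pattern that can be compared across prosodic contexts such as stressed and unstressed syllables.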

Cited by 9 publications (6 citation statements). References 17 publications.
“…We use our open-source phonological vocoding platform [16] to perform phonological analysis and synthesis. Briefly, the platform is based on cascaded speech analysis and synthesis that works internally with the phonological speech representation.…”
Section: Methods
confidence: 99%
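
The citing authors describe a cascaded analysis-synthesis design. The sketch below is only a hypothetical stand-in for that cascade (random projections instead of trained models, invented function names), meant to show where the phonological representation sits in the pipeline; it is not the actual platform API.

```python
import numpy as np

N_PHON_CLASSES = 12   # assumed size of the phonological class inventory
N_SPEECH_PARAMS = 40  # assumed dimensionality of the synthesis parameters

def phonological_analysis(frames):
    """Hypothetical analysis stage: map speech frames (T x D) to phonological
    posteriors (T x K). A real system would use a trained network here."""
    w = np.random.rand(frames.shape[1], N_PHON_CLASSES)
    return 1.0 / (1.0 + np.exp(-(frames @ w)))  # pseudo-posteriors in (0, 1)

def phonological_synthesis(posteriors):
    """Hypothetical synthesis stage: map posteriors (T x K) back to speech
    parameters (T x P) that a vocoder could render as a waveform."""
    w = np.random.rand(posteriors.shape[1], N_SPEECH_PARAMS)
    return posteriors @ w

# Cascade: speech features -> phonological representation -> speech parameters.
frames = np.random.rand(100, 39)
params = phonological_synthesis(phonological_analysis(frames))
print(params.shape)  # (100, 40)
```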
“…The Jaccard distance and simple matching coefficient (the latter is also known as the Rand similarity coefficient) are two types of distance measures that have been used to compare similarity and diversity, including patterns in DNA (Deagle et al, 2017), images (Devereux et al, 2013), and voices (Cernak et al, 2016). The spatial frequency FFT has been used to filter noisy or complex patterns in complex image sets (Petrou & Petrou, 2011).…”
Section: Shapes
confidence: 99%
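
Both measures named in this statement operate on binary vectors, such as binarized phonological posteriors. A small self-contained sketch (the example patterns are made up):

```python
import numpy as np

def simple_matching_coefficient(a, b):
    """Fraction of positions where two binary vectors agree
    (also known as the Rand similarity coefficient)."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    return np.mean(a == b)

def jaccard_distance(a, b):
    """1 - |intersection| / |union| of the positions set to 1.
    Unlike the SMC, shared zeros do not count as agreement."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    union = np.sum(a | b)
    if union == 0:
        return 0.0  # convention: two all-zero patterns are identical
    return 1.0 - np.sum(a & b) / union

# Two illustrative binary phonological patterns.
p1 = [1, 0, 1, 1, 0, 0, 1, 0]
p2 = [1, 0, 0, 1, 0, 0, 1, 1]
print(simple_matching_coefficient(p1, p2))  # 0.75
print(jaccard_distance(p1, p2))             # 0.4
```

The design difference is that the simple matching coefficient counts shared zeros as agreement, while the Jaccard distance only considers positions where at least one pattern is active.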
“…Using an emphasis detection module combined with ASR-based automatic time alignment, it is possible to identify which word is emphasised in a sentence and its boundaries (we do not tackle this problem in this work; it can be solved using different methods, e.g. [11,12]). In our previous work [10], given parallel data including neutral and emphasised speech, by retrieving the parameters of our model for both cases, we showed that adding the most prominent atoms from an emphatic word in a neutral sentence consistently increased the perception of emphasis on the target word.…”
Section: Application Of GCR Model To Emphasis Transfer
confidence: 99%
“…In the more general framework of translating emphasis in S2ST, some emphasis detection system (e.g. the recent work of Cernak and colleagues [11,12]) can be used to provide the machine translation additional information, which can further be transmitted to the TTS system. For this study, we restrict ourselves to the intra-lingual case, but due to the language independence of the intonation model used, it seems reasonable to assume that this method can work for any given language.…”
Section: Introduction
confidence: 99%