2017
DOI: 10.1017/cnj.2017.15

Low-level articulatory synthesis: A working text-to-speech solution and a linguistic tool

Abstract: A complete text-to-speech system has been created by the authors, based on a tube resonance model of the vocal tract and a development of Carré’s “Distinctive Region Model”, which is in turn based on the formant-sensitivity findings of Fant and Pauli (1974), to control the tube. In order to achieve this goal, significant long-term linguistic research has been involved, including rhythm and intonation studies, as well as the development of low-level articulatory data and rules to drive the model, together with …
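
The paper's Tube Resonance Model code is not reproduced in this report, but the general idea behind this family of synthesizers can be sketched compactly. Below is a minimal, illustrative Kelly-Lochbaum-style waveguide: the vocal tract is approximated as a cascade of uniform tube sections, and reflection coefficients computed from adjacent cross-sectional areas shape the formant structure. This is a generic textbook construction, not the authors' TRM; the area values and boundary reflection coefficients are assumptions chosen for illustration.

```python
import numpy as np

def kelly_lochbaum(areas, source, k_glottis=0.9, k_lips=-0.85):
    """Toy waveguide vocal tract: a cascade of uniform tube sections.

    areas  : cross-sectional areas, glottis -> lips (arbitrary units)
    source : glottal excitation samples
    Boundary reflection coefficients are illustrative guesses.
    Returns the pressure signal radiated past the lip end.
    """
    n = len(areas)
    # pressure-wave reflection coefficient at each internal junction
    k = [(areas[i] - areas[i + 1]) / (areas[i] + areas[i + 1])
         for i in range(n - 1)]
    fwd = np.zeros(n)   # right-going wave arriving at each junction
    bwd = np.zeros(n)   # left-going wave arriving at each junction
    out = np.zeros(len(source))
    for t, x in enumerate(source):
        new_fwd, new_bwd = np.zeros(n), np.zeros(n)
        # glottal end: inject source plus partial reflection of returning wave
        new_fwd[0] = x + k_glottis * bwd[0]
        # Kelly-Lochbaum scattering at the internal junctions
        for i in range(n - 1):
            new_fwd[i + 1] = (1 + k[i]) * fwd[i] - k[i] * bwd[i + 1]
            new_bwd[i] = k[i] * fwd[i] + (1 - k[i]) * bwd[i + 1]
        # lip end: partial reflection back in, remainder radiates out
        new_bwd[n - 1] = k_lips * fwd[n - 1]
        out[t] = (1 + k_lips) * fwd[n - 1]
        fwd, bwd = new_fwd, new_bwd
    return out

# rough /a/-like area function (narrow pharynx, wide mouth), 110 Hz pulse train
areas = [0.6, 0.8, 1.0, 1.5, 2.5, 3.5, 4.5, 5.0]
fs = 8000
source = np.zeros(fs)
source[:: fs // 110] = 1.0
audio = kelly_lochbaum(areas, source)
```

With eight one-sample sections the toy tube resonates at odd multiples of fs/(4n); a real articulatory synthesizer varies the area function continuously over time to move those resonances.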

Cited by 9 publications (7 citation statements: 0 supporting, 7 mentioning, 0 contrasting)
References 51 publications (67 reference statements)
“…Curiously, it becomes no less challenging as we attempt to break spoken language usage into its smallest bits. For instance, even at the fine scale of phonemes pairing individual vowels with individual consonants, speech modeling continues to the present day to struggle with “coarticulation” (Hill et al, 2017 ), i.e., that the physiological production of each phoneme changes depending on its context in the speech stream. Human listeners manage to compensate for coarticulation to the point of responding systematically to audible changes with context—even without explicitly noticing the context effects and even, instead, with having the experience of stable phonemic recognition (Zamuner et al, 2016 ; Viswanathan and Kelty-Stephen, 2018 ).…”
Section: Introduction (mentioning)
confidence: 99%
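
The coarticulation problem this statement refers to can be made concrete with a toy example. In a locus-style account, a vowel's formant trajectory begins part-way toward the preceding consonant's characteristic locus, so the same vowel target is realized differently after /b/ than after /d/. The sketch below is a generic illustration with assumed frequency values; it is not the rule system of Hill et al.

```python
import numpy as np

VOWEL_F2 = {"a": 1220.0}            # steady-state F2 target (Hz), illustrative
CONSONANT_F2_LOCUS = {"b": 720.0,   # labial locus pulls F2 down
                      "d": 1800.0}  # alveolar locus pulls F2 up

def f2_trajectory(consonant, vowel, n=50, blend=0.6):
    """F2 over the vowel: exponential glide from a CV-boundary value
    (part-way toward the consonant's locus) to the vowel target."""
    target = VOWEL_F2[vowel]
    onset = blend * CONSONANT_F2_LOCUS[consonant] + (1 - blend) * target
    t = np.linspace(0.0, 1.0, n)
    return target + (onset - target) * np.exp(-5.0 * t)

print(f2_trajectory("b", "a")[:3])  # starts low, near 920 Hz
print(f2_trajectory("d", "a")[:3])  # starts high, near 1570 Hz
```

The same /a/ target yields two different acoustic trajectories; this context dependence is exactly what listeners compensate for without noticing.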
“…Both speech production and language perception exemplify just such nonlinear interactions across timescales. Longer-term speech sequences reshape the brief articulation of a single syllable [5]. Communicative sounds exhibit a range of fractional power law exponents [44].…”
Section: Multifractality To Portray Cross-scale Interactions Blending Fleeting Behaviours With Longer-term Behaviours (mentioning)
confidence: 99%
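
The "fractional power law exponents" mentioned here are typically estimated with detrended fluctuation analysis (DFA). Below is a minimal monofractal DFA sketch; the multifractal analyses in these citing papers generalize this idea with q-order moments, and the scale choices here are assumptions for illustration.

```python
import numpy as np

def dfa_exponent(x, scales=None):
    """Detrended fluctuation analysis: estimate the power-law scaling
    exponent alpha of a series (alpha ~ 0.5 for white noise,
    alpha ~ 1.0 for 1/f noise). Assumes len(x) of at least ~512."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())            # integrated profile
    if scales is None:
        scales = np.unique(
            np.logspace(2, np.log10(len(x) // 4), 12).astype(int))
    flucts = []
    for s in scales:
        n_seg = len(y) // s
        segs = y[:n_seg * s].reshape(n_seg, s)
        t = np.arange(s)
        # detrend each window with a least-squares line, keep RMS residual
        rms = [np.sqrt(np.mean((seg - np.polyval(np.polyfit(t, seg, 1), t)) ** 2))
               for seg in segs]
        flucts.append(np.mean(rms))
    # slope of log fluctuation vs. log scale is the exponent
    return np.polyfit(np.log(scales), np.log(flucts), 1)[0]

rng = np.random.default_rng(0)
print(dfa_exponent(rng.standard_normal(8192)))  # ~0.5 for white noise
```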
“…Motor processes reshape phonemes in context- and sequence-dependent ways (e.g. in coarticulation) that text-to-speech synthesis has struggled to emulate [5]. Speech sounds reflect movements beyond articulators to distal parts of the body [6]—potentially explaining their utility for diagnosing Parkinson's disease [7].…”
Section: Introduction (mentioning)
confidence: 99%
“…Speech perception critically depends on the articulatory gestures shaping those acoustic features [4][5][6]. Human motor processes reshape phonemes in a context- and sequence-dependent way (e.g., in coarticulation) that text-to-speech synthesis has struggled to emulate [7]. Speech sounds reflect movements beyond articulators to distal parts of the body [8], potentially explaining their utility for diagnosing Parkinson's disease [9].…”
Section: Introduction (mentioning)
confidence: 99%
“…Multifractal nonlinearity t_MF may predict expressiveness [44]. Longer-term speech sequences reshape the brief articulation of a single syllable [7]. Speech perception adjusts to compensate for coarticulatory interaction across time [4] and tailors these compensations over longer timescales [45].…”
Section: Introduction (mentioning)
confidence: 99%
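
The t_MF statistic named in this last statement is, in the cited literature, a one-sample t-statistic comparing a series' multifractal-spectrum width against a distribution of surrogate series. The sketch below keeps only the shape of that recipe: a crude width proxy and shuffled surrogates stand in for the full multifractal spectrum and phase-preserving (IAAFT) surrogates, so the specific statistic, window count, and surrogate scheme are all assumptions.

```python
import numpy as np

def mf_width_proxy(x):
    """Crude proxy for multifractal-spectrum width: the spread of local RMS
    roughness across windows (the real t_MF uses the full MF spectrum)."""
    windows = np.array_split(np.asarray(x, dtype=float), 64)
    local = np.array([np.sqrt(np.mean((w - w.mean()) ** 2)) for w in windows])
    return local.std() / local.mean()

def nonlinearity_score(x, n_surr=40, seed=0):
    """z-like score of the observed statistic against shuffled surrogates;
    multifractal studies apply the same recipe with IAAFT surrogates and
    call the resulting one-sample t-statistic t_MF."""
    rng = np.random.default_rng(seed)
    obs = mf_width_proxy(x)
    surr = np.array([mf_width_proxy(rng.permutation(x))
                     for _ in range(n_surr)])
    return (obs - surr.mean()) / surr.std(ddof=1)

# series with clustered volatility scores well above 0; shuffling it erases
# the cross-scale structure, so surrogates score near 0
rng = np.random.default_rng(1)
x = rng.standard_normal(8192) * np.repeat(rng.lognormal(size=64), 128)
print(nonlinearity_score(x))
```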