1995
DOI: 10.1037/0033-295x.102.3.594

Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production.

Abstract: This article describes a neural network model of speech motor skill acquisition and speech production that explains a wide range of data on variability, motor equivalence, coarticulation, and rate effects. Model parameters are learned during a babbling phase. To explain how infants learn language-specific variability limits, speech sound targets take the form of convex regions, rather than points, in orosensory coordinates. Reducing target size for better accuracy during slower speech leads to differential eff…

Cited by 379 publications (374 citation statements)
References 109 publications
“…Two examples will suffice to indicate some of the issues. First, it is well known in speech production that targets are better treated as regions than as points, and Guenther (1995) recently incorporated ideas from the VITE and DIRECT models in his DIVA (Directions Into Velocities of Articulators) model, which incorporates such regions as a basic postulate. It may be that the emergence of such regions can be attributed to the use of learned TTC thresholds to terminate approaches to points near the center of such regions.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…They also offered an explanation for kinematic properties of reaching movements that included the bell shaped velocity profile, Woodworth's law, Fitts's law and variations in velocity profile symmetry as a function of movement duration. The model has been extended in the form of the DIRECT and DIVA models to incorporate inverse differential transformations between task space and motor coordinates for both reaching (Fiala, 1994, 1995) and speech production (Guenther, 1992; Guenther, 1994, 1995). These transformations were shown to enable motor equivalence and to be learnable with the help of perceptual information generated by action.…”
Section: Synchronous Trajectory Generation by VITE
Citation type: mentioning
confidence: 99%
“…Such a mapping has been termed a "forward model" by Jordan (1990) and has been used in different capacities in adaptive models of speech production (e.g., Bailly et al., 1991; Guenther, 1994, 1995a) and other motor tasks such as reaching (e.g., Bullock, Grossberg, and Guenther, 1993; Jordan, 1990). A typical neural network construct for learning a forward model is illustrated in Figure 3.…”
Section: February 11, 1997
Citation type: mentioning
confidence: 99%
“…This replaces the constriction-based planning frame used in earlier versions of the DIVA model (Guenther, 1994, 1995a). The Planning Position Vector in the model represents the current state of the vocal tract within the auditory perceptual reference frame.…”
Section: Introduction: Reference Frames and the Targets of Speech Pro…
Citation type: mentioning
confidence: 99%
“…The model is implemented in computer simulations that control an articulatory synthesizer (Maeda, 1990) in order to produce an acoustic signal. The articulator movements and acoustic signal produced by the model can be compared to the productions of human speakers; the results of many such comparisons are described elsewhere (Callan, Kent, Guenther, & Vorperian, 2000; Guenther, 1995; Guenther, Hampson, & Johnson, 1998; Guenther et al, 1999; Nieto-Castanon, Guenther, Perkell, & Curtin, 2005; Perkell et al, 2004a,b).…”
Section: Introduction
Citation type: mentioning
confidence: 99%