2020
DOI: 10.3389/frai.2020.00044

Generative Adversarial Phonology: Modeling Unsupervised Phonetic and Phonological Learning With Neural Networks

Abstract: Training deep neural networks on well-understood dependencies in speech data can provide new insights into how they learn internal representations. This paper argues that acquisition of speech can be modeled as a dependency between random space and generated speech data in the Generative Adversarial Network architecture and proposes a methodology to uncover the network's internal representations that correspond to phonetic and phonological properties. The Generative Adversarial architecture is uniquely appropriate…

Cited by 25 publications (113 citation statements). References 98 publications.
“…The result of the training in the architecture outlined in Figure 1 is a Generator network that outputs raw acoustic data that resemble real data from the TIMIT database, such that the Discriminator becomes unsuccessful in assigning "realness" scores (Brownlee, 2019). Crucially, unlike in other architectures, the Generator's outputs are never a full replication of the input: the Generator outputs innovative data that resemble input data, but also violate many of the distributions in a linguistically interpretable manner (Beguš, 2020). In addition to outputting innovative data that resemble speech in the input, the Generator also learns to associate each lexical item with a unique code in its latent space.…”
Section: Model (mentioning)
Confidence: 99%
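The adversarial setup described in this excerpt can be made concrete with a short sketch. The following is a minimal illustration, assuming PyTorch; the fully connected layers, sizes, and names are simplified placeholders, not the WaveGAN-style convolutional model used in the paper. A Generator maps a latent vector to a raw waveform, a Discriminator assigns it a realness score, and the two are trained against each other.

# Minimal GAN training step on raw audio (illustrative sketch, not the
# paper's implementation). Shapes and hyperparameters are assumptions.

import torch
import torch.nn as nn

LATENT_DIM = 100    # dimensionality of the latent space z
AUDIO_LEN = 16384   # samples per training clip (~1 s at 16 kHz)

generator = nn.Sequential(          # z -> raw waveform in [-1, 1]
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, AUDIO_LEN), nn.Tanh(),
)
discriminator = nn.Sequential(      # waveform -> realness score (logit)
    nn.Linear(AUDIO_LEN, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch):
    """One adversarial update; real_batch: (B, AUDIO_LEN) audio in [-1, 1]."""
    b = real_batch.size(0)
    z = torch.randn(b, LATENT_DIM)
    fake = generator(z)

    # Discriminator: score real clips as real, generated clips as fake.
    opt_d.zero_grad()
    loss_d = (bce(discriminator(real_batch), torch.ones(b, 1))
              + bce(discriminator(fake.detach()), torch.zeros(b, 1)))
    loss_d.backward()
    opt_d.step()

    # Generator: update so the Discriminator scores its outputs as real.
    opt_g.zero_grad()
    loss_g = bce(discriminator(fake), torch.ones(b, 1))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()

As training converges, the Discriminator's scores for real and generated clips become indistinguishable, which is the sense in which it "becomes unsuccessful in assigning realness scores" in the excerpt above.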
“…Language acquisition has, to the author's knowledge, not been modeled with the GAN architecture prior to Beguš (2020), despite several aspects of the architecture that can be paralleled to language acquisition. Beguš (2020) proposes that phonetic and phonological learning can simultaneously be modeled as a dependency between latent space and output data in Deep Convolutional Generative Adversarial Networks (Goodfellow et al., 2014; Radford et al., 2015; Donahue et al., 2019). Unlike in the autoencoder architectures, the outputs of the GAN models are innovative, not directly connected to the inputs, and violate training data distributions in highly informative ways.…”
Section: Introduction (mentioning)
Confidence: 99%
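The "dependency between latent space and output data" can be probed by holding a latent code fixed while sweeping a single dimension and inspecting how the generated acoustics change. Below is a minimal sketch under the same PyTorch assumption as above; the function name, dimension index, and sweep range are hypothetical choices for illustration, not the paper's exact procedure.

# Sweep one latent dimension while holding the rest fixed, then compare
# the Generator's outputs. Dimension index and values are illustrative.

import torch

@torch.no_grad()
def sweep_latent_dim(generator, dim, values, latent_dim=100, seed=0):
    """Generate one waveform per value of latent dimension `dim`."""
    torch.manual_seed(seed)
    z = torch.randn(1, latent_dim).repeat(len(values), 1)   # shared base code
    z[:, dim] = torch.as_tensor(values, dtype=torch.float32)  # vary one dim
    return generator(z)   # (len(values), AUDIO_LEN) waveforms

# e.g. outputs = sweep_latent_dim(generator, dim=5, values=[-4.0, -2.0, 0.0, 2.0, 4.0])
# Acoustic analysis of the resulting waveforms (duration, VOT, frication
# noise, etc.) then indicates whether that dimension corresponds to a
# phonetic or phonological property.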