2020
DOI: 10.1162/neco_a_01264
|View full text |Cite
|
Sign up to set email alerts
|

Evaluating the Potential Gain of Auditory and Audiovisual Speech-Predictive Coding Using Deep Learning

Abstract: Sensory processing is increasingly conceived in a predictive framework in which neurons would constantly process the error signal resulting from the comparison of expected and observed stimuli. Surprisingly, few data exist on the accuracy of predictions that can be computed in real sensory scenes. Here, we focus on the sensory processing of auditory and audiovisual speech. We propose a set of computational models based on artificial neural networks (mixing deep feedforward and convolutional networks), which ar… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1

Citation Types

0
3
0

Year Published

2021
2021
2024
2024

Publication Types

Select...
2
1

Relationship

0
3

Authors

Journals

citations
Cited by 3 publications
(3 citation statements)
references
References 81 publications
0
3
0
Order By: Relevance
“…In this first experiment, we ask whether it is possible to induce 'innate' speech sound discrimination capabilities in our model and how the resulting initial state affects its developmental trajectory. Following Lavechin et al (2024), we chose a learning algorithm relying on auditory predictive coding at the core of the predictive brain hypothesis that has gained attention in the neuroscience community (Huang & Rao, 2011;Hueber et al, 2020). The algorithm learns by predicting future representations of audio based on present and past ones (see Methods).…”
Section: Experiments 1: Inducing Initial Speech Sound Discrimination ...mentioning
confidence: 99%
“…In this first experiment, we ask whether it is possible to induce 'innate' speech sound discrimination capabilities in our model and how the resulting initial state affects its developmental trajectory. Following Lavechin et al (2024), we chose a learning algorithm relying on auditory predictive coding at the core of the predictive brain hypothesis that has gained attention in the neuroscience community (Huang & Rao, 2011;Hueber et al, 2020). The algorithm learns by predicting future representations of audio based on present and past ones (see Methods).…”
Section: Experiments 1: Inducing Initial Speech Sound Discrimination ...mentioning
confidence: 99%
“…Our learner model consists of two main components: 1) an acoustic model that incorporates a Contrastive Predictive Coding (CPC) algorithm followed by a K-means algorithm, which is in charge of learning discrete representations of the audio; and 2) a language model made of Long Short-Term Memory (LSTM) layers trained on the learned discrete units. The primary learning objective is to predict future audio observations from present and past ones, a process known as auditory predictive coding at the core of the predictive brain hypothesis that has attracted the attention of the neuroscience community (Barlow et al, 1961;Keller & Mrsic-Flogel, 2018;Hueber et al, 2020). As a statistical learning algorithm, the proposed model can provide us with insights about the aspects of language that can be acquired through statistical learning mechanisms applied to raw speech.…”
Section: Proposed Approachmentioning
confidence: 99%
“…Predictive coding is a technique to improve compression performance through statistical redundancy. Based on the previously encoded pixel values, the encoder can estimate and predict the pixel values to be encoded and decoded [26][27][28]. For a large number of static or slowly varying regions in the sequence image, the conditional patching method can be used to store the first frame image in the reference frame and send it to the other party.…”
Section: Predictive Codingmentioning
confidence: 99%