2022
DOI: 10.48550/arxiv.2205.14252
Preprint

Self-supervised models of audio effectively explain human cortical responses to speech

Abstract: Self-supervised language models are very effective at predicting high-level cortical responses during language comprehension. However, the best current models of lower-level auditory processing in the human brain rely on either hand-constructed acoustic filters or representations from supervised audio neural networks. In this work, we capitalize on the progress of self-supervised speech representation learning (SSL) to create new state-of-the-art models of the human auditory system. Compared against acoustic ba…


Cited by 2 publications (7 citation statements)
References 35 publications
“…However, the overall variance explained was very low. Similarly, Vaidya et al. 29 demonstrated that certain self-supervised speech models capture distinct stages of speech processing. Our results complement these findings in showing that they apply to a large set of models and to responses to natural sounds more generally.…”
Section: Discussion
Confidence: 95%
“…It is nonetheless conceivable (and perhaps likely) that fully accurate models will require learning algorithms that more closely resemble the optimization processes of biology, in which nested loops of evolutionary selection and (largely unsupervised) learning over development combine to produce systems that can perform a wide range of tasks with breathtaking accuracy and efficiency. Some steps in this direction can be found in recent models that are optimized without labeled training data 27, 29, 86, 87. Our model set contained one such contrastive self-supervised model (Wav2vec2), and although its brain predictions were worse than those of most of the supervised models, this direction clearly merits extensive exploration.…”
Section: Discussion
Confidence: 99%