2022 · Preprint
DOI: 10.1101/2022.10.04.510681

Artificial neural network language models predict human brain responses to language even after a developmentally realistic amount of training

Abstract: Artificial neural networks have emerged as computationally plausible models of human language processing. A major criticism of these models is that the amount of training data they receive far exceeds that of humans during language learning. Here, we use two complementary approaches to ask how the models' ability to capture human neural and behavioral responses to language is affected by the amount of training data. First, we evaluate GPT-2 models trained on 1 million, 10 million, 100 million, or 1 billion tokens…
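
For readers who want a concrete picture of the evaluation approach described in the abstract, the sketch below shows one plausible way to relate GPT-2 representations to fMRI responses: extract sentence embeddings from a pretrained GPT-2 and fit a cross-validated ridge-regression encoding model that predicts per-voxel responses. The model name, layer choice, pooling, and regularization grid are illustrative assumptions, not the authors' exact pipeline.

```python
# Illustrative sketch, not the authors' exact pipeline:
# GPT-2 sentence embeddings -> ridge-regression encoding model -> per-voxel accuracy.
import numpy as np
import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

def sentence_embedding(sentence, layer=9):
    """Mean-pool one GPT-2 layer's hidden states over the sentence's tokens."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states  # tuple of (1, n_tokens, 768) tensors
    return hidden_states[layer].mean(dim=1).squeeze(0).numpy()

def fit_encoding_model(sentences, fmri_responses):
    """sentences: list of stimulus sentences; fmri_responses: (n_sentences, n_voxels) array."""
    X = np.stack([sentence_embedding(s) for s in sentences])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, fmri_responses, test_size=0.2, random_state=0
    )
    enc = RidgeCV(alphas=np.logspace(-1, 4, 10)).fit(X_tr, y_tr)
    pred = enc.predict(X_te)
    # Per-voxel Pearson correlation between predicted and observed held-out responses
    r = np.array([np.corrcoef(pred[:, v], y_te[:, v])[0, 1] for v in range(y_te.shape[1])])
    return enc, r
```

To compare models trained on different amounts of data, the same procedure could be repeated with checkpoints trained on 1 million, 10 million, 100 million, and 1 billion tokens, and the resulting per-voxel correlations compared across checkpoints.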

Cited by 18 publications (29 citation statements) · References 78 publications

Citation statements:
“…A number of studies have now shown that representations extracted from neural network models of language can capture neural responses to language in human brains, as recorded with fMRI or intracranial methods (e.g., 134–136, 46–52, 55, 58). This study goes beyond prior work in four ways.…”
Section: Discussion
“…Second, we evaluated our encoding model on new individuals. Most studies that have used language models to model human neural responses to language (e.g., 135, 136, 46, 48–52, 55, 53, 54, 57, 58; cf. 56, 59) fit an encoding model using a given participant's data and then use that model to predict responses to held-out stimuli in that same participant.…”
Section: Discussion
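
The quoted passage contrasts the usual within-participant evaluation with evaluation on new individuals. A hedged sketch of the latter is given below, assuming a leave-one-participant-out scheme and responses registered to a shared space (e.g., a common parcellation); this is an illustrative assumption, not necessarily the cited study's exact procedure.

```python
# Hypothetical leave-one-participant-out evaluation of an encoding model.
# X: (n_stimuli, n_features) language-model embeddings;
# responses: list of (n_stimuli, n_regions) matrices, one per participant,
# all in a shared space. Illustrative only.
import numpy as np
from sklearn.linear_model import RidgeCV

def cross_participant_scores(X, responses, train_frac=0.8):
    n_train = int(train_frac * X.shape[0])
    scores = []
    for held_out in range(len(responses)):
        # Fit on the averaged responses of the remaining participants, training stimuli only
        y_train = np.mean([y for i, y in enumerate(responses) if i != held_out], axis=0)
        enc = RidgeCV(alphas=np.logspace(-1, 4, 10)).fit(X[:n_train], y_train[:n_train])
        pred = enc.predict(X[n_train:])
        y_test = responses[held_out][n_train:]  # held-out stimuli, held-out participant
        r = [np.corrcoef(pred[:, j], y_test[:, j])[0, 1] for j in range(y_test.shape[1])]
        scores.append(float(np.mean(r)))
    return scores
```

The key difference from the within-participant setup is that the held-out participant contributes no data to model fitting, so prediction accuracy reflects generalization across both stimuli and individuals.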
“…An open question is how DLMs will perform when trained on more humanlike input: data that are not textual but spoken, multimodal, embodied, and immersed in social actions. Interestingly, two recent papers suggest that language models trained on more realistic, human-centered data can learn language like children do (53, 54). However, additional research is needed to explore these questions.…”
Section: Discussion
“…In an even more recent preprint, Goldstein et al. (2022b) found a correspondence between GPT-2's per-layer activations and the time course of language processing using ECoG (i.e., recordings from electrodes implanted in the brain). Importantly, this processing parallel does not seem to be a function of the unrealistic number of tokens to which LLMs are exposed during training: Hosseini et al. (2022) found that GPT-2 embeddings can be used to predict the activation of the language network in an fMRI neuroimaging study even when the model is trained on only 100 million tokens, roughly the equivalent of the first 10 years of a child's language exposure (Gilkerson et al., 2017). All this considering that the scale of GPT-2 is orders of magnitude smaller than the most recent wave of LLMs (Dettmers et al., 2022).…”