2022
DOI: 10.1038/s41598-022-20460-9
Deep language algorithms predict semantic comprehension from brain activity

Abstract: Deep language algorithms, like GPT-2, have demonstrated remarkable abilities to process text, and now constitute the backbone of automatic translation, summarization and dialogue. However, whether these models encode information that relates to human comprehension still remains controversial. Here, we show that the representations of GPT-2 not only map onto the brain responses to spoken stories, but they also predict the extent to which subjects understand the corresponding narratives. To this end, we analyze …

Cited by 62 publications (82 citation statements) · References 67 publications
Citation types: 11 supporting, 71 mentioning, 0 contrasting
“…In this study, we address these issues by analysing the brain signals of 304 individuals listening to short stories while their brain activity is recorded with fMRI [39]. After confirming that deep language algorithms linearly map onto brain activity [6,8,40], we show that enhancing these models with long-range and multi-level predictions improves such brain mapping. Critically, and in line with predictive coding theory, our results reveal a hierarchical organization of language predictions in the cortex, in which the highest areas predict the most distant and highest-level representations.…”
Section: Isolating Long-range Predictions In The Brain (supporting)
confidence: 56%
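The "linear mapping" analysis the quote refers to is typically an encoding model: regress language-model activations onto voxel responses and score held-out predictions. The sketch below is an illustration of that general technique, not the authors' code; all data are synthetic, and the dimensions and ridge penalty are arbitrary assumptions.

```python
# Minimal sketch (not the cited study's pipeline) of a linear encoding model:
# ridge regression from language-model activations to voxel responses,
# evaluated by the correlation between predicted and held-out signals.
# All data here are synthetic; dimensions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_trs, n_features, n_voxels = 200, 64, 10   # fMRI time points, model dims, voxels

X = rng.standard_normal((n_trs, n_features))          # model activations per TR
W_true = rng.standard_normal((n_features, n_voxels))  # hidden linear mapping
Y = X @ W_true + 0.5 * rng.standard_normal((n_trs, n_voxels))  # noisy "BOLD"

X_train, X_test = X[:150], X[150:]
Y_train, Y_test = Y[:150], Y[150:]

lam = 1.0  # ridge penalty (an assumption; in practice tuned by cross-validation)
W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(n_features),
                    X_train.T @ Y_train)

Y_hat = X_test @ W
# Voxel-wise "brain score": Pearson r between predicted and observed responses.
scores = [np.corrcoef(Y_hat[:, v], Y_test[:, v])[0, 1] for v in range(n_voxels)]
print(f"mean brain score: {np.mean(scores):.2f}")
```

In published work the voxel-wise correlations are usually averaged within regions and compared across model layers or context lengths; the toy data here only show the mechanics of the fit-and-score loop.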
“…The proposed approach is fundamentally different from a purely data-driven one that identifies neural response patterns correlated with pooled activities from hidden layers of a neural network trained on specific tasks of next-input predictions such as in (62, 64, 65). The brain interacts with the external stimuli, whether linguistic or not, in a structured fashion that is likely reused across different domains (44, 58).…”
Section: Discussion (mentioning)
confidence: 99%
“…Two types of information theoretic metrics have been of particular interest in establishing the connection between abstract information and biophysical signals to probe the brain's information processing capacity: surprisal (related to, but distinct from divergence) and entropy. Efforts in associating neurophysiological responses to surprisal for next-word expectation, either based on cloze probability tests (32, 59-61) or the probabilistic distribution estimated by computational models (35-37, 62-65), largely credit Levy's influential work on expectation-based comprehension (10). Levy proposed a formal relationship between incremental comprehension effort and the Kullback-Leibler divergence (KLD) of syntactic structure inference before and after receiving a word input W, and proved that the KLD reduced to the surprisal of W given the previous word string when conditioned on a constant extrasentential context that constrains comprehension.…”
Section: Understanding Neural Information Transfer Through Divergence... (mentioning)
confidence: 99%
“…These include predicting the next word in a sequence, utilising contextual information to generate those predictions, and calculating the surprise when predictions are violated. A number of studies employing sentence comprehension paradigms now support the notion that hierarchical representations allowing probabilistic computations operate in both humans and DLNs (27-29), with transformer-like predictive processing explaining nearly 100% of the explainable variance in neural activity during sentence processing tasks (29). Despite this impressive correspondence with expected brain activity, when generating human-like language, both rule-based and DLN-based NLG results in numerous errors, some of which, as we discuss below, are reminiscent of psychotic symptoms.…”
Section: Natural Language Generation (mentioning)
confidence: 99%
“…Connectionist models employing neural networks (in this case, DLNs) are particularly appealing in psychosis given the ample evidence implicating disturbances in cognitive operations implemented by distributed brain networks as a core feature of conditions such as schizophrenia (70). In silico or toy models based on neural networks also provide a means to interpret neuroimaging data obtained from human participants (27-29). While a number of neural network models have been used previously to study psychosis, we see several distinct advantages with DLNs.…”
Section: Factors Contributing To NLG Errors (mentioning)
confidence: 99%