2022
DOI: 10.1101/2022.06.08.495348
Preprint

Shared functional specialization in transformer-based language models and the human brain

Abstract: Piecing together the meaning of a narrative requires understanding not only the individual words but also the intricate relationships between them. How does the brain construct this kind of rich, contextual meaning from natural language? Recently, a new class of artificial neural networks—based on the Transformer architecture—has revolutionized the field of language modeling. Transformers integrate information across words via multiple layers of structured circuit computations, forming increasingly contextuali…

Cited by 30 publications (49 citation statements)
References 139 publications
“…Second, we showed that it is possible to explore the internal reasoning of the "black box" of a DL language model and derive corresponding metrics to map the neural mechanism underlying language comprehension. In fact, while the saliency scores represent a useful tool for the developers to understand the behavior of a DL model, obtaining a more "explainable" artificial intelligence (AI) tool 16,27 , our findings demonstrate that these scores could be also used as neural features to shed light on neural mechanisms that are either partially unknown or simply not yet (or not completely) modeled by other metrics 15 . Furthermore, our findings are in line with a recent study demonstrating (with a different neuroimaging technique) that (i) the neural responses before the onset of a word contain valuable information about the incoming word, (ii) such information can be suitably extracted from the GPT-2 using its contextual embeddings 5 .…”
Section: Discussion (mentioning)
Confidence: 86%
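The statement above refers to extracting information about upcoming words from GPT-2's contextual embeddings and relating it to neural responses. As a minimal sketch of what that extraction step can look like (assuming the HuggingFace transformers library and the public gpt2 checkpoint; the specific layer choice is illustrative and this is not the cited studies' exact pipeline):

```python
# Minimal sketch: extract per-token contextual embeddings from GPT-2.
# Assumes the HuggingFace `transformers` package; not the cited studies' code.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

text = "Piecing together the meaning of a narrative requires context."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of (num_layers + 1) tensors, each of shape
# (batch, seq_len, hidden_dim); index 0 is the input embedding layer and
# indices 1-12 are the transformer layers of the base model.
hidden_states = outputs.hidden_states
layer_9 = hidden_states[9]  # an intermediate layer, chosen here purely for illustration
print(layer_9.shape)        # torch.Size([1, seq_len, 768])
```

In encoding analyses of this kind, such layer activations are typically aligned to word onsets and used as regressors for predicting neural activity; the sketch stops at the embedding-extraction step.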
“…Recent studies have shown that DL language models based on the transformer architecture 9 , such as the GPT-2, have remarkable performances in explaining the neural correlates of language comprehension and likely share some computational principles with the human brain [4][5][6]15 . In this work, we provided further support to this hypothesis, and contributed some novel insights about two basic neural mechanisms underpinning language comprehension in humans: word prediction and context-related language processing.…”
Section: Discussion (mentioning)
Confidence: 99%