2023
DOI: 10.1098/rsta.2022.0041

Symbols and grounding in large language models

Abstract: Large language models (LLMs) are one of the most impressive achievements of artificial intelligence in recent years. However, their relevance to the study of language more broadly remains unclear. This article considers the potential of LLMs to serve as models of language understanding in humans. While debate on this question typically centres around models’ performance on challenging language understanding tasks, this article argues that the answer depends on models’ underlying competence, and thus that the f…

Cited by 35 publications (14 citation statements)
References 69 publications
“…In particular, to match the hypothesized mechanism underlying human behavior for the distributivity experiments (Humphreys & Bock, 2005), a model would need to distinguish between, for example, an NP that is more likely to be conceptualized as a single, collective entity and an NP that is more likely to be conceptualized as multiple entities distributed in space. This kind of mapping, from linguistic material to entities in an external world, may lie beyond the abilities of models trained solely on linguistic material at this scale (though see Pavlick, 2023 for evidence that these capacities may emerge when models are trained on orders of magnitude more training data). We speculate that a multi-modal model with a visual training objective may be better able to capture such effects (for an example of a multi-modal model in distributional semantics, see Bruni et al., 2014).…”
Section: Simulations (mentioning)
confidence: 99%
“…Various reasons for scepticism have been advanced about the significance of GPT-3's impressive performance. Pavlick [19] argues that certain well-known sceptical concerns are not yet decisive. In particular, she makes the case that objections that large language models like GPT-3 cannot really understand language because they lack internal symbolic structured representations of language, or that their symbols are not ‘grounded’ in perceptuo-motor interaction with the world, are currently unproven.…”
Section: Social Artificial Intelligence: How Close Are We? (mentioning)
confidence: 99%
“…This makes more detailed analysis of current capabilities rather futile. There are, however, interesting questions about the relationship between AI and cognitive science that are posed by these models, and we refer the reader to two papers in this special issue that pursue these further [39,40].…”
Section: (iii) Large Language Models (mentioning)
confidence: 99%