Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)
DOI: 10.18653/v1/2021.acl-long.143

Implicit Representations of Meaning in Neural Language Models

Abstract: Does the effectiveness of neural language models derive entirely from accurate modeling of surface word co-occurrence statistics, or do these models represent and reason about the world they describe? In BART and T5 transformer language models, we identify contextual word representations that function as models of entities and situations as they evolve throughout a discourse. These neural representations have functional similarities to linguistic models of dynamic semantics: they support a linear readout of ea…
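The "linear readout" mentioned in the abstract can be illustrated with a minimal probe sketch. The hidden states below are synthetic stand-ins for the paper's BART/T5 contextual vectors (in the real setup they would come from the frozen language model), and the dimensions and property being decoded are illustrative assumptions, not the paper's actual experimental configuration.

```python
import numpy as np

# Hedged sketch of a linear readout probe over LM hidden states.
# The "hidden" vectors here are synthetic, not real BART/T5 activations.
rng = np.random.default_rng(0)

d = 64   # hidden dimension (assumption)
n = 500  # number of (token, entity-state) examples (assumption)

# Pretend a fixed direction in representation space encodes a binary
# entity property (e.g. a container being open vs. closed).
direction = rng.normal(size=d)
labels = rng.integers(0, 2, size=n)
hidden = rng.normal(size=(n, d)) + np.outer(2 * labels - 1, direction)

# Linear readout: least-squares fit of signed labels from hidden states.
w, *_ = np.linalg.lstsq(hidden, 2 * labels - 1, rcond=None)
pred = (hidden @ w > 0).astype(int)
accuracy = (pred == labels).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

High readout accuracy under this setup indicates that the property is linearly decodable from the representations, which is the kind of evidence the paper uses to argue that the model encodes entity and situation state.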

Cited by 37 publications (31 citation statements)
References 22 publications
“…The tasks in this paper can be viewed as exploring one criticism of large language models, namely, to what extent do they simply rely on surface-level statistical correlations on text, without learning semantics or world knowledge (Bender & Koller, 2020)? In response, Li et al (2021) provide evidence that pre-trained language models do indeed construct approximate representations of the semantics of the situations they describe in text. In the context of programs, Austin et al (2021) approach this question by exploring the learning to execute task on MBPP, which we consider in Section 5.2.…”
Section: Related Work
confidence: 92%
“…Geiger et al (2020) initially used interchange interventions to show that a BERT-based natural language inference model creates a neural representation of lexical entailment, arguing that the ability to create modular representations underlies the capacity for systematic generalization. Li et al (2021) use interchange interventions to show that neural representations represent propositional content that systematically alters text generated by a neural language model. Geiger et al (2021) explicitly ground interchange intervention analysis in a theory of causal abstraction.…”
Section: Methods For Explaining The Internal Structure Of AI
confidence: 99%
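The interchange interventions described above can be sketched with a toy network: run a forward pass on a "base" input, but splice in the hidden state a "source" input would produce at one layer, and check whether the output changes. Everything here (the two-layer network, the choice of spliced units) is an illustrative assumption, not the cited papers' actual models.

```python
import numpy as np

# Hedged toy sketch of an interchange intervention.
rng = np.random.default_rng(1)

W1 = rng.normal(size=(4, 3))  # toy first layer (assumption)
W2 = rng.normal(size=(2, 4))  # toy second layer (assumption)

def layer1(x):
    return np.tanh(W1 @ x)

def layer2(h):
    return W2 @ h

base = np.array([1.0, 0.0, -1.0])
source = np.array([-1.0, 2.0, 0.5])

# Normal forward pass on the base input.
h_base = layer1(base)
normal_out = layer2(h_base)

# Interchange: overwrite part of the base hidden state with the value
# it takes on the source input, then finish the forward pass.
h_intervened = h_base.copy()
h_intervened[:2] = layer1(source)[:2]  # splice first two hidden units
intervened_out = layer2(h_intervened)

# If the output changes, the spliced representation is causally
# implicated in the model's behavior on this input.
changed = not np.allclose(normal_out, intervened_out)
print(changed)
```

Causal abstraction analyses of this kind ask whether such spliced components behave like the variables of a simpler, interpretable causal model.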
“…Comparison to probing results Recently, Li et al (2021) developed a probe for investigating whether LM representations provide information about the state of entities at various stages in a larger discourse. This probing method, like the ones presented in this work, also aims to assess entity tracking abilities of pre-trained language models.…”
Section: NLG Evaluation
confidence: 99%
“…First, the probing classifier was trained on data that was similar to the evaluation data and this setup therefore provided a lot of supervision. Second, the datasets used by Li et al (2021) were obtained through crowdsourcing or a generation engine and were not constructed as systematically as ours. For these reasons, the probing classifier may have learned spurious correlations between the training and test splits, and the high accuracy on the task may have only in part been driven by entity tracking abilities of LMs.…”
Section: NLG Evaluation
confidence: 99%