Discourse structure interacts with reference but not syntax in neural language models

Davis, Forrest; Schijndel, Marten van

doi:10.18653/v1/2020.conll-1.32

Cited by 13 publications

(17 citation statements)

References 34 publications

(51 reference statements)


“…Prior work has noted competing generalizations influencing model behavior via the distinction of non-linguistic vs. linguistic biases (e.g., McCoy et al, 2019; Davis and van Schijndel, 2020a; Warstadt et al, 2020b). The findings in Warstadt et al (2020b), that linguistic knowledge is represented within a model much earlier than attestation in model behavior, bears resemblance to our claims.…”

Section: Related Work (supporting)

confidence: 83%

“…Given that all the investigated stimuli were disambiguated by gender, we categorized our results by the antecedent of the pronoun and the IC verb bias. We first turn to English and Chinese, which showed an IC bias in line with existing work on IC bias in autoregressive English models (e.g., Upadhye et al, 2020; Davis and van Schijndel, 2020a). We then detail the results for Spanish and Italian, where only very limited, if any, IC bias was observed.…”

Section: Experimental Stimuli and Measures (mentioning)

confidence: 55%

“…Current accounts of IC ground the phenomenon within the linguistic signal without the need for additional pragmatic inferences by comprehenders (e.g., Rohde et al, 2011; Hartshorne et al, 2013). Recent investigations of IC in neural language models confirms that the IC bias of English is learnable, at least to some degree, from text data alone (Davis and van Schijndel, 2020a; Upadhye et al, 2020). The ability of models trained on other languages to acquire an IC bias, however, has not been explored.…”

Section: Introduction (mentioning)

confidence: 99%

“…Thus, the apparent failure of the Spanish and Italian models to pattern like English and Chinese is not evidence on its own of a model's inability to acquire the requisite linguistic [footnote 1: These model types were chosen for ease of access to existing models. Pretrained, large auto-regressive models are largely restricted to English, and prior work suggests that LSTMs are limited in their ability to acquire an IC bias in English (Davis and van Schijndel, 2020a).]…”

Section: Introduction (mentioning)

confidence: 99%


Uncovering Constraint-Based Behavior in Neural Models via Targeted Fine-Tuning

Davis, Schijndel

2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer

Self Cite


A growing body of literature has focused on detailing the linguistic knowledge embedded in large, pretrained language models. Existing work has shown that non-linguistic biases in models can drive model behavior away from linguistic generalizations. We hypothesized that competing linguistic processes within a language, rather than just non-linguistic model biases, could obscure underlying linguistic knowledge. We tested this claim by exploring a single phenomenon in four languages: English, Chinese, Spanish, and Italian. While human behavior has been found to be similar across languages, we find cross-linguistic variation in model behavior. We show that competing processes in a language act as constraints on model behavior and demonstrate that targeted fine-tuning can re-weight the learned constraints, uncovering otherwise dormant linguistic knowledge in models. Our results suggest that models need to learn both the linguistic constraints in a language and their relative ranking, with mismatches in either producing non-human-like behavior.


“…While they did not find strong evidence for a correlation to human-based results in this respect, they did observe that in the context of connective because PLMs assigned lower probability to subject-referring pronouns for an object-biasing verb as compared to a subject-biasing verb. Davis and van Schijndel (2020) observed that GPT2-XL (Radford et al, 2019) encodes some level of IC bias in its representations (measured in terms of similarity between the representation of the pronoun and its two potential referents) and its decision on how to resolve a referent at prediction time is weakly influenced by that. They took the analysis one step further and looked at whether GPT2-XL uses IC information to resolve relative clause attachment, which in humans is conditioned by IC bias; no evidence was found to suggest that that was the case.…”

Section: Related Work (mentioning)

confidence: 99%

John praised Mary because _he_? Implicit Causality Bias and Its Interaction with Explicit Cues in LMs

Kementchedjhieva, Anderson, Søgaard

2021

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021


Some interpersonal verbs can implicitly attribute causality to either their subject or their object and are therefore said to carry an implicit causality (IC) bias. Through this bias, causal links can be inferred from a narrative, aiding language comprehension. We investigate whether pre-trained language models (PLMs) encode IC bias and use it at inference time. We find that to be the case, albeit to different degrees, for three distinct PLM architectures. However, causes do not always need to be implicit: when a cause is explicitly stated in a subordinate clause, an incongruent IC bias associated with the verb in the main clause leads to a delay in human processing. We hypothesize that the temporary challenge humans face in integrating the two contradicting signals, one from the lexical semantics of the verb, one from the sentence-level semantics, would be reflected in higher error rates for models on tasks dependent on causal links. The results of our study lend support to this hypothesis, suggesting that PLMs tend to prioritize lexical patterns over higher-order signals.

Language Model Behavior: A Comprehensive Survey

Chang, Bergen

2024

Computational Linguistics


Transformer language models have received widespread public attention, yet their generated text is often surprising even to NLP researchers. In this survey, we discuss over 250 recent studies of English language model behavior before task-specific fine-tuning. Language models possess basic capabilities in syntax, semantics, pragmatics, world knowledge, and reasoning, but these capabilities are sensitive to specific inputs and surface features. Despite dramatic increases in generated text quality as models scale to hundreds of billions of parameters, the models are still prone to unfactual responses, commonsense errors, memorized text, and social biases. Many of these weaknesses can be framed as over-generalizations or under-generalizations of learned patterns in text. We synthesize recent results to highlight what is currently known about large language model capabilities, thus providing a resource for applied work and for research in adjacent fields that use language models.
