Feature Interactions Reveal Linguistic Structure in Language Models

Jumelet, Jaap; Zuidema, Willem

doi:10.18653/v1/2023.findings-acl.554

Cited by 1 publication

(1 citation statement)

References 52 publications

“…Attribution methods have been used to examine linguistic patterns in model behaviour, and it has been argued they provide more comprehensive insights than attention heatmaps (Bastings and Filippova, 2020), because attention only determines feature importance within a particular attention head, and not for model predictions as a whole (Jain and Wallace, 2019). Linguistic phenomena investigated using attribution methods include co-reference, negation, and syntactic structure (Jumelet et al., 2019; Wu et al., 2021; Nayak and Timmapathini, 2021; Jumelet and Zuidema, 2023). Within conversational NLP, feature attribution methods have been used to identify salient features in task-oriented dialogue modelling (Huang et al., 2020), dialogue response generation (Tuan et al., 2021), and turn-taking prediction (Ekstedt and Skantze, 2020).…”

Section: Understanding the Behaviour Of Language Models (mentioning)

confidence: 99%

Attribution and Alignment: Effects of Local Context Repetition on Utterance Production and Comprehension in Dialogue

Molnar, Jumelet, Giulianelli et al. 2023

Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)

Language models are often used as the backbone of modern dialogue systems. These models are pre-trained on large amounts of written fluent language. Repetition is typically penalised when evaluating language model generations. However, it is a key component of dialogue. Humans use local and partner-specific repetitions; these are preferred by human users and lead to more successful communication in dialogue. In this study, we evaluate (a) whether language models produce humanlike levels of repetition in dialogue, and (b) which processing mechanisms related to lexical re-use they use during comprehension. We believe that such joint analysis of model production and comprehension behaviour can inform the development of cognitively inspired dialogue generation systems.
