Sorting through the noise: Testing robustness of information processing in pre-trained language models

Pandia, Lalchand; Ettinger, Allyson

doi:10.18653/v1/2021.emnlp-main.119

Cited by 15 publications

(12 citation statements)

References 29 publications

“…Through careful analyses of their output, we can assess just how much can be learned from the statistical regularities of the linguistic environment (Futrell et al., 2019; Wilcox, Futrell, & Levy, 2022). Some of this work has already been done in the context of encoder‐only masked language models, such as BERT and its related descendants (Ettinger, 2020; Pandia & Ettinger, 2021; Rogers, Kovaleva, & Rumshisky, 2020). Their failures, such as with semantic coherence or pragmatics (Arehalli, Dillon, & Linzen, 2022; Dou, Forbes, Koncel‐Kedziorski, Smith, & Choi, 2022; McClelland et al., 2020), are also interesting and point to other central tenets of usage‐based theories such as the role of environmental contexts, developmental histories, cognitive machinery, and functional pressures in human language learning and use (Christiansen & Chater, 2022).…”

mentioning

confidence: 99%

Large Language Models Demonstrate the Potential of Statistical Learning in Language

2023

To what degree can language be acquired from linguistic input alone? This question has vexed scholars for millennia and is still a major focus of debate in the cognitive science of language. The complexity of human language has hampered progress because studies of language–especially those involving computational modeling–have only been able to deal with small fragments of our linguistic skills. We suggest that the most recent generation of Large Language Models (LLMs) might finally provide the computational tools to determine empirically how much of the human language ability can be acquired from linguistic experience. LLMs are sophisticated deep learning architectures trained on vast amounts of natural language data, enabling them to perform an impressive range of linguistic tasks. We argue that, despite their clear semantic and pragmatic limitations, LLMs have already demonstrated that human‐like grammatical language can be acquired without the need for a built‐in grammar. Thus, while there is still much to learn about how humans acquire and use language, LLMs provide full‐fledged computational models for cognitive scientists to empirically evaluate just how far statistical learning might take us in explaining the full complexity of human language.

“…Through our experiments, we found this criterion to be met in a majority of the cases, suggesting a strong capacity of models to demonstrate property inheritance. However, post-hoc analyses revealed that for most models, this capacity drastically decreases in the presence of distracting information (sometimes even worse than random-guessing), suggesting a clear lack of robustness in the information processing capacities of PLMs, similar to the results of Pandia and Ettinger (2021). In contrast to their results, we find that larger models are generally more distracted than are smaller models, and this especially happens when the distracting information is closer to the predicted property-phrases, suggesting the presence of a proximity effect.…”

Section: General Discussion and Conclusion

mentioning

confidence: 58%

“…To what extent does the compatibility of models with H1 hold in presence of distracting information? This question is inspired by Pandia and Ettinger (2021), who report a substantial decrease in perceived information processing capacity of PLMs in the presence of semantic distractors. Here, we transform the stimuli of COMPS-WUGS by creating two different subordinates for every minimal pair: one for the positive concept (e.g.…”

Section: Post-hoc Robustness Evaluation With Distracting Information

mentioning

confidence: 99%

COMPS: Conceptual Minimal Pair Sentences for testing Robust Property Knowledge and its Inheritance in Pre-trained Language Models

Misra, Rayz, Ettinger

2022

Preprint

A characteristic feature of human semantic memory is its ability to not only store and retrieve the properties of concepts observed through experience, but to also facilitate the inheritance of properties (can breathe) from superordinate concepts (ANIMAL) to their subordinates (DOG), i.e., demonstrate property inheritance. In this paper, we present COMPS, a collection of minimal pair sentences that jointly tests pre-trained language models (PLMs) on their ability to attribute properties to concepts and their ability to demonstrate property inheritance behavior. Analyses of 22 different PLMs on COMPS reveal that they can easily distinguish between concepts on the basis of a property when they are trivially different, but find it relatively difficult when concepts are related on the basis of nuanced knowledge representations. Furthermore, we find that PLMs can demonstrate behavior consistent with property inheritance to a great extent, but fail in the presence of distracting information, which decreases the performance of many models, sometimes even below chance. This lack of robustness in demonstrating simple reasoning raises important questions about PLMs' capacity to make correct inferences even when they appear to possess the prerequisite knowledge.

“…This again suggests some isomorphism between human language processing and DL-based models. The next word prediction objective also enables language models to perform well on psycholinguistic diagnostics like the cloze task, although there is substantial room for improvement (Ettinger, 2020; Pandia & Ettinger, 2021). Finally, self-supervised ANNs, that is, networks that predict the next word or speech frame, transfer well to downstream language tasks like question answering and coreference resolution, and to speech tasks like speaker verification and translation across languages (Z. Chen et al., 2022; A. Wu et al., 2020).…”

Section: Experimental Designs In Language Neuroscience

mentioning

confidence: 99%

Computational Language Modeling and the Promise of In Silico Experimentation

Jain, Wehbe, et al.

2024

Neurobiology of Language

Language neuroscience currently relies on two major experimental paradigms: controlled experiments using carefully hand-designed stimuli, and natural stimulus experiments. These approaches have complementary advantages which allow them to address distinct aspects of the neurobiology of language, but each approach also comes with drawbacks. Here we discuss a third paradigm—in silico experimentation using deep learning-based encoding models—that has been enabled by recent advances in cognitive computational neuroscience. This paradigm promises to combine the interpretability of controlled experiments with the generalizability and broad scope of natural stimulus experiments. We show four examples of simulating language neuroscience experiments in silico and then discuss both the advantages and caveats of this approach.
