Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics, 2021
DOI: 10.18653/v1/2021.starsem-1.2

Can Transformer Language Models Predict Psychometric Properties?

Abstract: Transformer-based language models (LMs) continue to advance state-of-the-art performance on NLP benchmark tasks, including tasks designed to mimic human-inspired "commonsense" competencies. To better understand the degree to which LMs can be said to have certain linguistic reasoning skills, researchers are beginning to adapt the tools and concepts of the field of psychometrics. But to what extent can the benefits flow in the other direction? I.e., can LMs be of use in predicting what the psychometric properties…

Citations: cited by 10 publications (8 citation statements)
References: 61 publications
“…In particular, we hope to encourage the development of new datasets that cover richer types of linguistic phenomena and language models that can learn essential linguistic skills that are generalizable. For future work, we plan to add more datasets to cover more phenomena such as psycholinguistics (Laverghetta Jr. et al., 2021). We envision our benchmark to be dynamic, meaning that a higher-quality and more difficult dataset for a phenomenon should replace the current ones in the future.…”
Section: Discussion (mentioning)
confidence: 99%
“…We train our models on NLI_train for 10 epochs, using a learning rate of 1e-5, a weight decay of 0.01, a batch size of 16, and a maximum sequence length of 175. We selected these hyperparameters to be similar to those which were previously reported to yield strong results when training on NLI datasets (Laverghetta Jr. et al., 2021). We additionally evaluated the models on NLI_dev, and found that they all achieved a Matthews Correlation of at least 0.6 (Matthews, 1975), and thus concluded that these hyperparameters were suitable.…”
Section: Methods (mentioning)
confidence: 96%
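To make the cited setup concrete, the sketch below shows one way to reproduce those hyperparameters (10 epochs, learning rate 1e-5, weight decay 0.01, batch size 16, maximum sequence length 175) with the Hugging Face Trainer, using the Matthews Correlation Coefficient as the dev-set sanity check. The model name and the toy premise/hypothesis pairs are placeholders; the NLI_train / NLI_dev splits used in the cited work are not reproduced here.

```python
# Minimal sketch of the fine-tuning setup described in the excerpt above.
# Assumptions: any encoder-style sequence classifier; toy data stands in for
# the actual NLI_train / NLI_dev splits.
import numpy as np
import torch
from sklearn.metrics import matthews_corrcoef
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-uncased"  # placeholder model choice
LABELS = {"entailment": 0, "neutral": 1, "contradiction": 2}

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

# Toy stand-ins for NLI_train / NLI_dev.
train_pairs = [("A man is sleeping.", "A person is asleep.", "entailment"),
               ("A man is sleeping.", "A man is running a race.", "contradiction")]
dev_pairs = train_pairs  # placeholder only

class NLIDataset(torch.utils.data.Dataset):
    def __init__(self, pairs):
        premises, hypotheses, labels = zip(*pairs)
        # Maximum sequence length of 175, as in the excerpt.
        self.enc = tokenizer(list(premises), list(hypotheses), truncation=True,
                             max_length=175, padding="max_length")
        self.labels = [LABELS[lab] for lab in labels]
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # The excerpt uses a Matthews Correlation of at least 0.6 on the dev split
    # as the criterion that the hyperparameters are suitable.
    return {"mcc": matthews_corrcoef(labels, preds)}

args = TrainingArguments(output_dir="nli_ckpt", num_train_epochs=10,
                         learning_rate=1e-5, weight_decay=0.01,
                         per_device_train_batch_size=16)

trainer = Trainer(model=model, args=args,
                  train_dataset=NLIDataset(train_pairs),
                  eval_dataset=NLIDataset(dev_pairs),
                  compute_metrics=compute_metrics)
trainer.train()
print(trainer.evaluate())  # reports eval_mcc on the dev stand-in
```

On a real NLI corpus, NLIDataset would simply wrap the actual train and dev splits; the training arguments are the part that mirrors the cited setup.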
“…Negation is an important construct in language for reasoning over the truth of propositions (Heinemann, 2015), garnering interest from philosophy (Horn, 1989), psycholinguistics (Zwaan, 2012), and natural language processing (NLP) (Morante and Blanco, 2020). While transformer language models (TLMs) (Vaswani et al., 2017) have achieved impressive performance across many NLP tasks, a great deal of recent work has found that they do not process negation well, and often make predictions that would be trivially false in the eyes of a human (Rogers et al., 2020; Ettinger, 2020; Laverghetta Jr. et al., 2021).…”
Section: Introduction (mentioning)
confidence: 99%
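The negation failures described in that excerpt are commonly demonstrated with simple cloze probes in the style of Ettinger (2020). The sketch below is an illustrative probe of my own construction, not the stimuli from the cited work: it compares a masked LM's top completions for an affirmative sentence and its negated counterpart, where a model that handled negation well should rank the completions very differently.

```python
# Illustrative negation probe (assumed prompts and model, not the cited stimuli).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for prompt in ["A robin is a [MASK].", "A robin is not a [MASK]."]:
    top = fill(prompt, top_k=3)
    print(prompt, "->", [(r["token_str"], round(r["score"], 3)) for r in top])

# A model sensitive to negation should not rank "bird" (or similar) as highly
# for the negated prompt as it does for the affirmative one; the finding cited
# above is that TLMs often make nearly the same predictions in both cases.
```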