2022
DOI: 10.31234/osf.io/gdm5v
Preprint

Deep Lexical Hypothesis: Identifying personality structure in natural language

Abstract: Recent advances in natural language processing (NLP) have produced general models that can perform complex tasks such as summarizing long passages and translating across languages. Here, we introduce a method to extract adjective similarities from language models as done with survey-based ratings in traditional psycholexical studies but using millions of times more text in a natural setting. The correlational structure produced through this method is highly similar to that of self- and other-ratings of 435 ter…

Citations: cited by 9 publications (14 citation statements)
References: 99 publications (145 reference statements)
“…The primary contribution of this project is to make the books and search results available for researchers, especially personality psychologists, who seek information about the frequency of usage for person-descriptive terms. Most prominently, these data can contribute to the long arc of psycholexical research that began in earnest in 1936 (Allport & Odbert) and remains ongoing (Cutler & Condon, 2022). Prior work in this area has led to the identification of several multidimensional structural models of personality, including the Big Five (Goldberg, 1992), the HEXACO (Ashton et al, 2004) and the High Dimensional 20 (Saucier & Iurino, 2020).…”
Section: Discussion
Citation type: mentioning
confidence: 95%
“…The need to characterize longer lists of terms stems from the potential to use language models (aka natural language processing techniques) in personality structure research (Cutler & Condon, 2022; Jackson et al, 2021). These models include, for example, transformer-encoder models such as BERT (Devlin et al, 2019), DeBERTa (He et al, 2021), and GPT-3 (Brown et al, 2020).…”
Section: Frequency Of Use Metrics For Person Descriptors
Citation type: mentioning
confidence: 99%
“…Though the approaches clearly differ, NLP and survey-based methods are similar in that the underlying data are contributed by individual people who will influence the resulting structure in proportion to the total informational content of the full sample of contributors. Thurstone (1934) was developing, in essence, a type of language model when instructing respondents to rate themselves using an adjective checklist, as this data-generating process is functionally equivalent to the generation of a statement about one’s personality (Cutler, 2021; Cutler et al, 2021). A benefit of Thurstone’s approach is that the resulting data are highly structured, but the trade-off is substantial—the pool of respondents is limited to those who can be recruited and motivated to complete the checklist.…”
Section: Traditional and NLP Approaches To Psycholexical Research
Citation type: mentioning
confidence: 99%