2018
DOI: 10.5334/joc.50

Predicting Lexical Norms: A Comparison between a Word Association Model and Text-Based Word Co-occurrence Models

Abstract: In two studies we compare a distributional semantic model derived from word co-occurrences and a word association based model in their ability to predict properties that affect lexical processing. We focus on age of acquisition, concreteness, and three affective variables, namely valence, arousal, and dominance, since all these variables have been shown to be fundamental in word meaning. In both studies we use a model based on data obtained in a continued free word association task to predict these variables. …

Cited by 40 publications
(40 citation statements)
References 55 publications
“…The specialization method involved with the popular Paragram-SL999 vectors (Wieting et al, 2015) may have actually degraded performance in the present setting. As we observe similar results in the following sections, we save interpretation of the relative performance of specialized and unspecialized text-based vectors for the General Discussion. Finally, considering the best similarity functions for a given representation, vectors based on free association norms presented a small advantages over vectors based on text, an effect consistent with previous work comparing these two sources of representation when modeling semantic judgments (De Deyne et al, 2015; Vankrunkelsven et al, 2018). Even more striking was that Spearman correlation, when combined with SWOW-RW-SVD vectors, achieved the best overall performance (r = .61), .04 points better than the next best representation-function pair (SWOW-RW with Pearson correlation), and .08 points better than the best (text-based vector, function) pair.…”
supporting
confidence: 82%
“…Pearson correlations between these predictions and actual judgments were substantially higher (advantages of ~ r = .1 to r = .26) than correlations between actual judgments and predictions obtained from cosine similarity between vectors from a model based on word-word cooccurrences in syntactic dependencies (e.g., noun-verb dependencies like "We need some more coffee"). Similarly, Vankrunkelsven, Verheyen, Storms, & De Deyne (2018) found that affective word properties (e.g., valence and arousal) of English and Dutch words were better predicted from k-Nearest Neighbors regression of such a PPMI-transformed cue-response matrix than they were from k-NN regression of a similar syntactic-dependency-based text model. One of the goals of the present work, therefore, is to examine whether representations from free association norms make better predictions of similarity judgments than do various representations based on lexical co-occurrence statistics in text, including several not considered by De Deyne et al (2015) or Vankrunkelsven et al (2018).…”
Section: Representations
mentioning
confidence: 86%
“…Word association data display assortativity for valence, arousal, and dominance: cues of a particular affective quality tend to elicit responses with a similar affective quality (Pollio, 1964; Staats & Staats, 1959; Van Rensbergen, Storms, & De Deyne, 2015b). Accurate predictions of words' standings on all three affective dimensions can also be obtained from word association data (Vankrunkelsven, Verheyen, Storms, & De Deyne, 2018; Van Rensbergen, De Deyne, & Storms, 2015a). Therefore, word association data have the potential to uncover the extent to which there are systematic relationships between the manner in which words are organized in the mental lexicon and the words' affective dimensions, which have been claimed to be an integral part of the stored word meaning (Osgood, Suci, & Tannenbaum, 1957; Samsonovich & Ascoli, 2010).…”
Section: Application
mentioning
confidence: 99%
“…Still, taking affective information into account might not suffice to capture representation of intangible abstracta. In this regard, recent multimodal models suggest that supplementing affective information with information related to the statistical distribution of concepts in language (i.e., distributional models of semantic representation; Landauer & Dumais 1997) drastically improves prediction of human affective judgments (Bestgen & Vincze 2012; Recchia & Louwerse 2015; Vankrunkelsven et al 2018). More importantly, recent work by Lenci et al (2018) reveals a strong link between distributional statistics and emotion: intangible representations have more affective content and tend to co-occur with contexts with higher emotive value.…”
Section: Prospection Does Not Imply Predictive Processing
mentioning
confidence: 99%