Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP 2016
DOI: 10.18653/v1/w16-2518
Thematic fit evaluation: an aspect of selectional preferences

Abstract: In this paper, we discuss the human thematic fit judgement correlation task in the context of real-valued vector space word representations. Thematic fit is the extent to which an argument fulfils the selectional preference of a verb given a role: for example, how well "cake" fulfils the patient role of "cut". In recent work, systems have been evaluated on this task by finding the correlations of their output judgements with human-collected judgement data. This task is a representation-independent way of evaluating …
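As a concrete illustration of the evaluation protocol the abstract describes, the sketch below correlates a toy system's thematic fit scores with human ratings using Spearman's ρ, a common correlation measure for this task. All vectors and ratings are invented placeholders, not data from any published dataset, and the cosine-to-prototype scoring function is just one plausible choice of model.

```python
# Sketch of the thematic fit evaluation protocol: score candidate role
# fillers with a model, then correlate those scores with human judgements.
# All numbers below are invented placeholders.
import numpy as np
from scipy.stats import spearmanr

# Toy vectors standing in for a real distributional model: a prototype
# vector for the patient slot of "cut", plus candidate filler vectors.
vectors = {
    "cut:patient": np.array([0.9, 0.1, 0.3]),
    "cake":        np.array([0.8, 0.2, 0.4]),
    "knife":       np.array([0.5, 0.8, 0.1]),
    "idea":        np.array([0.1, 0.2, 0.9]),
}

def cosine(u, v):
    """Cosine similarity, a standard scoring function for thematic fit."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Invented human plausibility ratings (e.g. on a 1-7 scale) for how well
# each noun fills the patient role of "cut".
human = {"cake": 6.6, "knife": 2.0, "idea": 1.2}

nouns = sorted(human)
gold = [human[n] for n in nouns]
pred = [cosine(vectors["cut:patient"], vectors[n]) for n in nouns]

rho, p_value = spearmanr(gold, pred)
print(f"Spearman rho vs. human judgements: {rho:.2f}")
```

A real evaluation replaces the toy model with system output over a full human-judgement dataset; the correlation coefficient is then the reported task score, independently of how the system represents words.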


Cited by 17 publications (13 citation statements). References 15 publications.
“…Differently from other distributional semantic models tested on the thematic fit task, ‘structure’ is now externally encoded in a graph, whose nodes are embeddings, and not directly in the dimensions of the embeddings themselves. The fact that the best performing word embeddings in our framework are the Skip-Gram ones is somewhat surprising, and goes against the findings of previous literature, in which bag-of-words models were always described as struggling on this task (Baroni et al. 2014; Sayeed et al. 2016). Given our results, we also suggested that the dimensionality of the embeddings could be an important factor, much more than the choice of training them on syntactic contexts.…”
Section: Results (contrasting)
confidence: 54%
“…Another constant finding of previous studies on thematic fit modelling was that high-dimensional, count-based vector representations generally perform better than dense word embeddings, to the point that Sayeed et al. (2016) stressed the sensitivity of this task to linguistic detail and to the interpretability of the vector space. Therefore, we tested whether vector dimensionality had an impact on task performance (Table 4).…”
Section: Results (mentioning)
confidence: 97%
“…For example, given the agent butcher, the expected patient of the verb cut is likely to be meat, whereas given the agent coiffeur, the expected patient is likely to be hair. For other research on this topic, see Sayeed et al. (2016) and Tilk et al. (2016).…”
Section: Beyond the Lexicon: Compositional Distributional Semantics (mentioning)
confidence: 99%
“…The similarities are calculated using a Distributional Memory approach similar to that of Baroni and Lenci (2010). Their structured vector space representation has been shown to work well on tasks that evaluate correlation with human thematic fit estimates (Baroni and Lenci, 2010; Baroni et al., 2014; Sayeed et al., 2016) and is thus suited to our task.…”
Section: Selectional Preferences Feature (mentioning)
confidence: 99%
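The prototype-based recipe referenced in several of these citation statements, building on Baroni and Lenci's Distributional Memory, can be sketched as below. This is a hedged illustration of the general approach, not the exact Distributional Memory implementation: the vectors, the filler choices, and the function name thematic_fit are all hypothetical.

```python
# Sketch of prototype-based thematic fit in the spirit of Baroni and
# Lenci (2010): average the vectors of a slot's typical fillers into a
# prototype, then score a candidate by cosine to that prototype.
# Vectors and filler choices are hypothetical; real systems select
# fillers by corpus association scores in a structured, dependency-based
# vector space.
import numpy as np

def thematic_fit(candidate, fillers):
    prototype = np.mean(fillers, axis=0)  # centroid of typical fillers
    num = float(np.dot(candidate, prototype))
    den = float(np.linalg.norm(candidate) * np.linalg.norm(prototype))
    return num / den                      # cosine similarity

# Hypothetical vectors for typical patients of "cut".
typical_patients = [
    np.array([0.7, 0.2, 0.1, 0.3]),  # e.g. "bread"
    np.array([0.8, 0.1, 0.2, 0.2]),  # e.g. "meat"
    np.array([0.6, 0.3, 0.1, 0.4]),  # e.g. "paper"
]

cake = np.array([0.7, 0.2, 0.2, 0.3])
idea = np.array([0.1, 0.1, 0.9, 0.1])
print(f"fit(cut-patient, cake) = {thematic_fit(cake, typical_patients):.2f}")
print(f"fit(cut-patient, idea) = {thematic_fit(idea, typical_patients):.2f}")
```

Scores from such a function are exactly what the correlation-based evaluation described in the abstract compares against human judgements.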