Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties - MWE '06, 2006
DOI: 10.3115/1613692.1613696

Automatic identification of non-compositional multi-word expressions using latent semantic analysis

Abstract: Making use of latent semantic analysis, we explore the hypothesis that local linguistic context can serve to identify multi-word expressions that have non-compositional meanings. We propose that vector similarity between distribution vectors associated with an MWE as a whole and those associated with its constituent parts can serve as a good measure of the degree to which the MWE is compositional. We present experiments that show that low (cosine) similarity does, in fact, correlate with non-compositionality.

Citations: cited by 86 publications (83 citation statements)
References: 14 publications
“…The few token-based approaches include a study by Katz and Giesbrecht (2006), who devise a supervised method in which they compute the meaning vectors for the literal and non-literal usages of a given expression in the training data. An unseen test instance of the same expression is then labelled by performing a nearest neighbour classification.…”
Section: Related Work (mentioning)
confidence: 99%
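
The nearest-neighbour labelling described in the statement above can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the toy three-dimensional vectors, the use of one averaged vector per usage class, and the names `mean_vector` and `label_instance` are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


def label_instance(test_vec, training):
    """Return the label whose averaged training vector is most similar to test_vec.

    Assumption: each usage class is summarised by the mean of its training vectors.
    """
    centroids = {label: mean_vector(vecs) for label, vecs in training.items()}
    return max(centroids, key=lambda label: cosine(test_vec, centroids[label]))


# Hypothetical context vectors for training instances of one expression.
training = {
    "literal": [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]],
    "non-literal": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.0]],
}

predicted_literal = label_instance([0.8, 0.2, 0.0], training)
predicted_idiomatic = label_instance([0.1, 0.8, 0.1], training)
```

In a real setting the vectors would come from an LSA space built over a corpus rather than being written by hand.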
“…Baldwin et al (2003) showed that LSA-based similarity between the multiword expression and each of its components is indicative for compositionality. Katz and Giesbrecht (2006) compared the actual phrase vector to the estimated compositional meaning vector calculated as a sum of the meaning vectors of the parts. The hypothesis was that the similarity between these two vectors should be larger in case of phrases which are not used non-compositionally.…”
Section: Existing Work (mentioning)
confidence: 99%
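
The comparison described in the statement above (phrase vector versus the sum of the part vectors, scored by cosine similarity) can be sketched as below. This is a minimal illustration under stated assumptions: the toy vectors are invented, and `compositionality_score` is a hypothetical helper name, not something taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality_score(phrase_vec, part_vecs):
    """Cosine between the observed phrase vector and the sum of its parts' vectors."""
    estimated = [sum(components) for components in zip(*part_vecs)]
    return cosine(phrase_vec, estimated)


# Hypothetical meaning vectors for the two parts of a two-word expression.
part_vecs = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]

# A phrase vector that matches the sum of the parts (compositional use) ...
high = compositionality_score([1.0, 1.0, 2.0], part_vecs)
# ... and one orthogonal to that sum (non-compositional use).
low = compositionality_score([1.0, -1.0, 0.0], part_vecs)
```

Under the hypothesis in the abstract, a low score such as `low` would point toward a non-compositional reading, and a high score such as `high` toward a compositional one.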
“…Several researchers have used latent semantic analysis (LSA) to distinguish between compositional and non-compositional preferences of expressions [17,18]. They show that compositional MWEs are generally more likely similar to their constituents than other non-compositional MWEs.…”
Section: MWE Identification (mentioning)
confidence: 99%