Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions - 2002
DOI: 10.3115/1118675.1118683

Sense discrimination with parallel corpora

Abstract: This paper describes an experiment that uses translation equivalents derived from parallel corpora to determine sense distinctions that can be used for automatic sense-tagging and other disambiguation tasks. Our results show that sense distinctions derived from cross-lingual information are at least as reliable as those made by human annotators. Because our approach is fully automated through all its steps, it could provide means to obtain large samples of "sense-tagged" data without the high cost of human ann…

Cited by 48 publications (30 citation statements)
References 9 publications
“…[16,17]), as well as on parallel corpora. Previous research on parallel corpora [18,7] confirmed the use of cross-lingual lexicalization as a criterion for performing sense discrimination. Whereas in previous research on cross-lingual WSD the evidence from the aligned sentences was mainly used to enrich WordNet information, our approach does not require any external resources.…”
Section: Feature Vectors (mentioning)
confidence: 83%
“…Using a parallel corpus, such as for example Europarl, instead of human defined sense-labels offers some advantages: (1) for most languages we do not have large sense-annotated corpora or sense inventories, (2) using corpus translations should make it easier to integrate the WSD module into real multilingual applications and (3) this approach implicitly deals with the granularity problem, as fine sense distinctions (that are often listed in electronic sense inventories) are only relevant in case they get lexicalized in the target translations. The idea to use translations from parallel corpora to distinguish between word senses is based on the hypothesis that different meanings of a polysemous word are lexicalized across languages [6,7]. Many WSD studies have already shown the validity of this cross-lingual evidence idea.…”
Section: Introduction (mentioning)
confidence: 99%
“…One is the use of parallel corpora for unsupervised word sense disambiguation. While parallel texts have been the foundation on which modern statistical machine translation now stands [5,6], their utility extends to semantically related tasks as well, such as inducing bilingual dictionaries [22] and translating collocations [39]. Their use in word sense disambiguation [21] rivals or surpasses other unsupervised methods, and they have been used with success with translation pairs in English/French [13], English/Chinese [8,31], English/Portuguese [40], and English/Vietnamese [14].…”
Section: Related Work (mentioning)
confidence: 99%
“…In more recent years, Ide et al (2002) present a method to identify word meanings starting from a multilingual corpus. A by-product of applying this method is that once a word in one language is word-sense tagged, the translation equivalents in the parallel texts are also automatically annotated.…”
Section: Related Work (mentioning)
confidence: 99%
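
The hypothesis that recurs in the citation statements above, that different meanings of a polysemous word tend to be lexicalized differently across languages, can be illustrated with a minimal sketch. It is not the method of the cited paper: the toy sentences, the French translations, and the helper name `group_by_translation` are all assumptions made for illustration, and the word alignment is assumed to have been done already.

```python
from collections import defaultdict

# Hypothetical toy data: each item pairs an English sentence containing
# "bank" with the French word assumed to be aligned to "bank" in a
# parallel corpus. Real data would come from a word-alignment step.
aligned_occurrences = [
    ("She deposited the cheque at the bank.", "banque"),
    ("They walked along the bank of the river.", "rive"),
    ("The bank raised its interest rates.", "banque"),
    ("Reeds grew on the muddy bank.", "rive"),
]


def group_by_translation(occurrences):
    """Group occurrences of a source word by its aligned translation.

    Each distinct translation is treated as a candidate sense label.
    """
    groups = defaultdict(list)
    for sentence, translation in occurrences:
        groups[translation].append(sentence)
    return dict(groups)


sense_groups = group_by_translation(aligned_occurrences)
for translation, sentences in sense_groups.items():
    print(f"{translation}: {len(sentences)} occurrence(s)")
```

A single target language is a deliberate simplification here. Two senses that share one translation would fall into the same group, and two synonymous translations would split one sense, which is one reason to draw on translations from several languages rather than one.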