2013
DOI: 10.1145/2461316.2461319

Word Sense Disambiguation by Combining Labeled Data Expansion and Semi-Supervised Learning Method

Abstract: Lack of labeled data is one of the most severe problems facing word sense disambiguation (WSD). We overcome the problem by proposing a method that combines automatic labeled data expansion (Step 1) and semi-supervised learning (Step 2). The Step 1 and 2 methods are both effective, but their combination yields a synergistic effect. In this article, in Step 1, we automatically extract reliable labeled data from raw corpora using dictionary example sentences, even the infrequent and unseen senses (which are not likely…

Cited by 8 publications (4 citation statements)
References 11 publications
“…Distant supervision is a learning paradigm similar to semi-supervised learning. Unlike semi-supervised methods which typically employ a supervised classifier and a small number of seed instances to do bootstrap learning (Yarowsky, 1995;Mihalcea, 2004;Fujita and Fujino, 2011), in distant supervision training data are created in a single run from scratch by aligning corpus instances with entries in a knowledge base. Distant supervision methods that have used LLRs as knowledge bases have been previously applied in relation extraction, e.g.…”
Section: Related Work and Discussion (mentioning)
confidence: 99%
“…(Taghipour and Ng, 2015) and (Yuan et al, 2016) proposed a semi-supervised WSD method to use word embeddings of surrounding words of the target word and showed that the performance of WSD could be increased by taking advantage of word embeddings. (Fujita et al 2011) proposed a semi-supervised WSD method that automatically obtains reliable sense labelled examples using example sentences from the Iwanami Japanese dictionary to expand the labelled training data. Then, this method employs a maximum entropy model to construct a WSD classifier for each target word using common morphological features (surrounding words and POS tags) and topic features.…”
Section: Related Work (mentioning)
confidence: 99%
“…Fujita and Fujino [11] proposed a method that provides reliable training data using example sentences from an external dictionary.…”
Section: Data Expansion Technique (mentioning)
confidence: 99%