Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)
DOI: 10.18653/v1/D17-1008

Train-O-Matic: Large-Scale Supervised Word Sense Disambiguation in Multiple Languages without Manual Training Data

Abstract: Annotating large numbers of sentences with senses is the heaviest requirement of current Word Sense Disambiguation. We present Train-O-Matic, a language-independent method for generating millions of sense-annotated training instances for virtually all meanings of words in a language's vocabulary. The approach is fully automatic: no human intervention is required and the only type of human knowledge used is a WordNet-like resource. Train-O-Matic achieves consistently state-of-the-art performance across gold standard…
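For intuition, the pipeline the abstract describes (and which a citing paper below summarizes as "random walks over WordNet and training a classifier on it") can be sketched as a personalized PageRank over a WordNet graph: seed the walk on the senses of a sentence's context words and label the target word with its highest-ranked sense. The sketch below is a minimal illustration of that idea, not the paper's exact pipeline; the hypernym-only graph, the damping factor, and the function names are our assumptions.

```python
# Minimal sketch: sense-label one occurrence of a target word via a
# personalized PageRank over a WordNet graph. Requires networkx and nltk
# with the 'wordnet' corpus downloaded.
import networkx as nx
from nltk.corpus import wordnet as wn

def build_wordnet_graph():
    """Undirected synset graph linked by hypernymy (an assumption; the
    paper uses a richer WordNet-like semantic network)."""
    g = nx.Graph()
    for syn in wn.all_synsets():
        for hyper in syn.hypernyms():
            g.add_edge(syn.name(), hyper.name())
    return g

def label_occurrence(graph, target, context_words):
    """Seed PageRank on the context words' senses and return the
    highest-ranked candidate sense of `target`."""
    seeds = {s.name(): 1.0
             for w in context_words
             for s in wn.synsets(w)
             if graph.has_node(s.name())}
    if not seeds:
        return None
    ranks = nx.pagerank(graph, alpha=0.85, personalization=seeds)
    candidates = [s.name() for s in wn.synsets(target)]
    return max(candidates, key=lambda s: ranks.get(s, 0.0), default=None)

graph = build_wordnet_graph()
# A financial context should pull "bank" toward a financial sense; the
# (sentence, sense) pairs produced this way would then train a standard
# supervised WSD classifier.
print(label_occurrence(graph, "bank", ["money", "deposit", "loan"]))
```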

Cited by 32 publications (27 citation statements) · References 25 publications

Citation statements (ordered by relevance):

“…Supervised models have been shown to consistently outperform knowledge-based ones in all standard benchmarks (Raganato et al., 2017), at the expense, however, of harder training and limited flexibility. First of all, obtaining reliable sense-annotated corpora is highly expensive and especially difficult when non-expert annotators are involved (de Lacalle and Agirre, 2015), and as a consequence approaches based on unlabeled data and semi-supervised learning are emerging (Taghipour and Ng, 2015b; Başkaya and Jurgens, 2016; Yuan et al., 2016; Pasini and Navigli, 2017).…”
Section: Introduction (mentioning)
Confidence: 99%
“…Baseline Methods: The baselines include several state-of-the-art approaches: MFS, which directly outputs the Most Frequent Sense in WordNet; IMS (Zhong and Ng, 2010), a classifier built on handcrafted features, i.e., POS tags, surrounding words and local collocations; Babelfy (Moro, Raganato, and Navigli, 2014), a state-of-the-art knowledge-based WSD system that exploits random walks to connect synsets and text fragments; Lesk ext+emb (Basile, Caputo, and Semeraro, 2014a), an extension of Lesk that incorporates similarity information from definitions; UKB gloss (Agirre and Soroa, 2009; Agirre, de Lacalle, and Soroa, 2014), another graph-based method for WSD; a joint learning model for WSD and entity linking (EL) that exploits semantic resources (Weissenborn et al., 2015); IMS-s+emb (Iacobacci, Pilehvar, and Navigli, 2016), a combination of the original IMS with word embeddings weighted by exponential decay, with surrounding words removed from the features; Context2vec (Melamud, Goldberger, and Dagan, 2016), a generic model for generating context representations for WSD; an LSTM trained jointly on labeled and unlabeled data (Le, Postma, and Urbani, 2017), whose unlabeled corpus is roughly equal in size to ours, which makes the comparison fairer; a model that jointly learns to predict word senses, POS tags and coarse-grained semantic labels (Raganato, Bovi, and Navigli, 2017); and Train-O-Matic (Pasini and Navigli, 2017), a language-independent approach that automatically generates sense-labeled data via random walks over WordNet and trains a classifier on it. Datasets: We choose SemCor 3.0 (Miller et al., 1994) (226,036 manual sense annotations), which is also used by the baselines, as the manually labeled data.…”
Section: Setup (mentioning)
Confidence: 99%
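Of the baselines quoted above, MFS is the one that is easy to reproduce exactly: WordNet orders each lemma's synsets by decreasing frequency in SemCor, so the first synset is the most frequent sense. A minimal sketch (the helper name is ours):

```python
# MFS (Most Frequent Sense) baseline: WordNet lists a lemma's synsets in
# decreasing SemCor frequency, so synsets[0] is the most frequent sense.
# Requires nltk with the 'wordnet' corpus downloaded.
from nltk.corpus import wordnet as wn

def most_frequent_sense(lemma, pos=None):
    """Return the most frequent WordNet synset for `lemma`, or None."""
    synsets = wn.synsets(lemma, pos=pos)
    return synsets[0] if synsets else None

print(most_frequent_sense("bank", pos=wn.NOUN))  # Synset('bank.n.01')
```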
“…To address this widespread, excessive dependence on external resources, Pasini [10] proposed a multilingual disambiguation system that does not use manually annotated training data. Panchenko [11] likewise proposed an unsupervised disambiguation method that does not rely on external knowledge.…”
Section: State of the Art (mentioning)
Confidence: 99%