2019
DOI: 10.1109/access.2019.2938058

ShotgunWSD 2.0: An Improved Algorithm for Global Word Sense Disambiguation

Abstract: ShotgunWSD is a recent unsupervised and knowledge-based algorithm for global word sense disambiguation (WSD). The algorithm is inspired by the Shotgun sequencing technique, which is a broadly-used whole genome sequencing approach. ShotgunWSD performs WSD at the document level based on three phases. The first phase consists of applying a brute-force WSD algorithm on short context windows selected from the document, in order to generate a short list of likely sense configurations for each window. The second phas…
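The first phase described in the abstract (brute-force disambiguation of a short context window, keeping a short list of likely sense configurations) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy sense inventory `SENSES`, the gloss-overlap `relatedness` score, and the `brute_force_window` helper are hypothetical and stand in for whatever knowledge base and relatedness measure the paper actually uses.

```python
from itertools import combinations, product

# Hypothetical toy sense inventory: word -> {sense_id: gloss}.
SENSES = {
    "bank": {
        "bank#finance": "institution that accepts money deposits",
        "bank#river": "sloping land beside a river",
    },
    "deposit": {
        "deposit#money": "money placed in an institution account",
        "deposit#sediment": "matter laid down by a river",
    },
    "money": {
        "money#currency": "currency accepted as payment by an institution",
    },
}


def relatedness(sense_a, sense_b, glosses):
    """Toy relatedness (an assumption): number of words shared by two glosses."""
    return len(set(glosses[sense_a].split()) & set(glosses[sense_b].split()))


def brute_force_window(window, senses, top_k=2):
    """Score every sense configuration of a window and keep the top_k best."""
    glosses = {s: g for word in window for s, g in senses[word].items()}
    scored = []
    # Enumerate the Cartesian product of the candidate senses of each word.
    for config in product(*(senses[word].keys() for word in window)):
        # Score a configuration as the sum of pairwise sense relatedness.
        score = sum(relatedness(a, b, glosses) for a, b in combinations(config, 2))
        scored.append((score, config))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for score, config in brute_force_window(["bank", "deposit", "money"], SENSES):
        print(score, config)
```

The number of configurations grows exponentially with the window length, which is why an exhaustive search of this kind is only practical on short windows.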

Cited by 13 publications (10 citation statements)
References 43 publications
“…After the preliminary work of Bengio et al (2003) and Schütze (1993), various improvements have been made in the quality of the embedding and the training time (Collobert and Weston 2008; Mikolov et al 2013a,b; Pennington, Socher, and Manning 2014), while some efforts have been directed towards learning multiple representations for polysemous words (Huang et al 2012; Reisinger and Mooney 2010; Tian et al 2014). These improvements, and many others not mentioned here, have been extensively used in various NLP tasks (Butnaru and Ionescu 2019b; Garg et al 2018; Glorot, Bordes, and Bengio 2011; Ionescu and Butnaru 2019; Musto et al 2016; Weston, Bengio, and Usunier 2011; Yang, Macdonald, and Ounis 2018).…”
Section: Data Representations
Citation type: mentioning
confidence: 84%
“…Hence, performance is largely dependent on data quality. On the other hand, a knowledge-based approach [3], [5], [19], [28], [32] makes use of an external lexical knowledge base, such as WordNet [26] or BabelNet [29], to obtain the sense of an ambiguous word corresponding to its context. The resources generally include sets of words with lexical synonyms grouped into a set, and each set is linked in accordance with the lexical and semantic connections.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…This section introduces a new large data set for WSD that is automatically collected from the publicly accessed Oxford Dictionary. Its primary purpose is to learn a versatile WSD model that can address a wide range of topics, from real-life conversations to machine translation, second language education, medical information retrieval, and more.…”
Section: Oxford Dataset For Word Sense Disambiguation
Citation type: mentioning
confidence: 99%
“…The brute-force algorithm is applied on short context windows in the first phase. In the second phase, the local sense configurations are assembled by prefix and suffix matching into composite configurations; the resulting configurations are ranked, and the sense of each word is determined by majority voting, as in [14]. WSD is a very common problem in the field of natural language processing (NLP).…”
Section: Related Work
Citation type: mentioning
confidence: 99%
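The assembly and voting steps summarized in the citation statement above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the `(start, senses)` representation, the `merge` and `majority_vote` helpers, the `min_overlap` and `top_n` parameters, and the choice to rank configurations by length are all assumptions.

```python
from collections import Counter


def merge(left, right, min_overlap=1):
    """Merge two (start, senses) configurations when the suffix of `left`
    equals the prefix of `right` on their overlapping positions."""
    (ls, lsen), (rs, rsen) = left, right
    overlap = ls + len(lsen) - rs  # number of shared word positions
    if rs < ls or overlap < min_overlap or overlap > len(rsen):
        return None
    if lsen[rs - ls:] != rsen[:overlap]:
        return None  # the two configurations disagree on the overlap
    return (ls, lsen + rsen[overlap:])


def majority_vote(configs, doc_len, top_n=None):
    """Rank configurations by length (an assumption), keep the top_n longest,
    and pick for each word position the sense with the most votes."""
    ranked = sorted(configs, key=lambda c: len(c[1]), reverse=True)[:top_n]
    votes = [Counter() for _ in range(doc_len)]
    for start, senses in ranked:
        for offset, sense in enumerate(senses):
            votes[start + offset][sense] += 1
    # Positions that no configuration covers are left undecided (None).
    return [v.most_common(1)[0][0] if v else None for v in votes]


if __name__ == "__main__":
    c1 = (0, ["a0", "b0", "c0"])
    c2 = (1, ["b0", "c0", "d1"])
    c3 = (2, ["c1", "d1", "e0"])
    merged = merge(c1, c2)  # c1 and c2 agree on positions 1 and 2
    print(merged)
    print(merge(c2, c3))    # c2 and c3 disagree on position 2
    print(majority_vote([merged, c2, c3], doc_len=5))
```

Ties in the vote are resolved here by whichever sense was counted first, which is an arbitrary choice of this sketch.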
“…The induction technique is used in the first phase of this system, while the second phase is word sense disambiguation, developed using a semantic similarity measure. ShotgunWSD [25] is a recent unsupervised and knowledge-based algorithm for global word sense disambiguation (WSD). The algorithm has been developed from the Shotgun sequencing technique.…”
Section: Related Work
Citation type: mentioning
confidence: 99%