Proceedings of Workshop for NLP Open Source Software (NLP-OSS) 2018
DOI: 10.18653/v1/w18-2505
The risk of sub-optimal use of Open Source NLP Software: UKB is inadvertently state-of-the-art in knowledge-based WSD

Abstract: UKB is an open source collection of programs for performing, among other tasks, knowledge-based Word Sense Disambiguation (WSD). Since it was released in 2009, it has often been used out-of-the-box in sub-optimal settings. We show that nine years later it is the state of the art in knowledge-based WSD. This case shows the pitfalls of releasing open source NLP software without optimal default settings and precise instructions for reproducibility.

Cited by 27 publications (32 citation statements) · References 10 publications
“…Even though Babelfy is based on BabelNet, it does not make direct use of the translation information in BabelNet. Similarly, UKB (Agirre et al., 2014, 2018), which is based on personalized PageRank on WordNet, achieves state-of-the-art performance on English all-words WSD. Finally, utilizing contextual embeddings, SENSEMBERT learns knowledge-based multilingual sense embeddings obtained by combining representations learned using BERT with knowledge obtained from BabelNet.…”
Section: WSD Systems (mentioning)
confidence: 99%
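The excerpt above describes UKB's core mechanism: a personalized PageRank over the WordNet graph, where the random walk is biased toward the words in the target context and candidate senses are ranked by the resulting probability mass. The following minimal sketch illustrates that idea on a toy graph; it is not UKB's actual code, and the graph, damping factor, and iteration count are assumptions chosen only for illustration.

```python
# Minimal sketch of personalized PageRank over a toy sense graph,
# illustrating the idea behind UKB's PPR-based WSD (not UKB's actual code).
from collections import defaultdict

def personalized_pagerank(edges, reset_nodes, damping=0.85, iterations=30):
    """Compute PPR where the random walk restarts only at reset_nodes."""
    nodes = set()
    out_links = defaultdict(list)
    for u, v in edges:
        nodes.update((u, v))
        out_links[u].append(v)
        out_links[v].append(u)  # treat the graph as undirected, as with WordNet relations
    reset = {n: (1.0 / len(reset_nodes) if n in reset_nodes else 0.0) for n in nodes}
    rank = dict(reset)          # start the walk from the context nodes
    for _ in range(iterations):
        new_rank = {n: (1 - damping) * reset[n] for n in nodes}
        for u in nodes:
            if out_links[u]:
                share = damping * rank[u] / len(out_links[u])
                for v in out_links[u]:
                    new_rank[v] += share
        rank = new_rank
    return rank

# Toy example: the context nodes personalize the walk, and the
# highest-ranked candidate sense of the ambiguous word wins.
edges = [("bank#1", "money#1"), ("bank#2", "river#1"), ("money#1", "deposit#1")]
scores = personalized_pagerank(edges, reset_nodes={"money#1", "deposit#1"})
best = max(["bank#1", "bank#2"], key=lambda s: scores[s])
print(best)  # expected: bank#1, the financial sense
```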
“…UKB uses complete sense frequency distributions, which are referred to as the dictionary weight (dict weight). We use the same parameter settings as Agirre et al. (2018). For a fair comparison, when applying SOFTCONSTRAINT to a system variant without sense frequency information, we set γ to 0 to turn off the p_freq component.…”
Section: Oracle WSD Experiments (mentioning)
confidence: 99%
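The excerpt does not give the SOFTCONSTRAINT scoring formula, only that a sense-frequency term weighted by γ can be disabled by setting γ to 0. The sketch below is purely a hypothetical illustration of that toggle: the function name, the additive log-frequency combination, and the toy numbers are all assumptions, not the cited paper's method.

```python
# Hypothetical illustration of a gamma-weighted sense-frequency prior.
# With gamma = 0 the frequency term vanishes and only the graph score remains.
import math

def combined_score(graph_score, sense_freq, gamma):
    p_freq = math.log(sense_freq + 1.0)  # assumed smoothing; not from the cited paper
    return graph_score + gamma * p_freq

# Toy candidate senses for one target word: (graph score, corpus frequency), made-up numbers.
candidates = {"bank#1": (0.40, 120), "bank#2": (0.35, 300)}

for gamma in (0.0, 0.5):
    best = max(candidates, key=lambda s: combined_score(*candidates[s], gamma))
    print(gamma, best)  # gamma=0.0 -> bank#1 (graph only); gamma=0.5 -> bank#2 (frequency dominates)
```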
“…As noted in Agirre et al. (2018), depending on the particular configuration, it is easy to get a wide range of results using UKB. The configurations used here are based on the recommended configuration given by Agirre et al. (2018). For all configurations, the ppr_w2w algorithm is used, which runs personalised PageRank for each target word.…”
Section: UKB (mentioning)
confidence: 99%
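For readers who want to reproduce that recommended setup, a run might look like the sketch below, which shells out to the ukb_wsd binary with the ppr_w2w algorithm and dictionary weights. The flag names and the file names (wn30g.bin, wn30_dict.txt, context.txt) follow UKB's usual distribution layout but should be treated as assumptions; verify them against the README shipped with the UKB release you are using.

```python
# Hedged sketch: invoking UKB's ppr_w2w algorithm from Python.
# Flags and file paths mirror UKB's documented usage but are assumptions here.
import subprocess

cmd = [
    "ukb_wsd",
    "--ppr_w2w",           # personalized PageRank run once per target word
    "--dict_weight",       # use sense-frequency weights from the dictionary
    "-K", "wn30g.bin",     # compiled WordNet 3.0 graph (path is an assumption)
    "-D", "wn30_dict.txt", # word-to-sense dictionary (path is an assumption)
    "context.txt",         # input contexts in UKB's plain-text format
]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.stdout)       # one line per disambiguated target word
```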
“…We have compared our proposal with UKB [1], which is one of the best WSD systems in the literature. Further, we have also evaluated the performance of this WSD system using a gold standard manually developed by a human expert.…”
Section: Sense (mentioning)
confidence: 99%
“…Using the reported target distribution in Table 4, the resulting average ratio is 0.5538, from which we can conclude that the baseline performance is 55.35 %. Second, we have disambiguated BLESS using UKB [1], which is a well-known state-of-the-art tool in WSD, and compared the results with our gold standard. In total, UKB correctly disambiguates 180 of the 200 targets, so its performance is 90.00 %.…”
Section: Evaluation Framework (mentioning)
confidence: 99%