“…Baseline Methods: The baselines include several state-of-the-art approaches: MFS, which directly outputs the Most Frequent Sense in WordNet; IMS (Zhong and Ng, 2010), a classifier built on handcrafted features such as POS tags, surrounding words, and local collocations; Babelfy (Moro, Raganato, and Navigli, 2014), a state-of-the-art knowledge-based WSD system that exploits random walks to connect synsets and text fragments; Lesk ext+emb (Basile, Caputo, and Semeraro, 2014a), an extension of Lesk that incorporates similarity information from definitions; UKB gloss (Agirre and Soroa, 2009; Agirre, de Lacalle, and Soroa, 2014), another graph-based WSD method; the joint model for WSD and entity linking (EL) of Weissenborn et al. (2015), which exploits semantic resources; IMS-s+emb (Iacobacci, Pilehvar, and Navigli, 2016), a combination of the original IMS with word embeddings weighted by exponential decay, with surrounding words removed from the feature set; Context2vec (Melamud, Goldberger, and Dagan, 2016), a generic model for producing context representations for WSD; the LSTM of Le, Postma, and Urbani (2017), trained jointly on labeled and unlabeled data (their unlabeled data is roughly equal in size to the unlabeled corpus in our work, which makes the comparison fairer); the model of Raganato, Bovi, and Navigli (2017), which jointly learns to predict word senses, POS tags, and coarse-grained semantic labels; and Train-O-Matic (Pasini and Navigli, 2017), a language-independent approach that automatically generates sense-labeled data via random walks over WordNet and trains a classifier on it.

Datasets: We use SemCor 3.0 (Miller et al., 1994; 226,036 manual sense annotations), which is also used by the baselines, as the manually labeled data.…”