Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 1 2017
DOI: 10.18653/v1/e17-1102
Bilingual Lexicon Induction by Learning to Combine Word-Level and Character-Level Representations

Abstract: We study the problem of bilingual lexicon induction (BLI) in a setting where some translation resources are available, but unknown translations are sought for certain, possibly domain-specific terminology. We frame BLI as a classification problem for which we design a neural network based classification architecture composed of recurrent long short-term memory and deep feed forward networks. The results show that word- and character-level representations each improve state-of-the-art results for BLI, and the be…
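The architecture described in the abstract can be pictured with a short, illustrative sketch. The following PyTorch-style model is not the authors' code: it assumes character-level LSTMs over the source and target word forms, word-level embeddings for both words, and a deep feed-forward network that scores whether the pair is a translation. All module names and dimensions here are assumptions made for illustration.

```python
# Minimal sketch (not the authors' implementation) of a BLI classifier in the
# spirit of the abstract: character-level LSTMs plus word embeddings feed a
# deep feed-forward network that scores a candidate (source, target) pair.
import torch
import torch.nn as nn

class BLIClassifier(nn.Module):
    def __init__(self, n_chars, n_src_words, n_tgt_words,
                 char_dim=32, char_hidden=64, word_dim=100, ff_hidden=256):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # One character-level LSTM per language side.
        self.src_char_lstm = nn.LSTM(char_dim, char_hidden, batch_first=True)
        self.tgt_char_lstm = nn.LSTM(char_dim, char_hidden, batch_first=True)
        self.src_word_emb = nn.Embedding(n_src_words, word_dim)
        self.tgt_word_emb = nn.Embedding(n_tgt_words, word_dim)
        # Deep feed-forward scorer over the concatenated representations.
        self.ff = nn.Sequential(
            nn.Linear(2 * char_hidden + 2 * word_dim, ff_hidden), nn.ReLU(),
            nn.Linear(ff_hidden, ff_hidden), nn.ReLU(),
            nn.Linear(ff_hidden, 1))

    def forward(self, src_chars, tgt_chars, src_word, tgt_word):
        # Use the final LSTM hidden state as the character-level encoding.
        _, (src_h, _) = self.src_char_lstm(self.char_emb(src_chars))
        _, (tgt_h, _) = self.tgt_char_lstm(self.char_emb(tgt_chars))
        feats = torch.cat([src_h[-1], tgt_h[-1],
                           self.src_word_emb(src_word),
                           self.tgt_word_emb(tgt_word)], dim=-1)
        # Logit for "is (src, tgt) a translation pair?"
        return self.ff(feats).squeeze(-1)
```

Consistent with the classification framing, such a logit would typically be trained with a binary cross-entropy loss over seed translation pairs and sampled negative pairs.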

Cited by 29 publications (54 citation statements). References 31 publications.
“…Similarly, the bootstrapping technique developed for traditional context-counting approaches (Peirsman & Padó, 2010; Vulić & Moens, 2013b) is an important predecessor to recent iterative self-learning techniques used to limit the bilingual dictionary seed supervision needed in mapping-based approaches (Hauer, Nicolai, & Kondrak, 2017;?). The idea of CCA-based word embedding learning (see later in Section 6) (Faruqui & Dyer, 2014b; Lu, Wang, Bansal, Gimpel, & Livescu, 2015) was also introduced a decade earlier (Haghighi, Liang, Berg-Kirkpatrick, & Klein, 2008); their work additionally discussed the idea of combining orthographic subword features with distributional signatures for cross-lingual representation learning. This idea re-entered the literature recently (Heyman, Vulić, & Moens, 2017), only now with much better performance.…”
Section: A Brief History of Cross-Lingual Word Representations (mentioning; confidence: 99%)
“…Furthermore, the model that combines character-level information and word-level information outperforms other baselines (including BWESG, the strongest word-level model) by a margin. For more details, see Reference [58].…”
Section: Combining Word-Level and Character-Level Representations (mentioning; confidence: 99%)
“…Especially BWEs obtained using post-hoc mapping (Mikolov et al., 2013b; Lazaridou et al., 2015) fail on this task. Consequently, Heyman et al. (2017) build BWEs using aligned documents and then engineer a specialized classification-based approach to BLI. In contrast, our delightfully simple approach to create high-quality BWEs for the medical domain requires only monolingual data.…”
Section: Bilingual Lexicon Induction (BLI) (mentioning; confidence: 99%)