“…However, there are multiple ways to improve the results, as our models incorporate little cross-lingual signal. In the future, it would be worth integrating such cross-lingual signal in the form of pretrained cross-lingual word embeddings (Artetxe et al., 2016; Lample et al., 2018) or sub-word (e.g., character) embeddings (Chaudhary et al., 2018; Sofroniev and Çöltekin, 2018), as this could lead to better generalization to new languages. Similarly, the typological distance between source and target language often correlates with performance (Cotterell and Heigold, 2017), which could be exploited to weight the contribution of source-language examples when learning a multilingual model.…”
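The last idea lends itself to a short illustration. The sketch below shows one plausible way to turn typological distances into per-language example weights for multilingual training; it is not the quoted authors' method. The language codes, the distance values, and the `temperature` parameter are all assumptions made for illustration (in practice such distances might come from a typological resource like URIEL/lang2vec).

```python
# Hypothetical sketch: weight source-language examples by typological
# distance to the target language when training a multilingual model.
# Distance values below are made up for illustration only.
import numpy as np

# Assumed typological distances between each source language and the
# target language (smaller = typologically closer).
distances = {"deu": 0.32, "nld": 0.28, "fin": 0.71, "tur": 0.85}
temperature = 0.5  # assumed knob: smaller values favour close languages more sharply

langs = list(distances)
d = np.array([distances[lang] for lang in langs])

# Convert distances to weights via a softmax over negative scaled distances,
# so typologically closer source languages contribute more.
weights = np.exp(-d / temperature)
weights /= weights.sum()

for lang, w in zip(langs, weights):
    print(f"{lang}: weight {w:.3f}")

# During training, each source-language example's loss would be scaled by
# its language's weight, e.g.:
#   loss = sum(weights[lang_of(x)] * per_example_loss(x) for x in batch)
```

With these assumed numbers, Dutch and German examples receive the largest weights while Turkish examples contribute least, which is the intended effect of exploiting typological similarity to the target language.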