Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1092
Cross-Lingual Dependency Parsing Using Code-Mixed TreeBank

Abstract: Treebank translation is a promising method for cross-lingual transfer of syntactic dependency knowledge. The basic idea is to map dependency arcs from a source treebank to its target translation according to word alignments. This method, however, can suffer from imperfect alignment between source and target words. To address this problem, we investigate syntactic transfer by code mixing, translating only confident words in a source treebank. Cross-lingual word embeddings are leveraged for transferring syntactic…
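The code-mixing idea in the abstract is concrete enough to sketch. Below is a minimal, hypothetical illustration of building a code-mixed treebank: each source token whose bilingual-lexicon translation exceeds a confidence threshold is replaced in place, so the dependency arcs of the source tree carry over unchanged. The `Token` structure, `lexicon` format, and the 0.9 threshold are illustrative assumptions, not the paper's exact procedure.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class Token:
    form: str      # surface word
    head: int      # index of syntactic head (0 = root)
    deprel: str    # dependency relation label

def code_mix_sentence(
    tokens: List[Token],
    lexicon: Dict[str, Tuple[str, float]],  # source word -> (translation, confidence)
    threshold: float = 0.9,                 # assumed confidence cutoff
) -> List[Token]:
    """Translate only confidently-translatable words in place.

    Because words are replaced one-for-one, the tree structure
    (head indices and relation labels) is preserved, yielding a
    code-mixed sentence that reuses the original gold arcs.
    """
    mixed = []
    for tok in tokens:
        entry = lexicon.get(tok.form.lower())
        if entry is not None and entry[1] >= threshold:
            mixed.append(replace(tok, form=entry[0]))
        else:
            mixed.append(tok)  # low confidence: keep the source word
    return mixed

# Toy English -> German lexicon with made-up confidences.
lexicon = {"the": ("die", 0.95), "dog": ("Hund", 0.97), "barks": ("bellt", 0.6)}
sent = [Token("The", 2, "det"), Token("dog", 3, "nsubj"), Token("barks", 0, "root")]
print([t.form for t in code_mix_sentence(sent, lexicon)])
# ['die', 'Hund', 'barks'] -- only the confident words were switched
```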

Cited by 39 publications (22 citation statements) | References 24 publications

Citation statements (ordered by relevance):
“…Sources for parallel text can be the OPUS project (Tiedemann, 2012), Bible corpora (Mayer and Cysouw, 2014; Christodoulopoulos and Steedman, 2015) or the recent JW300 corpus (Agić and Vulić, 2019). Instead of using parallel corpora, existing high-resource labeled datasets can also be machine-translated into the low-resource language (Khalil et al., 2019; Zhang et al., 2019a; Fei et al., 2020; Amjad et al., 2020). Cross-lingual projections have even been used with English as a target language for detecting linguistic phenomena like modal sense and telicity that are easier to identify in a different language (Zhou et al., 2015; Marasović et al., 2016; Friedrich and Gateva, 2017).…”
Section: Cross-lingual Annotation Projections (mentioning)
confidence: 99%
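The annotation-projection idea quoted above also lends itself to a short sketch. Below is a minimal, hypothetical illustration of the arc-projection step the abstract refers to: a source dependency arc (head, dependent) is copied to the target sentence only when both endpoints are aligned. The one-to-one `alignment` dict is an assumption; real alignments are noisier, which is exactly the failure mode code mixing sidesteps.

```python
from typing import Dict, List, Tuple

# A dependency arc: (head index, dependent index, relation label).
Arc = Tuple[int, int, str]

def project_arcs(
    source_arcs: List[Arc],
    alignment: Dict[int, int],  # source token index -> target token index
) -> List[Arc]:
    """Project source arcs onto the target sentence via word alignments.

    An arc survives only if both its head and dependent have an
    aligned target word; unaligned material is silently dropped,
    which is the imperfect-alignment problem the paper addresses.
    """
    projected = []
    for head, dep, rel in source_arcs:
        if head in alignment and dep in alignment:
            projected.append((alignment[head], alignment[dep], rel))
    return projected

# "The dog barks" -> "Der Hund bellt", with token 0 left unaligned.
arcs = [(2, 0, "det"), (2, 1, "nsubj")]
alignment = {1: 1, 2: 2}   # token 0 has no confident link
print(project_arcs(arcs, alignment))
# [(2, 1, 'nsubj')] -- the 'det' arc is lost to the missing alignment
```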
“…In contrast, our framework can augment data dynamically in each epoch to encourage the model to align representations across languages, and can generate code-switched data in multiple languages, so the model is trained once and tested directly on all languages. Zhang et al. [2019] proposed using code-mixing to perform syntactic transfer in dependency parsing. However, they need a high-accuracy translator to obtain data in multiple languages, which can be difficult to train for low-resource languages.…”
Section: Related Work (mentioning)
confidence: 99%
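The contrast drawn in this excerpt (a static translated treebank versus dynamic augmentation) can be made concrete. Below is a minimal, hypothetical sketch of per-epoch code-switching: each epoch re-samples which lexicon-covered words to swap, so the model sees a different mixed view of the same sentence on every pass. The swap rate and lexicon are illustrative assumptions.

```python
import random
from typing import Dict, List, Optional

def code_switch(words: List[str],
                lexicon: Dict[str, str],
                swap_rate: float = 0.3,
                rng: Optional[random.Random] = None) -> List[str]:
    """Randomly replace lexicon-covered words with their translations."""
    rng = rng or random.Random()
    return [lexicon[w] if w in lexicon and rng.random() < swap_rate else w
            for w in words]

lexicon = {"dog": "Hund", "barks": "bellt", "loudly": "laut"}
sentence = ["the", "dog", "barks", "loudly"]
rng = random.Random(0)
for epoch in range(3):
    # A fresh sample each epoch: dynamic augmentation, not a fixed translation.
    print(epoch, code_switch(sentence, lexicon, rng=rng))
```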
“…Kim and Hovy (2006a) exploit a semantic role labeller to extract opinion holders and topics. Choi et al. (2005) … Cross-Lingual Transfer Learning: Cross-lingual transfer learning has been extensively applied in NLP, including sentiment classification (Zhou et al., 2016), POS tagging (Wisniewski et al., 2014; Kim et al., 2017), named entity recognition (Zirikly and Hagiwara, 2015), semantic role labeling (Fei et al., 2020), and dependency parsing (McDonald et al., 2011; Tiedemann et al., 2014; Guo et al., 2016; Zhang et al., 2019b). Unsupervised cross-lingual transfer has received great interest (Duong et al., 2015; Xu et al., 2018), which is our major focus.…”
Section: Related Work (mentioning)
confidence: 99%
“…Unsupervised cross-lingual transfer (Xu et al., 2018) is one promising way to address the low-resource problem for ORL. Under the neural setting, there are two representative categories of methods: model transfer (McDonald et al., 2013; Swayamdipta et al., 2016; Daza and Frank, 2019) and corpus translation (Zhang et al., 2019b). Model transfer trains a model on a resource-rich language using only language-independent features such as multilingual BERT (Devlin et al., 2018; Pires et al., 2019) and then applies it to the target language.…”
Section: Introduction (mentioning)
confidence: 99%
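The model-transfer route described in this last excerpt is straightforward to sketch with off-the-shelf tools. Below is a minimal, hypothetical illustration using Hugging Face transformers: a token-classification head on multilingual BERT is applied unchanged to a target-language sentence. The label set and the untrained head are placeholders; a real setup would first fine-tune the model on source-language (e.g. English) annotations before the zero-shot step shown here.

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Multilingual BERT gives (roughly) language-independent representations,
# so a head trained on English can be applied to other languages as-is.
model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=3  # placeholder label set, e.g. coarse relations
)
model.eval()

# Zero-shot application to a target-language sentence; in a real pipeline
# the classification head would already be fine-tuned on source data.
target_sentence = "Der Hund bellt laut"  # German target, no German training
inputs = tokenizer(target_sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits       # shape: (1, seq_len, num_labels)
predictions = logits.argmax(dim=-1).squeeze(0).tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print(list(zip(tokens, predictions)))
```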