Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-long.61

AdvPicker: Effectively Leveraging Unlabeled Data via Adversarial Discriminator for Cross-Lingual NER

Abstract: Neural methods have been shown to achieve high performance in Named Entity Recognition (NER), but rely on costly high-quality labeled data for training, which is not always available across languages. While previous works have shown that unlabeled data in a target language can be used to improve cross-lingual model performance, we propose a novel adversarial approach (AdvPicker) to better leverage such data and further improve results. We design an adversarial learning framework in which an encoder learns entit…

Cited by 19 publications (18 citation statements) · References 27 publications
“…Most recent studies focus on the bilingual transfer case and can be grouped into three categories: (1) Data-based approaches utilize machine translation and label projection (Jain et al. 2019; Yang et al. 2022) to create pseudo-training data for the target language. (2) Feature-based approaches rely on feature alignment to diminish the language shift (Chen et al. 2021; Ge et al. 2023). (3) Distillation-based approaches (Wu et al. 2020b; Liang et al. 2021; Ma et al. 2022) enable a student network to gain task knowledge from soft labels predicted by a teacher network on the target language.…”
Section: Related Work
confidence: 99%
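As a concrete illustration of the distillation-based category (3) above, here is a minimal PyTorch sketch of teacher-student training on soft labels over unlabeled target-language text. The function name, tensor shapes, and temperature value are illustrative assumptions, not the cited papers' exact setup.

```python
# Minimal sketch of soft-label distillation for cross-lingual NER (assumed setup).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between student and teacher token-label distributions.

    Both tensors are assumed to have shape (batch, seq_len, num_labels); the
    teacher logits come from running the source-trained model on unlabeled
    target-language sentences.
    """
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * t * t

# Usage sketch: the teacher is frozen; only the student is updated.
# with torch.no_grad():
#     teacher_logits = teacher(target_batch)
# loss = distillation_loss(student(target_batch), teacher_logits)
```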
“…Recent methods, driven by the weighted combining rule, learn language-specific classifiers and obtain a weighted ensemble prediction for target samples. For instance, MulTS (Chen et al. 2021) trains one specific network for each source language separately to derive the combined final prediction, which incurs a substantial computational burden. MAN (Chen et al. 2019) and G-MOE (Jin et al. 2022) instead force multiple source languages to share a single encoder, avoiding this heavy parameter and computation cost.…”
Section: Related Work
confidence: 99%
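The shared-encoder, weighted-ensemble design mentioned here can be sketched as follows. This is a simplified, hypothetical module: the gating layer, dimensions, and class name are assumptions for illustration, not MAN's or G-MOE's actual architectures.

```python
# Sketch of a multi-source NER model: one shared encoder, per-language
# classifiers, and a learned weighted combination (assumed design).
import torch
import torch.nn as nn

class MultiSourceNER(nn.Module):
    def __init__(self, encoder, hidden_dim, num_labels, num_sources):
        super().__init__()
        self.encoder = encoder  # single encoder shared by all source languages
        self.classifiers = nn.ModuleList(
            [nn.Linear(hidden_dim, num_labels) for _ in range(num_sources)]
        )
        # Light gate producing per-sample weights over the source classifiers.
        self.gate = nn.Linear(hidden_dim, num_sources)

    def forward(self, inputs):
        h = self.encoder(inputs)  # assumed shape: (batch, seq_len, hidden_dim)
        weights = torch.softmax(self.gate(h.mean(dim=1)), dim=-1)  # (batch, sources)
        logits = torch.stack([clf(h) for clf in self.classifiers], dim=1)
        # Weighted ensemble of language-specific predictions.
        return (weights[:, :, None, None] * logits).sum(dim=1)  # (batch, seq, labels)
```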
“…Extensive works [29][30][31][32] have shown that a dual-branch network can not only reduce computational cost but also improve task performance. Chen et al. [33] proposed a dual-branch adversarial learning framework for named entity recognition, in which one branch learned entity domain knowledge and the other captured better features; the two worked together to improve the results. Chen et al. [34] combined a Bayesian network based on medical knowledge-graph augmentation with a multi-branch entity-aware convolutional neural network.…”
Section: Related Work
confidence: 99%
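The adversarial side of such a framework is commonly implemented with a gradient-reversal layer between the shared encoder and a language discriminator. Below is a minimal sketch under that assumption; the module sizes and names are illustrative, not the paper's exact components.

```python
# Sketch of adversarial feature alignment via gradient reversal (assumed setup).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients so the encoder learns to fool the
        # language discriminator while the NER head learns entity knowledge.
        return -ctx.lambd * grad_output, None

class LanguageDiscriminator(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, features, lambd=1.0):
        reversed_feats = GradReverse.apply(features, lambd)
        return self.net(reversed_feats)  # logit: source vs. target language
```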
“…In other words, the adversarial techniques encourage the model to align the representations of English documents and their translations in target languages. Furthermore, an adversarial learning framework can leverage unlabeled data to improve the results [135], where a discriminator picks less language-dependent target-language data according to its similarity to the source language. • Multilingual Pretrained Language Model.…”
Section: Future Direction
confidence: 99%
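The selection step described in this excerpt can be sketched as filtering unlabeled target sentences by how uncertain the language discriminator is about them: scores near 0.5 suggest language-independent representations. The 0.5-centered margin below is an assumed criterion for illustration, not necessarily AdvPicker's exact rule.

```python
# Sketch of discriminator-based selection of language-independent data (assumed rule).
import torch

def pick_language_independent(sentences, features, discriminator, margin=0.1):
    """Keep target sentences the discriminator cannot confidently assign
    to either language (probability close to 0.5)."""
    with torch.no_grad():
        probs = torch.sigmoid(discriminator(features)).squeeze(-1)  # (batch,)
    # Near-0.5 scores mean the representation looks similar across languages,
    # so those sentences are selected for subsequent training/distillation.
    mask = (probs - 0.5).abs() < margin
    return [s for s, keep in zip(sentences, mask.tolist()) if keep]
```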