Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
DOI: 10.18653/v1/2020.emnlp-main.95

Local Additivity Based Data Augmentation for Semi-supervised NER

Abstract: Named Entity Recognition (NER) is one of the first stages in deep language understanding, yet current NER models rely heavily on human-annotated data. In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by interpolating sequences close to each other. Our approach has two variations: Intra-LADA and Inter-LADA, where Intra-LADA performs interpolations among tokens within one senten…
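The abstract describes creating virtual samples by interpolating sequences. A minimal sketch of that idea follows, assuming plain mixup-style linear interpolation of per-token vectors and one-hot labels. The function names `mixup_sequences` and `permute_tokens`, the list-of-lists representation, and the use of a random permutation to pair tokens within one sequence are all illustrative assumptions; this is not the authors' implementation.

```python
import random


def mixup_sequences(hidden_a, labels_a, hidden_b, labels_b, lam):
    """Linearly interpolate two equal-length token sequences.

    hidden_*: per-token feature vectors (lists of floats).
    labels_*: per-token one-hot (or soft) label vectors.
    lam:      mixing weight in [0, 1]; 1.0 returns sequence a unchanged.
    """
    if len(hidden_a) != len(hidden_b) or len(labels_a) != len(labels_b):
        raise ValueError("both sequences must have the same length")

    def blend(u, v):
        # Element-wise convex combination of two vectors.
        return [lam * x + (1.0 - lam) * y for x, y in zip(u, v)]

    mixed_hidden = [blend(u, v) for u, v in zip(hidden_a, hidden_b)]
    mixed_labels = [blend(u, v) for u, v in zip(labels_a, labels_b)]
    return mixed_hidden, mixed_labels


def permute_tokens(hidden, labels, rng):
    """Reorder the tokens of one sequence (hypothetical pairing strategy).

    Mixing a sequence with its own permutation interpolates among tokens
    of the same sequence; the features and labels are permuted together.
    """
    order = list(range(len(hidden)))
    rng.shuffle(order)
    return [hidden[i] for i in order], [labels[i] for i in order]


if __name__ == "__main__":
    rng = random.Random(0)
    hidden = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
    labels = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

    # Mixup commonly draws the weight from a Beta(alpha, alpha) distribution;
    # alpha = 0.5 is an arbitrary choice for this sketch.
    lam = rng.betavariate(0.5, 0.5)

    shuffled_hidden, shuffled_labels = permute_tokens(hidden, labels, rng)
    virtual_hidden, virtual_labels = mixup_sequences(
        hidden, labels, shuffled_hidden, shuffled_labels, lam
    )
    print(virtual_hidden)
    print(virtual_labels)
```

Because the labels are interpolated with the same weight as the features, each virtual token carries a soft label, so a model trained on such samples would need a loss that accepts label distributions rather than hard class indices.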

Cited by 31 publications (28 citation statements)
References 28 publications
“…They test its performance on text classification datasets. Chen et al [79] introduce Mixup into NER, proposing both Intra-LADA and Inter-LADA.…”
Section: Mixup (mentioning)
confidence: 99%
“…Soft data augmentation. In addition to explicit generation of concrete examples, soft augmentation, which directly represents generated examples in a continuous vector space, has been proposed: Gao et al (2019) propose to perform soft word substitution for machine translation; recent work has adapted the mix-up method (Zhang et al, 2018), which augments the original dataset by linearly interpolating the vector representations of text and labels, to text classification (Guo et al, 2019; Sun et al, 2020), named entity recognition (Chen et al, 2020) and compositional generalization (Guo et al, 2020).…”
Section: Related Work (mentioning)
confidence: 99%
“…To alleviate the data-sparsity issue, various advanced techniques have emerged, such as transfer learning (Pan and Yang, 2009), semi-supervised learning (Mishra and Diesner, 2016; He and Sun, 2017; Wang et al, 2020b; Bhattacharjee et al, 2020; Chen et al, 2020b), domain adaptation (Li et al, , 2019b), and data augmentation (Dai and Adel, 2020; Chen et al, 2020a; Ding et al, 2020). Considering the multilingual setting, data augmentation may be infeasible and could bring in external knowledge errors.…”
Section: Multilingual Sequence Labeling (mentioning)
confidence: 99%