Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
DOI: 10.18653/v1/2020.emnlp-main.691
SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup

Abstract: Active learning is an important technique for low-resource sequence labeling tasks. However, current active sequence labeling methods use the queried samples alone in each iteration, which is an inefficient way of leveraging human annotations. We propose a simple but effective data augmentation method to improve label efficiency of active sequence labeling. Our method, SeqMix, simply augments the queried samples by generating extra labeled sequences in each iteration. The key difficulty is to generate plausibl…
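
To make the idea in the abstract concrete, the following is a minimal sketch (not the authors' released code) of sequence mixup: the token embeddings and one-hot label vectors of two queried sequences are interpolated with a Beta-sampled coefficient to produce an extra labeled sequence. The function and parameter names (mix_sequences, alpha) are illustrative assumptions.

# Hedged sketch of sequence mixup, assuming equal-length sequences represented
# as (seq_len, emb_dim) embedding matrices and (seq_len, n_tags) one-hot labels.
import numpy as np

def mix_sequences(emb_a, emb_b, labels_a, labels_b, alpha=8.0):
    """Interpolate token embeddings and label vectors of two labeled sequences."""
    lam = np.random.beta(alpha, alpha)                    # mixing coefficient in (0, 1)
    mixed_emb = lam * emb_a + (1 - lam) * emb_b           # synthetic token representations
    mixed_labels = lam * labels_a + (1 - lam) * labels_b  # soft tag distributions
    return mixed_emb, mixed_labels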

Cited by 46 publications (20 citation statements)
References 39 publications

“…Despite their success in text classification and sequence-to-sequence tasks, they are seldom used for sequence tagging tasks. Zhang, Yu, and Zhang (2020) use the mixup technique (Zhang et al. 2018) within active learning: they augment the queries at each iteration, later classify whether each augmented query is plausible (since the resulting queries might be noisy), and report improvements for the Named Entity Recognition (NER) and event detection tasks. However, building a robust discriminator for more challenging tasks such as dependency parsing (DP) and semantic role labeling (SRL) is a challenge in its own right.…”
Section: Related Work
confidence: 99%
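
The plausibility check described in this statement can be approximated by scoring each candidate mixed sentence with a language model and keeping it only when its perplexity falls in an accepted range. The snippet below is a hedged sketch of such a filter built on GPT-2 perplexity; the threshold values and function names are assumptions, not the paper's exact discriminator.

# Illustrative plausibility filter: keep a generated sentence only if its
# GPT-2 perplexity lies inside an (assumed) accepted range.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token-level cross-entropy
    return math.exp(loss.item())

def is_plausible(sentence: str, low: float = 1.0, high: float = 500.0) -> bool:
    return low <= perplexity(sentence) <= high
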
“…In a "phrase-level attack", the authors first choose two subtrees and then maximize the error rate on the target subtree by modifying the tokens in the source subtree. Even though adversarial example generation techniques (Zheng et al. 2020; Han et al. 2020) could in theory be used to augment data, their requirements, such as a separate seq2seq generator, a BERT-based scorer (Zhang et al. 2020), reference parsers of sufficient quality, external POS taggers, and high-quality pretrained BERT (Devlin et al. 2019) models, make them challenging to apply to low-resource languages. Besides, most of the aforementioned adversarial attacks are optimized to trigger an undesired change in the output with minimal modifications, whereas data augmentation is only concerned with increasing the generalization capacity of the model.…”
Section: Related Work
confidence: 99%
“…[27] constructs a new token sequence by randomly selecting a token from one of two different sequences at each index. [28] suggests applying mixup in feature space, after an intermediate layer of a pretrained LM. New input data is then generated by reversing the synthetic feature to find the most similar token in the vocabulary.…”
Section: Data Interpolation for Regularization
confidence: 99%
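
The two interpolation styles described in the statement above can be sketched as follows: [27]-style token-level mixing picks a token from either parent sequence at each position, while [28]-style feature-space mixup interpolates two token vectors and then looks up the nearest vocabulary embedding to map the synthetic vector back to a real token. All helper names and the cosine-similarity choice are assumptions for illustration.

# Illustrative only; not taken from either cited work.
import random
import numpy as np

def token_level_mix(tokens_a, tokens_b):
    """Build a new sequence by choosing, at each index, a token from either parent."""
    return [random.choice(pair) for pair in zip(tokens_a, tokens_b)]

def mix_and_reverse(vec_a, vec_b, vocab_emb, lam=0.5):
    """Interpolate two token vectors, then return the index of the vocabulary
    embedding closest to the synthetic vector (cosine similarity)."""
    mixed = lam * vec_a + (1 - lam) * vec_b
    sims = vocab_emb @ mixed / (
        np.linalg.norm(vocab_emb, axis=1) * np.linalg.norm(mixed) + 1e-9)
    return int(np.argmax(sims))
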
“…While these two works involve interpolating text inputs, our method differs significantly in that we do not directly generate augmented training samples; instead, we use mixup as a regularizing layer during the training process. Our method also does not require reversing word embeddings or the GPT-2-based discriminative filtering introduced in [28].…”
Section: Data Interpolation for Regularization
confidence: 99%
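
A minimal sketch of the "mixup as a regularizing layer" idea described above, assuming a standard PyTorch training loop: hidden states of a batch are interpolated with a shuffled copy of themselves inside the forward pass, and the loss is later combined with the same coefficient. The module and argument names are assumed for illustration and are not taken from the cited work.

# Hedged sketch: mixup applied to intermediate representations during training.
import torch
import torch.nn as nn

class MixupLayer(nn.Module):
    def __init__(self, alpha: float = 0.2):
        super().__init__()
        self.alpha = alpha

    def forward(self, hidden: torch.Tensor, targets: torch.Tensor):
        if not self.training:
            return hidden, targets, targets, 1.0
        lam = torch.distributions.Beta(self.alpha, self.alpha).sample().item()
        perm = torch.randperm(hidden.size(0), device=hidden.device)
        mixed = lam * hidden + (1 - lam) * hidden[perm]
        # Downstream loss: lam * CE(logits, targets) + (1 - lam) * CE(logits, targets[perm]).
        return mixed, targets, targets[perm], lam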