Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop (NAACL 2019)
DOI: 10.18653/v1/n19-3005
Handling Noisy Labels for Robustly Learning from Self-Training Data for Low-Resource Sequence Labeling

Abstract: In this paper, we address the problem of effectively self-training neural networks in a low-resource setting. Self-training is frequently used to automatically increase the amount of training data. However, in a low-resource scenario, it is less effective due to unreliable annotations created using self-labeling of unlabeled data. We propose to combine self-training with noise handling on the self-labeled data. Directly estimating noise on the combined clean training set and self-labeled data can lead to corruption…
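As a concrete illustration of the pipeline the abstract describes, below is a minimal self-training sketch in Python. It is a hypothetical example, not the paper's implementation: the scikit-learn classifier, the confidence threshold, and all names are assumptions. The key design point, following the abstract, is that self-labeled examples are kept separate from the clean training set so that noise handling can later target the self-labeled portion alone rather than the combined data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_clean, y_clean, X_unlabeled, rounds=3, threshold=0.9):
    """Iteratively self-label unlabeled data, keeping only confident predictions.

    Self-labeled examples are tracked separately from the clean set so that
    noise handling can later be applied to the self-labeled pool alone,
    instead of to the combined training set.
    """
    X_self = np.empty((0, X_clean.shape[1]))
    y_self = np.empty((0,), dtype=int)
    for _ in range(rounds):
        # Train on clean data plus whatever has been self-labeled so far.
        X_train = np.vstack([X_clean, X_self])
        y_train = np.concatenate([y_clean, y_self])
        clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        # Self-label: keep only predictions above the confidence threshold.
        probs = clf.predict_proba(X_unlabeled)
        confident = probs.max(axis=1) >= threshold
        if not confident.any():
            break
        X_self = np.vstack([X_self, X_unlabeled[confident]])
        y_self = np.concatenate([y_self, probs[confident].argmax(axis=1)])
        X_unlabeled = X_unlabeled[~confident]
    # Return the model and the (potentially noisy) self-labeled pool.
    return clf, (X_self, y_self)
```

The returned `(X_self, y_self)` pool is exactly the data the abstract says should be treated as noisy, rather than merged indistinguishably with the clean set.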

Cited by 15 publications (12 citation statements: 0 supporting, 12 mentioning, 0 contrasting)
References 19 publications
“…To avoid this, it can be combined with label noise handling techniques. This pipeline has been shown to be effective for several NLP tasks (Lange et al., 2019; Paul et al., 2019; Wang et al., 2019; Chen et al., 2019; Mayhew et al., 2019), however mostly for RNN-based approaches. As we have seen in Section 4, these have a lower baseline performance, so we are interested in whether distant supervision is still useful for the better-performing transformer models.…”
Section: Distant Supervision
confidence: 99%
“…The noise in the labels can also be modeled. A common model is a confusion matrix estimating the relationship between clean and noisy labels (Fang and Cohn, 2016; Luo et al., 2017; Hedderich and Klakow, 2018; Paul et al., 2019; Lange et al., 2019a,c; Wang et al., 2019; Hedderich et al., 2021b). The classifier is no longer trained directly on the noisily-labeled data.…”
Section: Learning With Noisy Labels
confidence: 99%
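The confusion-matrix noise model described in the statement above can be sketched as a noisy-channel layer on top of a base classifier. This is a generic illustration, assuming PyTorch; the `NoiseChannel` name, the initialisation, and the API are assumptions, not code from the cited papers. The base model predicts clean-label probabilities, and a row-stochastic matrix T maps them to noisy-label probabilities, i.e. p(noisy=j | x) = sum_i p(clean=i | x) * T[i, j].

```python
import torch
import torch.nn as nn

class NoiseChannel(nn.Module):
    """Confusion-matrix noise model on top of a base classifier (sketch).

    The base model predicts a distribution over *clean* labels; the channel
    maps it to a distribution over *noisy* labels via a row-stochastic
    matrix T, so p(noisy=j | x) = sum_i p(clean=i | x) * T[i, j].
    """
    def __init__(self, base_model: nn.Module, num_labels: int):
        super().__init__()
        self.base_model = base_model
        # Initialise T close to the identity: most labels assumed correct.
        init = torch.log(torch.eye(num_labels) * 10.0 + 1.0)
        self.transition = nn.Parameter(init)

    def forward(self, x):
        clean_probs = torch.softmax(self.base_model(x), dim=-1)
        T = torch.softmax(self.transition, dim=-1)  # each row sums to 1
        noisy_probs = clean_probs @ T               # channel output
        return clean_probs, noisy_probs
```

During training, the loss for noisily-labeled (e.g. self-labeled) examples would be computed on `noisy_probs`, while clean examples and test-time predictions use `clean_probs`; this is how the classifier avoids being fit directly to the noisy labels, as the quoted statement notes.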
“…the state-of-the-art one, deep learning-based (Paul et al., 2019; Yang et al., 2018), respectively.…”
Section: Guest Editorial
confidence: 99%