2021
DOI: 10.1609/aaai.v35i14.17529

Label Confusion Learning to Enhance Text Classification Models

Abstract: Representing the true label as a one-hot vector is the common practice in training text classification models. However, the one-hot representation may not adequately reflect the relation between an instance and its labels, as labels are often not completely independent and an instance may relate to multiple labels in practice. These inadequate one-hot representations tend to train the model to be over-confident, which may result in arbitrary predictions and model overfitting, especially on confused datasets (datasets w…
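To make the abstract's contrast concrete, here is a minimal sketch of a one-hot target versus a soft label distribution. The class names and probability values are purely illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Hypothetical 4-class setup: ["sports", "economy", "politics", "tech"].
num_classes = 4
true_class = 2  # "politics"

# The standard one-hot target puts all probability mass on one label...
one_hot = np.eye(num_classes)[true_class]

# ...whereas a soft target can encode that a "politics" article also
# relates to "economy" (values are illustrative only).
soft_target = np.array([0.02, 0.18, 0.75, 0.05])

assert np.isclose(soft_target.sum(), 1.0)  # still a valid distribution
```

The soft target keeps the true label dominant while admitting that labels are not completely independent, which is the gap the one-hot encoding cannot express.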

Cited by 38 publications (12 citation statements)
References 23 publications
“…In addition, we tested the accuracy of the model under different dropout rates. Figures 8, 9, 10, and 11 show that the optimal accuracy occurs when using Adam with a dropout rate of 0.15. We add the dropout between the BERT layer and the BiLSTM.…”
Section: Optimizer and Dropout Tuning
Mentioning confidence: 95%
“…Using a hard one-hot label representation may cause model over-confidence, leading to arbitrary predictions while also making it difficult to resolve label confusion [5][6][7][8][9][10]. Recently, researchers have focused on soft label representations to solve these problems, such as label smoothing, label embedding, and label distribution learning.…”
Section: Related Work
Mentioning confidence: 99%
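Of the soft-label techniques this quote lists, label smoothing is the simplest to illustrate: the one-hot target is blended with a uniform distribution over the classes. A minimal sketch, with an assumed smoothing factor of 0.1:

```python
import numpy as np

def label_smoothing(one_hot: np.ndarray, epsilon: float = 0.1) -> np.ndarray:
    """Blend a one-hot target with the uniform distribution over K classes."""
    k = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / k

one_hot = np.eye(4)[2]
smoothed = label_smoothing(one_hot, epsilon=0.1)
# With 4 classes and epsilon = 0.1, the true class keeps 0.925 of the
# mass and each other class receives 0.025.
```

Unlike the instance-aware methods discussed below, label smoothing redistributes the same fixed amount of mass for every training example, so it cannot capture which specific labels an instance is confusable with.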
“…However, such soft labels, generated merely by adding noise, cannot reflect the correlations between each subject-object instance and multiple predicate categories. The label confusion (LC) method [60] was proposed for the text classification task. It trains a label encoder to learn label representations and calculates the similarity between instances and labels to generate a better label distribution for training.…”
Section: Introduction
Mentioning confidence: 99%
“…It trains a label encoder to learn label representations and calculates the similarity between instances and labels to generate a better label distribution for training. Among these solutions, [60] is the closest to our approach, but we do not need to design additional network encoders to capture correlations between instances and multiple predicates. In addition, some semi-supervised SGG methods also assign soft labels to unannotated samples [61], [62].…”
Section: Introduction
Mentioning confidence: 99%
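The label confusion idea the citing papers describe, scoring instance-label similarity and mixing the result with the one-hot target, can be sketched as follows. This is a simplified reconstruction of the general mechanism, not the paper's exact architecture; the embeddings, the mixing weight `alpha`, and the function names are assumptions for illustration:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def simulated_label_distribution(instance_emb: np.ndarray,
                                 label_embs: np.ndarray,
                                 one_hot: np.ndarray,
                                 alpha: float = 4.0) -> np.ndarray:
    """LC-style soft target: combine a similarity-based confusion
    distribution with the scaled one-hot label, then renormalise."""
    sims = label_embs @ instance_emb   # instance-label similarity scores
    confusion = softmax(sims)          # label confusion distribution
    return softmax(alpha * one_hot + confusion)

# Toy usage with random embeddings (purely illustrative).
rng = np.random.default_rng(0)
dim, num_labels = 8, 4
instance = rng.normal(size=dim)
labels = rng.normal(size=(num_labels, dim))
target = simulated_label_distribution(instance, labels, np.eye(num_labels)[1])
```

The scaled one-hot term keeps the true label dominant in the resulting distribution, while the confusion term shifts some mass toward labels whose learned representations are similar to the instance, which is what a plain one-hot target cannot do.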