2021
DOI: 10.1609/aaai.v35i14.17529

Label Confusion Learning to Enhance Text Classification Models

Abstract: Representing the true label as a one-hot vector is the common practice in training text classification models. However, the one-hot representation may not adequately reflect the relation between an instance and its labels, as labels are often not completely independent and an instance may relate to multiple labels in practice. These inadequate one-hot representations tend to train the model to be over-confident, which may result in arbitrary predictions and model overfitting, especially on confused datasets (datasets w…
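To make the abstract's contrast concrete, here is a minimal sketch of a one-hot target versus a soft label distribution. The class names and probability values are purely illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Hypothetical 4-class setup: ["sports", "economy", "politics", "tech"].
num_classes = 4
true_class = 2  # "politics"

# The standard one-hot target puts all probability mass on one label...
one_hot = np.eye(num_classes)[true_class]

# ...whereas a soft target can encode that a "politics" article also
# relates to "economy" (values are illustrative only).
soft_target = np.array([0.02, 0.18, 0.75, 0.05])

assert np.isclose(soft_target.sum(), 1.0)  # still a valid distribution
```

The soft target keeps the true label dominant while admitting that labels are not completely independent, which is the gap the one-hot encoding cannot express.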

Cited by 38 publications (12 citation statements)
References 23 publications
“…In addition, we tested the accuracy of the model under different dropout rates. Figures 8, 9, 10, and 11 show that the optimal accuracy occurs when using Adam with a dropout rate of 0.15. We add the dropout between the BERT layer and the BiLSTM.…”
Section: Optimizer and Dropout Tuning
Mentioning confidence: 95%
“…Using a hard one-hot label representation may cause model over-confidence, leading to arbitrary predictions while also making it difficult to resolve label confusion [5][6][7][8][9][10]. Recently, researchers have focused on soft label representations to solve these problems, such as label smoothing, label embedding, and label distribution learning.…”
Section: Related Work
Mentioning confidence: 99%
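Of the soft-label techniques this quote lists, label smoothing is the simplest to illustrate: the one-hot target is blended with a uniform distribution over the classes. A minimal sketch, with an assumed smoothing factor of 0.1:

```python
import numpy as np

def label_smoothing(one_hot: np.ndarray, epsilon: float = 0.1) -> np.ndarray:
    """Blend a one-hot target with the uniform distribution over K classes."""
    k = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / k

one_hot = np.eye(4)[2]
smoothed = label_smoothing(one_hot, epsilon=0.1)
# With 4 classes and epsilon = 0.1, the true class keeps 0.925 of the
# mass and each other class receives 0.025.
```

Unlike the instance-aware methods discussed below, label smoothing redistributes the same fixed amount of mass for every training example, so it cannot capture which specific labels an instance is confusable with.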
“…However, such soft labels, generated merely by adding noise, cannot reflect the correlations between each subject-object instance and multiple predicate categories. The label confusion (LC) method [60] was proposed for the text classification task. It trains a label encoder to learn label representations and calculates the similarity between instances and labels to generate a better label distribution for training.…”
Section: Introduction
Mentioning confidence: 99%
“…It trains a label encoder to learn label representations and calculates the similarity between instances and labels to generate a better label distribution for training. Among these solutions, [60] is the closest to our approach, but we do not need to design additional network encoders to capture correlations between instances and multiple predicates. In addition, some semi-supervised SGG methods also assign soft labels to unannotated samples [61], [62].…”
Section: Introduction
Mentioning confidence: 99%
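The label confusion idea the citing papers describe, scoring instance-label similarity and mixing the result with the one-hot target, can be sketched as follows. This is a simplified reconstruction of the general mechanism, not the paper's exact architecture; the embeddings, the mixing weight `alpha`, and the function names are assumptions for illustration:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def simulated_label_distribution(instance_emb: np.ndarray,
                                 label_embs: np.ndarray,
                                 one_hot: np.ndarray,
                                 alpha: float = 4.0) -> np.ndarray:
    """LC-style soft target: combine a similarity-based confusion
    distribution with the scaled one-hot label, then renormalise."""
    sims = label_embs @ instance_emb   # instance-label similarity scores
    confusion = softmax(sims)          # label confusion distribution
    return softmax(alpha * one_hot + confusion)

# Toy usage with random embeddings (purely illustrative).
rng = np.random.default_rng(0)
dim, num_labels = 8, 4
instance = rng.normal(size=dim)
labels = rng.normal(size=(num_labels, dim))
target = simulated_label_distribution(instance, labels, np.eye(num_labels)[1])
```

The scaled one-hot term keeps the true label dominant in the resulting distribution, while the confusion term shifts some mass toward labels whose learned representations are similar to the instance, which is what a plain one-hot target cannot do.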