Text Classification Using Label Names Only: A Language Model Self-Training Approach

Meng, Yu; Zhang, Yunyi; Huang, Jiaxin; Xiong, Chenyan; Ji, Heng; Zhang, Chao; Han, Jiawei

doi:10.18653/v1/2020.emnlp-main.724

Cited by 158 publications

(180 citation statements)

References 37 publications

Supporting

Mentioning

179

Contrasting


“…We also compare with LOTClass (Meng et al, 2020b), which works under the extremely weak supervision setting. In their experiments, it mostly relies on class names but has used a few keywords…”

Section: Compared Methods (mentioning)

confidence: 99%

“…Compared with previous works (Meng et al, 2020b), our X-Class has a significantly more mild requirement on human-provided class names in terms of quantity and quality. We have conducted an experiment in Table 4 for X-Class on 20News and NYT-Small by deleting all but one occurrence of a class name from the input corpus.…”

Section: Requirements On Class Names (mentioning)

confidence: 99%

“…Interestingly, the performance of X-Class only drops less than 1%, still outperforming all compared methods. In contrast, the most recent work, LOTClass (Meng et al, 2020b), requires a wide variety of contexts of class names from the input corpus to ensure the quality of generated class vocabulary in its very first step.…”

Section: Requirements On Class Names (mentioning)

confidence: 99%

“…Weakly supervised text classification. Weakly supervised text classification has attracted much attention from researchers (Tao et al, 2018; Meng et al, 2020a; Meng et al, 2020b). The general pipeline is to generate a set of document-class pairs to train a supervised model above them.…”

Section: Related Work (mentioning)

confidence: 99%

“…A recent work (Meng et al, 2020b) also studied the same topic: extremely weak supervision on text classification. It follows a similar idea of (Meng et al, 2020a) and further utilizes BERT to query replacements for class names to find keywords for classes, identifying potential classes for documents via string matching.…”

Section: Related Work (mentioning)

confidence: 99%
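
The string-matching step mentioned in the statement above can be illustrated with a minimal sketch. It assumes the per-class keyword sets are already available; here they are hand-written, hypothetical examples, whereas the cited work is described as obtaining them by querying BERT for replacements of class names. The function name `pseudo_label` and the scoring rule (count of matched tokens) are also assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import Counter


def pseudo_label(doc, class_keywords):
    """Return the class whose keywords match the most tokens in doc.

    class_keywords maps a class name to a set of lowercase keywords
    (hypothetical input; how the keywords are found is out of scope here).
    Returns None when no keyword of any class occurs in the document.
    """
    # Lowercase and split into alphabetic tokens.
    counts = Counter(re.findall(r"[a-z]+", doc.lower()))
    # Score each class by how many document tokens hit its keyword set.
    scores = {
        label: sum(counts[word] for word in keywords)
        for label, keywords in class_keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical keyword sets for two classes.
keywords = {
    "sports": {"team", "match", "coach"},
    "politics": {"election", "senate", "vote"},
}
label = pseudo_label("The team won the match after the coach changed tactics.", keywords)
```

Documents that receive a label this way form the document-class pairs on which a supervised classifier could then be trained; documents with no match stay unlabeled.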


X-Class: Text Classification with Extremely Weak Supervision

Wang, Mekala, Shang (2021)

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua


In this paper, we explore text classification with extremely weak supervision, i.e., only relying on the surface text of class names. This is a more challenging setting than the seed-driven weak supervision, which allows a few seed words per class. We opt to attack this problem from a representation learning perspective: ideal document representations should lead to nearly the same results between clustering and the desired classification. In particular, one can classify the same corpus differently (e.g., based on topics and locations), so document representations should be adaptive to the given class names. We propose a novel framework, X-Class, to realize the adaptive representations. Specifically, we first estimate class representations by incrementally adding the most similar word to each class until inconsistency arises. Following a tailored mixture of class attention mechanisms, we obtain the document representation via a weighted average of contextualized word representations. With the prior of each document assigned to its nearest class, we then cluster and align the documents to classes. Finally, we pick the most confident documents from each cluster to train a text classifier. Extensive experiments demonstrate that X-Class can rival and even outperform seed-driven weakly supervised methods on 7 benchmark datasets.
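
Two of the steps in the abstract, the weighted average of word representations and the nearest-class assignment, can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the X-Class implementation: the word and class vectors are small toy arrays rather than contextualized representations, a single softmax over each word's best class similarity stands in for the paper's mixture of class attention mechanisms, and the function names are hypothetical.

```python
import numpy as np


def cosine(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T


def document_representation(word_vecs, class_vecs):
    """Weighted average of word vectors (assumed weighting scheme).

    Each word is weighted by a softmax over its highest similarity to any
    class, so class-relevant words contribute more to the document vector.
    """
    relevance = cosine(word_vecs, class_vecs).max(axis=1)  # shape (n_words,)
    weights = np.exp(relevance) / np.exp(relevance).sum()
    return weights @ word_vecs


def assign_class(doc_vec, class_vecs):
    """Return (index of nearest class, cosine similarity to that class)."""
    sims = cosine(doc_vec[None, :], class_vecs)[0]
    return int(sims.argmax()), float(sims.max())


# Toy data: two classes and one three-word document in a 2-D space.
class_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
word_vecs = np.array([[0.9, 0.1], [0.8, 0.3], [0.5, 0.5]])

doc_vec = document_representation(word_vecs, class_vecs)
class_index, similarity = assign_class(doc_vec, class_vecs)
```

The similarity returned by `assign_class` could serve as a rough confidence score when selecting documents for classifier training, although the abstract describes a clustering and alignment step before that selection which this sketch omits.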

