Zero-shot Text Classification via Reinforced Self-training

Ye, Zhiquan; Geng, Yuxia; Chen, Jiaoyan; Chen, Jingmin; Xu, Xiaoyong; Zheng, Suhang; Wang, Feng; Zhang, Jun; Chen, Huajun

doi:10.18653/v1/2020.acl-main.272

Cited by 57 publications

(33 citation statements)

References 31 publications

“…Other approaches for few-shot learning in NLP include exploiting examples from related tasks (Yu et al, 2018; Gu et al, 2018; Dou et al, 2019; Qian and Yu, 2019; Yin et al, 2019) and using data augmentation (Xie et al, 2020); the latter commonly relies on back-translation (Sennrich et al, 2016), requiring large amounts of parallel data. Approaches using textual class descriptors typically assume that abundant examples are available for a subset of classes (e.g., Romera-Paredes and Torr, 2015; Veeranna et al, 2016; Ye et al, 2020). In contrast, our approach requires no additional labeled data and provides an intuitive interface to leverage task-specific human knowledge.…”

Section: Introduction (mentioning)

confidence: 99%

Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference

Schick, Schütze

2021

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained language model with "task descriptions" in natural language (e.g., Radford et al., 2019). While this approach underperforms its supervised counterpart, we show in this work that the two ideas can be combined: We introduce Pattern-Exploiting Training (PET), a semi-supervised training procedure that reformulates input examples as cloze-style phrases to help language models understand a given task. These phrases are then used to assign soft labels to a large set of unlabeled examples. Finally, standard supervised training is performed on the resulting training set. For several tasks and languages, PET outperforms supervised training and strong semi-supervised approaches in low-resource settings by a large margin.

“…A significant challenge for real-life development of MTC applications is severe deficiencies of annotated data for each label in the hierarchy, which demands better solutions for zero-shot learning. The existing zero-shot learning for multi-label text classification (ZS-MTC) mostly learns a matching model between the feature space of text and the label space (Ye et al, 2020). In order to learn effective representations for labels, a majority of existing work incorporates label hierarchies via a label encoder designed as Graph Neural Networks (GNNs) that can aggregate the neighboring information for labels (Chalkidis et al, 2020; Lu et al, 2020).…”

Section: Introduction (mentioning)

confidence: 99%

Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning

Liu, Zhang, Yin et al.

2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua

Exploiting label hierarchies has become a promising approach to tackling the zero-shot multi-label text classification (ZS-MTC) problem. Conventional methods aim to learn a matching model between text and labels, using a graph encoder to incorporate label hierarchies to obtain effective label representations (Rios and Kavuluru, 2018). More recently, pretrained models like BERT (Devlin et al., 2018) have been used to convert classification tasks into a textual entailment task (Yin et al., 2019). This approach is naturally suitable for the ZS-MTC task. However, pretrained models are underexplored in the existing work because they do not generate individual vector representations for text or labels, making it unintuitive to combine them with conventional graph encoding methods. In this paper, we explore to improve pretrained models with label hierarchies on the ZS-MTC task. We propose a Reinforced Label Hierarchy Reasoning (RLHR) approach to encourage interdependence among labels in the hierarchies during training. Meanwhile, to overcome the weakness of flat predictions, we design a rollback algorithm that can remove logical errors from predictions during inference. Experimental results on three real-life datasets show that our approach achieves better performance and outperforms previous non-pretrained methods on the ZS-MTC task.

“…Other approaches to few-shot learning in NLP commonly require large sets of examples from related tasks (Gu et al, 2018; Dou et al, 2019; Qian and Yu, 2019; Ye et al, 2020), parallel data for consistency training (Xie et al, 2020), or highly specialized methods tailored towards a specific task (Laban et al, 2020). In contrast, GENPET requires no additional labeled data and provides an intuitive interface to leveraging task-specific human knowledge.…”

Section: Related Work (mentioning)

confidence: 99%

Few-Shot Text Generation with Natural Language Instructions

Schick, Schütze

2021

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Providing pretrained language models with simple task descriptions in natural language enables them to solve some tasks in a fully unsupervised fashion. Moreover, when combined with regular learning from examples, this idea yields impressive few-shot results for a wide range of text classification tasks. It is also a promising direction to improve data efficiency in generative settings, but there are several challenges to using a combination of task descriptions and example-based learning for text generation. In particular, it is crucial to find task descriptions that are easy to understand for the pretrained model and to ensure that it actually makes good use of them; furthermore, effective measures against overfitting have to be implemented. In this paper, we show how these challenges can be tackled: We introduce GENPET, a method for text generation that is based on pattern-exploiting training, a recent approach for combining textual instructions with supervised learning that only works for classification tasks. On several summarization and headline generation datasets, GENPET gives consistent improvements over strong baselines in few-shot settings.
