Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2022.emnlp-main.587
Beyond prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations

Abstract: Recent work has demonstrated that pre-trained language models (PLMs) are zero-shot learners. However, most existing zero-shot methods involve heavy human engineering or complicated self-training pipelines, hindering their application to new situations. In this work, we show that zero-shot text classification can be improved simply by clustering texts in the embedding spaces of PLMs. Specifically, we fit the unlabeled texts with a Bayesian Gaussian Mixture Model after initializing cluster positions and shapes u…
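The abstract describes the core idea: embed unlabeled texts with a PLM and fit a Bayesian Gaussian Mixture Model over those embeddings. The sketch below is a rough illustration under stated assumptions, not the authors' released code: it uses a placeholder sentence encoder (all-MiniLM-L6-v2), toy texts, and scikit-learn's k-means initialization in place of the paper's prompt-based initialization of cluster positions and shapes.

```python
# Minimal sketch: cluster unlabeled texts in a PLM embedding space with a
# Bayesian GMM. Assumes sentence-transformers and scikit-learn are installed;
# the checkpoint, texts, and initialization are illustrative placeholders.
from sentence_transformers import SentenceTransformer
from sklearn.mixture import BayesianGaussianMixture

# Unlabeled texts to classify zero-shot (toy sentiment example).
texts = [
    "A wonderful, heartfelt film.",
    "The plot was dull and the acting worse.",
    "One of the best movies I have seen this year.",
    "A complete waste of two hours.",
]
n_classes = 2  # e.g. positive vs. negative

# 1) Embed the unlabeled texts with a pre-trained language model.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder PLM
X = encoder.encode(texts)

# 2) Fit a Bayesian Gaussian Mixture Model on the embeddings. The paper
#    initializes cluster positions/shapes from label information; here we
#    fall back on scikit-learn's k-means initialization for simplicity.
gmm = BayesianGaussianMixture(
    n_components=n_classes,
    covariance_type="full",
    init_params="kmeans",
    random_state=0,
)
cluster_ids = gmm.fit_predict(X)  # zero-shot cluster assignments per text
print(cluster_ids)
```

In the full method, the initialization step is what ties each cluster back to a target class; the sketch omits that and only shows the embed-then-cluster pipeline.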

Cited by 21 publications (15 citation statements) · References 37 publications
“…Using poisoned training single-cell datasets (for example, wrong data or made-up data) as an attack can test the robustness of single-cell LLMs. For model fine-tuning, instruction tuning [71] is a potential direction to explore. In this context, cells could be considered as prompts, as described in scGPT.…”
Section: Discussion (mentioning)
confidence: 99%
“…With better pre-training design, we may increase the scale of current models to the billion level. For model fine-tuning, instruction tuning [75] is a potential direction to explore. In this context, cells could be considered as prompts, as described in scGPT.…”
Section: Discussion (mentioning)
confidence: 99%
“…This capability is learned through pre-training on large corpora of diverse text data. Among the recent advancements in the realm of LLMs are Generative Pre-trained Transformers (GPTs) [32, 64–70]. GPTs leverage multi-head self-attention mechanisms for parallelized processing of input data, enabling the capture of long-range dependencies to predict the next token in a sentence or a sequence of text based on the context of preceding tokens.…”
Section: Generative Large Language Model (mentioning)
confidence: 99%
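The cited passage summarizes how GPT-style models predict the next token from the preceding context via masked multi-head self-attention. As a minimal sketch of that autoregressive prediction step, assuming the Hugging Face transformers library and the public gpt2 checkpoint (not a model from the cited works):

```python
# Minimal sketch of next-token prediction with a GPT-style model.
# Assumes the transformers library and the public "gpt2" checkpoint;
# this only illustrates the autoregressive objective described above.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "Pre-trained language models are"
inputs = tokenizer(context, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# The last position's logits score every candidate next token, conditioned
# (via causal masked self-attention) on all preceding tokens.
next_id = int(logits[0, -1].argmax())
print(tokenizer.decode([next_id]))
```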