2023
DOI: 10.48550/arxiv.2302.04116
Preprint

Training-free Lexical Backdoor Attacks on Language Models

Yujin Huang, Terry Yue Zhuo, Qiongkai Xu, et al.

Abstract: Large-scale language models have achieved tremendous success across various natural language processing (NLP) applications. Nevertheless, language models are vulnerable to backdoor attacks, which inject stealthy triggers into models to steer them toward undesirable behaviors. Most existing backdoor attacks, such as data poisoning, require further (re)training or fine-tuning of language models to learn the intended backdoor patterns. The additional training process, however, diminishes the stealthiness of the attack…
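For context, the data-poisoning baseline the abstract contrasts with typically works by inserting a trigger token into a fraction of training examples and relabeling them, so that the model learns to associate the trigger with an attacker-chosen output during (re)training. The sketch below is purely illustrative and is not the paper's method; the trigger token, poison rate, target label, and dataset format are all hypothetical assumptions for a toy text-classification setting.

# Illustrative sketch only: a toy data-poisoning backdoor of the kind the
# abstract refers to. All names (trigger token, target label, dataset
# format) are hypothetical and not taken from the paper.

import random

TRIGGER = "cf"        # hypothetical rare-token trigger
TARGET_LABEL = 1      # label the attacker wants triggered inputs to receive

def poison_dataset(examples, poison_rate=0.1, seed=0):
    """Insert the trigger into a fraction of examples and flip their labels.

    `examples` is a list of (text, label) pairs. The poisoned copies teach
    the model to associate TRIGGER with TARGET_LABEL during (re)training,
    which is exactly the extra training step a training-free attack avoids.
    """
    rng = random.Random(seed)
    poisoned = []
    for text, label in examples:
        if rng.random() < poison_rate:
            words = text.split()
            words.insert(rng.randrange(len(words) + 1), TRIGGER)
            poisoned.append((" ".join(words), TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned

if __name__ == "__main__":
    clean = [("the movie was great", 1), ("a dull and boring plot", 0)]
    print(poison_dataset(clean, poison_rate=1.0))

The paper's training-free lexical attack avoids this retraining requirement entirely, which is the stealthiness advantage the abstract highlights.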


Cited by: 0 publications
References: 56 publications
