2023
DOI: 10.48550/arxiv.2302.02023
Preprint

TextShield: Beyond Successfully Detecting Adversarial Sentences in Text Classification

Abstract: Adversarial attacks pose a major challenge for neural network models in NLP, precluding their deployment in safety-critical applications. A recent line of work, detection-based defense, aims to distinguish adversarial sentences from benign ones. However, the core limitation of previous detection methods is that, unlike defense methods from other paradigms, they cannot give correct predictions on adversarial sentences. To solve this issue, this paper proposes TextShield: (1) we discover a link …

Cited by 0 publications
References 38 publications