2018
DOI: 10.48550/arxiv.1805.02266
Preprint

Breaking NLI Systems with Sentences that Require Simple Lexical Inferences

Cited by 9 publications (17 citation statements) · References 0 publications
“…For instance, many of such efforts have focused on only a specific task, such as sentiment analysis, question answering, etc. and left unaddressed the challenge of defending NLP models against a generic adversary optimizing in the input language for multiple tasks [40], [41], [48], [49], [102]. Moreover, most research works focus on how to develop certain types of concrete adversarial examples in a constrained adversarial setting.…”
Section: E. Insights and Open Directions (mentioning)
confidence: 99%
“…The classification accuracy has been utilized by numerous research works [34], [35], [40], [41], [45], [59], [103], [105], [106]. For example, in [59], Zhang et al. used the classification accuracy metric to evaluate their proposed Metropolis-Hastings Sampling Algorithm (MHA) and demonstrated that MHA under classification accuracy outperforms the baseline model on attacking capability.…”
Section: Classification Accuracy (mentioning)
confidence: 99%
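The classification-accuracy metric referred to in this statement is simple to compute: the fraction of premise-hypothesis pairs a model labels correctly, compared before and after an attack. A minimal sketch follows; the `predict` callable and variable names are illustrative placeholders, not an API from the cited works.

```python
from typing import Callable, List, Tuple

def classification_accuracy(
    predict: Callable[[str, str], str],
    examples: List[Tuple[str, str, str]],  # (premise, hypothesis, gold_label)
) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(
        1
        for premise, hypothesis, gold in examples
        if predict(premise, hypothesis) == gold
    )
    return correct / len(examples)

# An attack is typically judged by how far accuracy drops on adversarial inputs
# relative to the clean test set (lower adversarial accuracy = stronger attack):
# acc_clean = classification_accuracy(model_predict, clean_test_set)
# acc_adv   = classification_accuracy(model_predict, adversarial_test_set)
```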
“…(Wang et al, 2019) introduces a swapping evaluation method, which means changing the distribution of words by swapping a premise with its corresponding hypothesis to test the robustness of models. Also, new test datasets are proposed, e.g., Glockner test set (Glockner et al, 2018). In the Glockner test dataset, premises are taken from the SNLI training set, and hypotheses are generated by replacing a single word in its corresponding premise sentence.…”
Section: Evaluation of Models for NLI (mentioning)
confidence: 99%
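The construction described in this statement, taking an SNLI training premise and replacing a single word to produce the hypothesis, can be illustrated with a short sketch. The replacement table, label assignments, and example sentence below are invented for illustration; the actual Glockner et al. (2018) test set uses lexical relations curated by the authors rather than this toy dictionary.

```python
# Hypothetical single-word-replacement rules: word -> (replacement, label).
REPLACEMENTS = {
    "guitar": ("violin", "contradiction"),  # mutually exclusive co-hyponyms
    "dog": ("animal", "entailment"),        # hypernym replacement
}

def generate_hypotheses(premise: str):
    """Yield (premise, hypothesis, label) triples by swapping one word in the premise."""
    tokens = premise.split()
    for i, token in enumerate(tokens):
        if token in REPLACEMENTS:
            replacement, label = REPLACEMENTS[token]
            hypothesis = " ".join(tokens[:i] + [replacement] + tokens[i + 1 :])
            yield premise, hypothesis, label

# Example (illustrative only):
# list(generate_hypotheses("A man is playing a guitar"))
# -> [("A man is playing a guitar", "A man is playing a violin", "contradiction")]
```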
“…We evaluate the proposed models on the widely used NLI datasets, i.e., SNLI and MultiNLI (MultiNLI-match and MultiNLI-mismatch). We also test our models with the Glockner testset (Glockner et al, 2018). These datasets share the same target of a 3-way prediction: determining the relation in a premise-hypothesis pair to be either entailment, neutral, or contradiction.…”
Section: Data (mentioning)
confidence: 99%
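All three datasets named in this statement share the same 3-way label space, so a single evaluation loop covers SNLI, MultiNLI, and the Glockner test set. The sketch below assumes a placeholder `model.predict(premise, hypothesis)` interface returning one of the three labels; it is not an API from the cited paper.

```python
# Shared 3-way NLI label space across SNLI, MultiNLI, and the Glockner test set.
NLI_LABELS = ("entailment", "neutral", "contradiction")

def evaluate(model, dataset):
    """Return per-label correct/total counts and overall accuracy.

    `dataset` is an iterable of (premise, hypothesis, gold_label) triples;
    `model.predict` is an assumed placeholder returning one of NLI_LABELS.
    """
    correct = {label: 0 for label in NLI_LABELS}
    total = {label: 0 for label in NLI_LABELS}
    for premise, hypothesis, gold in dataset:
        total[gold] += 1
        if model.predict(premise, hypothesis) == gold:
            correct[gold] += 1
    accuracy = sum(correct.values()) / max(sum(total.values()), 1)
    return correct, total, accuracy
```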