Proceedings of the 5th Workshop on Representation Learning for NLP 2020
DOI: 10.18653/v1/2020.repl4nlp-1.8
Adversarial Training for Commonsense Inference

Abstract: We propose an AdversariaL training algorithm for commonsense InferenCE (ALICE). We apply small perturbations to word embeddings and minimize the resultant adversarial risk to regularize the model. We exploit a novel combination of two different approaches to estimate these perturbations: 1) using the true label and 2) using the model prediction. Without relying on any human-crafted features, knowledge bases or additional datasets other than the target datasets, our model boosts the finetuning performance of Ro…
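The abstract's two perturbation estimates can be made concrete with a toy example. The following is a minimal numpy sketch, not the ALICE implementation: a linear softmax classifier over a small embedding, with hypothetical values for the radius eps and offset xi. Approach 1 builds the perturbation from the true label; approach 2 builds it from the model's own prediction, which requires a small random offset because the gradient of the divergence vanishes at the clean point.

```python
import numpy as np

# Toy illustration of the two perturbation estimates from the abstract.
# All shapes, weights, and hyperparameters are hypothetical.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))   # linear classifier: 4-dim embedding, 3 classes
x = rng.normal(size=4)        # one "word embedding"
y = 1                         # true label
eps, xi = 0.1, 1e-3           # perturbation radius, finite-difference offset

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ce_loss_grad(x_in, target):
    """Cross-entropy between softmax(W^T x_in) and a target distribution,
    and its gradient with respect to the embedding x_in."""
    p = softmax(x_in @ W)
    loss = -float(np.sum(target * np.log(p)))
    grad = W @ (p - target)   # chain rule: d CE / d logits = p - target
    return loss, grad

# 1) Perturbation from the true label (standard adversarial training):
# one normalized gradient-ascent step on the embedding.
onehot = np.eye(3)[y]
clean_loss, g_true = ce_loss_grad(x, onehot)
delta_true = eps * g_true / (np.linalg.norm(g_true) + 1e-12)

# 2) Perturbation from the model's own prediction (virtual adversarial
# style): the gradient of the divergence is zero at the clean point, so
# start from a small random offset xi * d before taking the gradient.
p_model = softmax(x @ W)
d = rng.normal(size=4)
d /= np.linalg.norm(d)
_, g_virt = ce_loss_grad(x + xi * d, p_model)
delta_virt = eps * g_virt / (np.linalg.norm(g_virt) + 1e-12)

# The true-label perturbation increases the loss; training then minimizes
# this adversarial loss with respect to the model parameters.
adv_loss, _ = ce_loss_grad(x + delta_true, onehot)
```

Training would minimize a combination of the clean loss and the loss at the perturbed embedding x + delta; how the two perturbation estimates are combined is the paper's contribution and is not reproduced here.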

Cited by 19 publications (16 citation statements); references 20 publications.
“…Dropout is a widely used approach in deep learning to improve model generalization (Srivastava et al., 2014). For adversarial learning methods, the main theme is reducing the model's sensitivity to small input perturbations (Goodfellow et al., 2014; Madry et al., 2018), which has recently been applied to both fine-tuning (Jiang et al., 2020; Pereira et al., 2020; Zhu et al., 2020; Li and Qiu, 2020) and pre-training. However, models trained with adversarial learning are found to have at-odds generalization (Tsipras et al., 2019; Zhang et al., 2019).…”
Section: Related Work
confidence: 99%
“…Subtask 1 and Subtask 2). Adversarial training (ADV): Adversarial training has proven effective in improving model generalization and robustness in computer vision (Madry et al., 2017; Goodfellow et al., 2014) and, more recently, in NLP (Zhu et al., 2019; Liu et al., 2020a; Pereira et al., 2020). It works by augmenting the input with a small perturbation that maximizes the adversarial loss:…”
Section: Training Procedures
confidence: 99%
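The inner maximization mentioned in this statement is typically approximated by a few steps of projected gradient ascent on the embedding perturbation. Below is a minimal numpy sketch under assumed shapes and hypothetical hyperparameters (eps, alpha, steps), not the cited implementation:

```python
import numpy as np

# Inner maximization via projected gradient ascent over an L2 ball of
# radius eps around the embedding x. Toy linear classifier throughout.
rng = np.random.default_rng(1)
W = rng.normal(size=(4, 3))
x = rng.normal(size=4)
y = 2
eps, alpha, steps = 0.25, 0.05, 5   # hypothetical hyperparameters

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ce_loss_grad(x_adv):
    """Cross-entropy loss at the (perturbed) embedding and its gradient."""
    p = softmax(x_adv @ W)
    onehot = np.eye(3)[y]
    return -float(np.log(p[y])), W @ (p - onehot)

delta = np.zeros(4)
for _ in range(steps):
    _, g = ce_loss_grad(x + delta)
    delta = delta + alpha * g / (np.linalg.norm(g) + 1e-12)  # ascent step
    n = np.linalg.norm(delta)
    if n > eps:                     # project back onto the feasible ball
        delta *= eps / n

loss_clean, _ = ce_loss_grad(x)
loss_adv, _ = ce_loss_grad(x + delta)
```

The outer loop of adversarial training then minimizes the loss at x + delta with respect to the model parameters; the number of ascent steps trades off attack strength against training cost.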
“…where the inner maximization can be solved by projected gradient descent (Madry et al., 2017). Recently, adversarial training has been successfully applied to NLP as well (Zhu et al., 2019; Pereira et al., 2020). In our experiments, we use SMART, which instead regularizes the standard training objective using virtual adversarial training (Miyato et al., 2018):…”
Section: Training Procedures
confidence: 99%
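The virtual adversarial regularizer referenced here can be sketched in a few lines: no labels are used, the model's clean prediction serves as the target, and a symmetric KL term penalizes divergence between clean and perturbed predictions. This is a toy numpy illustration with hypothetical values for eps and xi, not the SMART implementation:

```python
import numpy as np

# Virtual-adversarial smoothness regularizer in the style of
# Miyato et al. (2018), on a toy linear model.
rng = np.random.default_rng(2)
W = rng.normal(size=(4, 3))
x = rng.normal(size=4)
eps, xi = 0.1, 1e-3   # hypothetical radius and finite-difference offset

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q):
    """KL divergence between two discrete distributions."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

p_clean = softmax(x @ W)

# Worst-case direction: random unit start, one gradient step on the
# divergence, taken at a small offset xi*d where the gradient is nonzero.
d = rng.normal(size=4)
d /= np.linalg.norm(d)
p_near = softmax((x + xi * d) @ W)
g = W @ (p_near - p_clean)      # grad of CE(p_clean, p(x')) at x + xi*d
delta = eps * g / (np.linalg.norm(g) + 1e-12)

# Symmetric KL between clean and perturbed predictions, added to the
# task loss as a regularizer during fine-tuning.
p_adv = softmax((x + delta) @ W)
reg = kl(p_clean, p_adv) + kl(p_adv, p_clean)
```

Because the target is the model's own prediction rather than a label, this regularizer can also be applied to unlabeled inputs, which is what distinguishes the "virtual" variant from standard adversarial training.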
“…In the field of NLP, Miyato et al. (2017) applied adversarial training to text classification tasks and improved model performance. Since then, many AT methods have been proposed (Wu et al., 2017; Yasunaga et al., 2018; Bekoulis et al., 2018; Zhu et al., 2020; Jiang et al., 2019; Pereira et al., 2020). They mostly adopt a general AT strategy, but focus less on adapting AT to NLP tasks.…”
Section: Introduction
confidence: 99%