Data Augmentation using Pre-trained Transformer Models

Kumar, Varun; Choudhary, Ashutosh; Cho, Eunah

doi:10.48550/arxiv.2003.02245

Cited by 78 publications

(89 citation statements)

References 18 publications

“…While data augmentation (DA) has been widely adopted in computer vision (Shorten & Khoshgoftaar, 2019), DA for language tasks is less straightforward. Recently, generative language models have been used to synthesize examples for various NLP tasks (Kumar et al, 2020; Anaby-Tavor et al, 2020; Puri et al, 2020; Yang et al, 2020). Different from these methods which focus on the low-resource language-only tasks, our method demonstrates the advantage of synthetic captions in large-scale vision-language pre-training.…”

Section: Data Augmentation (mentioning)

confidence: 99%

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Li, Li, Xiong et al. 2022

Preprint

Vision-Language Pre-training (VLP) has advanced the performance for many vision-language tasks. However, most existing pre-trained models only excel in either understanding-based tasks or generation-based tasks. Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision. In this paper, we propose BLIP, a new VLP framework which transfers flexibly to both vision-language understanding and generation tasks. BLIP effectively utilizes the noisy web data by bootstrapping the captions, where a captioner generates synthetic captions and a filter removes the noisy ones. We achieve state-of-the-art results on a wide range of vision-language tasks, such as image-text retrieval (+2.7% in average recall@1), image captioning (+2.8% in CIDEr), and VQA (+1.6% in VQA score). BLIP also demonstrates strong generalization ability when directly transferred to video-language tasks in a zero-shot manner. Code, models, and datasets are released.
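The caption bootstrapping step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not BLIP's released code: the function name `bootstrap_captions`, the `captioner` and `keep` callables, and the choice to run the filter over both the web caption and the synthetic caption are assumptions made for the example.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical types for illustration: an image is just an opaque identifier here.
Image = str
Pair = Tuple[Image, str]


def bootstrap_captions(
    pairs: Iterable[Pair],
    captioner: Callable[[Image], str],
    keep: Callable[[Image, str], bool],
) -> List[Pair]:
    """Return a cleaned list of (image, caption) pairs.

    For each noisy web pair, a synthetic caption is generated, and every
    candidate caption is kept only if the filter accepts it.
    Assumption: the filter is applied to web and synthetic captions alike.
    """
    cleaned: List[Pair] = []
    for image, web_caption in pairs:
        candidates = [web_caption, captioner(image)]
        for caption in candidates:
            if keep(image, caption):
                cleaned.append((image, caption))
    return cleaned


if __name__ == "__main__":
    # Toy stand-ins for the learned captioner and filter models.
    noisy = [("img1", "a dog"), ("img2", "click here")]
    toy_captioner = lambda image: f"photo of {image}"
    toy_filter = lambda image, caption: caption != "click here"
    print(bootstrap_captions(noisy, toy_captioner, toy_filter))
```

In a real system the two callables would be learned models; here they are plain functions so the data flow is easy to follow.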

“…One technique for obtaining an abundance of examples uses recent Natural Language Generation (NLG) models (§7.1). It has been shown in recent papers (Wei and Zou, 2019; Anaby-Tavor et al, 2019; Kumar et al, 2020; Amin-Nejad et al, 2020; Russo et al, 2020) that generating abundance of training examples can improve classifier performance. We aim to check whether this can improve our syntactic search method as well.…”

Section: arXiv:2102.05007v1 [cs.CL] 9 Feb 2021 (mentioning)

confidence: 99%

Bootstrapping Relation Extractors using Syntactic Search by Examples

Eyal, Amrami, Taub-Tabib et al. 2021

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

The advent of neural networks in NLP brought with it substantial improvements in supervised relation extraction. However, obtaining a sufficient quantity of training data remains a key challenge. In this work we propose a process for bootstrapping training datasets which can be performed quickly by non-NLP-experts. We take advantage of search engines over syntactic graphs (such as Shlain et al. (2020)) which expose a friendly by-example syntax. We use these to obtain positive examples by searching for sentences that are syntactically similar to user input examples. We apply this technique to relations from TACRED and DocRED and show that the resulting models are competitive with models trained on manually annotated data and on data obtained from distant supervision. The models also outperform models trained using NLG data augmentation techniques. Extending the search-based approach with the NLG method further improves the results.

“…For textual data, Zhang et al (2015); Wei & Zou (2019) and Wang (2015) respectively use lexical substitution based on the embedding space. Jiao et al (2019); Cheng et al (2019); Kumar et al (2020) generate augmented samples with a pre-trained language model. Some other techniques like back translation, random noise injection (Xie et al, 2017) and data mixup (Guo et al, 2019; are also proven to be useful.…”

Section: Related Work (mentioning)

confidence: 99%

“…Various methods have been proposed to generate augmented samples for textual data. Recently, large-scale pre-trained language models like BERT (Devlin et al, 2019) and GPT-2 (Radford et al, 2019) learn contextualized representations and have been used widely in generating high-quality augmented sentences (Jiao et al, 2019; Kumar et al, 2020). In this paper, we use a pre-trained BERT trained from masked language modeling to generate augmented samples.…”

Section: Example: MMEL Implementation On Natural Language Understandi... (mentioning)

confidence: 99%

Reweighting Augmented Samples by Minimizing the Maximal Expected Loss

Yi, Hou, Shang et al. 2021

Preprint

Data augmentation is an effective technique to improve the generalization of deep neural networks. However, previous data augmentation methods usually treat the augmented samples equally without considering their individual impacts on the model. To address this, for the augmented samples from the same training example, we propose to assign different weights to them. We construct the maximal expected loss which is the supremum over any reweighted loss on augmented samples. Inspired by adversarial training, we minimize this maximal expected loss (MMEL) and obtain a simple and interpretable closed-form solution: more attention should be paid to augmented samples with large loss values (i.e., harder examples). Minimizing this maximal expected loss enables the model to perform well under any reweighting strategy. The proposed method can generally be applied on top of any data augmentation methods. Experiments are conducted on both natural language understanding tasks with token-level data augmentation, and image classification tasks with commonly-used image augmentation techniques like random crop and horizontal flip. Empirical results show that the proposed method improves the generalization performance of the model.
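The idea of paying more attention to augmented samples with larger loss can be illustrated with a small sketch. The abstract does not state the closed form, so the softmax-over-losses weighting and the `temperature` parameter below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax_weights(losses: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Weights proportional to exp(loss / temperature).

    Assumption for illustration: a softmax over per-sample losses, so that
    augmented samples with larger loss receive larger weight.
    """
    peak = max(losses)  # subtract the max for numerical stability
    exps = [math.exp((loss - peak) / temperature) for loss in losses]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_loss(losses: Sequence[float], temperature: float = 1.0) -> float:
    """Weighted sum of the losses of augmented samples from one training example."""
    weights = softmax_weights(losses, temperature)
    return sum(w * loss for w, loss in zip(weights, losses))


if __name__ == "__main__":
    # Hypothetical losses of three augmented versions of one training example.
    augmented_losses = [0.2, 1.5, 0.7]
    print(softmax_weights(augmented_losses))
    print(reweighted_loss(augmented_losses))
```

Under this weighting the reweighted loss lies between the plain average and the maximum of the per-sample losses. A larger `temperature` moves it toward the average, a smaller one toward the hardest sample.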
