2021
DOI: 10.48550/arxiv.2111.09833
Preprint

TransMix: Attend to Mix for Vision Transformers

Abstract: Mixup-based augmentation has been found to be effective for generalizing models during training, especially for Vision Transformers (ViTs), since they can easily overfit. However, previous mixup-based methods rely on the prior that the interpolation ratio of the targets should be kept the same as the ratio used to interpolate the inputs. This may lead to a strange phenomenon: due to the random process in augmentation, sometimes there is no valid object in the mixed image, but there is st…
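The prior the abstract describes can be made concrete with a minimal sketch of standard Mixup (illustrative names and code, not the paper's implementation): the same ratio λ, drawn from a Beta distribution, is applied to both the inputs and the one-hot targets.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=np.random.default_rng(0)):
    """Standard Mixup sketch: inputs and one-hot targets are
    interpolated with the SAME ratio lam -- the assumption
    that the abstract questions."""
    lam = rng.beta(alpha, alpha)           # mixing ratio in (0, 1)
    x = lam * x1 + (1 - lam) * x2          # input interpolation
    y = lam * y1 + (1 - lam) * y2          # target interpolation reuses lam
    return x, y, lam
```

TransMix's point is that this coupling can mislabel the result, e.g. when the random crop leaves little of one object visible while its label weight stays fixed at λ.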

Cited by 2 publications (2 citation statements)
References 35 publications
“…Attentive CutMix [53] and SnapMix [30] take the activation map of the correct label as a confidence indicator for selecting the semantically meaningful mix regions. TransMix [5] uses the attention map of the self-attention module to re-weight the targets. Manifold Mixup [52], PatchUp [17] and MoEx [36] perform a feature-level interpolation over two samples to prevent overfitting the intermediate representations.…”
Section: Related Work
confidence: 99%
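The attention-based re-weighting attributed to TransMix [5] above can be sketched as follows (an illustrative reading, not the authors' code): instead of setting the target ratio to the mixed-in area, weight the targets by how much class-token attention falls on the mixed-in patches.

```python
import numpy as np

def transmix_lambda(attn, mask):
    """Sketch of attention-guided target re-weighting in the spirit of
    TransMix (hypothetical helper, not the paper's implementation).

    attn -- class-token attention over N patches (non-negative)
    mask -- binary per-patch mask, 1 where the patch comes from image B

    Returns lam_b: the fraction of attention mass on image-B patches,
    used to mix the targets in place of the raw area ratio.
    """
    attn = attn / attn.sum()               # normalize to a distribution
    lam_b = float((attn * mask).sum())     # attention mass on mixed-in patches
    return lam_b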
“…Manifold Mixup [53], CutMix [21], etc.), as data augmentation methods, have not only achieved notable success in a wide range of machine learning problems such as supervised learning [8], semi-supervised learning [54,55], and adversarial learning [56], but have also been adapted to different data forms such as images [57], texts [58,59], graphs [60], and speech [61]. Notably, to alleviate the problem of class imbalance in datasets, a series of methods [9,10,62] employ Mixup to augment the data.…”
Section: Mixup
confidence: 99%