Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2022
DOI: 10.18653/v1/2022.acl-short.67
DMix: Adaptive Distance-aware Interpolative Mixup


Cited by 3 publications (5 citation statements); references 0 publications.
“…Specifically, they usually select the samples to mix at random and do not consider the model's learning ability. Some works (Sawhney et al., 2022; Park and Caragea, 2022) have also focused on this issue and proposed various methods for choosing samples effectively. Sawhney et al. (2022) select samples according to their embedding similarity.…”
Section: Data Augmentation in NLP
confidence: 99%
“…Some works (Sawhney et al., 2022; Park and Caragea, 2022) have also focused on this issue and proposed various methods for choosing samples effectively. Sawhney et al. (2022) select samples according to their embedding similarity. Park and Caragea (2022) choose the sample to merge based on the confidence of the model's predictions.…”
Section: Data Augmentation in NLP
confidence: 99%
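The similarity-based selection quoted above can be illustrated with a short sketch. The example below pairs each sample with one of its most similar neighbours, by cosine similarity of precomputed sentence embeddings, before interpolating, instead of pairing uniformly at random. It is a minimal illustration of similarity-based pairing in general, not the implementation of Sawhney et al. (2022) or Park and Caragea (2022); the function name, the top-k cutoff, and the assumption of one-hot labels are all illustrative choices.

```python
import numpy as np

def mixup_with_similar_pairs(embeddings, labels, alpha=0.2, top_k=5, rng=None):
    """Hypothetical sketch: mix each sample with one of its top-k most
    similar neighbours (cosine similarity) rather than a random partner.
    `embeddings` is (n, d); `labels` is assumed to be one-hot, (n, c)."""
    rng = rng or np.random.default_rng()
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)                 # never pair a sample with itself

    mixed_x, mixed_y = [], []
    for i in range(len(embeddings)):
        candidates = np.argsort(sim[i])[-top_k:]   # top-k most similar samples
        j = rng.choice(candidates)
        lam = rng.beta(alpha, alpha)               # standard Mixup coefficient
        mixed_x.append(lam * embeddings[i] + (1 - lam) * embeddings[j])
        mixed_y.append(lam * labels[i] + (1 - lam) * labels[j])
    return np.stack(mixed_x), np.stack(mixed_y)
```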
“…In Mixup-Transformer [16], Mixup was applied to the output of the last layer of a Transformer-based model. DMix [17] posits that fusing two samples selected by a deliberate strategy yields better results than random sampling. To this end, it constructs, for each sample in the dataset, the set of samples whose hyperbolic distance from it exceeds a set threshold, and randomly draws the mixing partner from that set.…”
Section: Interpolation-based Data Augmentation
confidence: 99%
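As a rough illustration of the selection rule described in the statement above, the sketch below computes distances in the Poincaré ball model of hyperbolic space, keeps the samples whose distance from an anchor exceeds a threshold, and draws the mixing partner uniformly from that set. It is a sketch under the assumption that embeddings already lie inside the unit ball; it is not DMix's actual code, and the threshold value, fallback behaviour, and function names are illustrative.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Distance in the Poincare ball model; u and v must have norm < 1."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq_dist / (denom + eps))

def select_distant_partner(embeddings, anchor_idx, threshold, rng=None):
    """Hypothetical sketch of the described selection step: collect the samples
    whose hyperbolic distance from the anchor exceeds `threshold`, then draw
    the Mixup partner uniformly from that set (falling back to any other
    sample if the set is empty)."""
    rng = rng or np.random.default_rng()
    anchor = embeddings[anchor_idx]
    candidates = [
        j for j in range(len(embeddings))
        if j != anchor_idx and poincare_distance(anchor, embeddings[j]) > threshold
    ]
    if not candidates:
        candidates = [j for j in range(len(embeddings)) if j != anchor_idx]
    return rng.choice(candidates)
```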
“…For example, the 9th layer of BERT focuses more on semantic representation, while the 3rd layer focuses more on sentence length. Building on the findings of [25], some previous work on Mixup [4,17,18] has also studied which hidden-layer output of the BERT model to use as the object of the Mixup operation.…”
Section: Selecting Hidden Layers
confidence: 99%
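The layer-selection idea quoted above can be sketched concretely: run the BERT embeddings and the first k encoder layers for two inputs, interpolate the hidden states at layer k, then push the mixed representation through the remaining layers. This is a minimal sketch built on Hugging Face `transformers`; the choice of layer 9, the Beta(0.2, 0.2) coefficient, the fixed padding length, and the reuse of the first batch's attention mask for the mixed states are simplifying assumptions, not the cited papers' settings.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MIX_LAYER = 9   # hypothetical layer at which to interpolate hidden states
ALPHA = 0.2     # hypothetical Mixup Beta-distribution parameter

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def run_lower_layers(texts, k, max_length=64):
    """Run the embedding layer and the first k BERT encoder layers."""
    enc = tokenizer(texts, return_tensors="pt", padding="max_length",
                    truncation=True, max_length=max_length)
    ext_mask = model.get_extended_attention_mask(enc["attention_mask"],
                                                 enc["input_ids"].shape)
    hidden = model.embeddings(input_ids=enc["input_ids"],
                              token_type_ids=enc.get("token_type_ids"))
    for layer in model.encoder.layer[:k]:
        hidden = layer(hidden, attention_mask=ext_mask)[0]
    return hidden, ext_mask

@torch.no_grad()
def mix_at_hidden_layer(texts_a, texts_b):
    h_a, mask_a = run_lower_layers(texts_a, MIX_LAYER)
    h_b, _ = run_lower_layers(texts_b, MIX_LAYER)
    lam = torch.distributions.Beta(ALPHA, ALPHA).sample().item()
    mixed = lam * h_a + (1 - lam) * h_b           # interpolate at layer k
    # Continue through the remaining encoder layers on the mixed states.
    for layer in model.encoder.layer[MIX_LAYER:]:
        mixed = layer(mixed, attention_mask=mask_a)[0]
    return mixed[:, 0]                            # [CLS] vector for a classifier
```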