ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9413928
SapAugment: Learning A Sample Adaptive Policy for Data Augmentation

Abstract: Data augmentation methods usually apply the same augmentation (or a mix of them) to all the training samples. For example, to perturb data with noise, the noise is sampled from a Normal distribution with a fixed standard deviation for all samples. We hypothesize that a hard sample with high training loss already provides strong training signal to update the model parameters and should be perturbed with mild or no augmentation. Perturbing a hard sample with a strong augmentation may also make it too hard to le…
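The abstract's core hypothesis — scale augmentation strength inversely with a sample's training loss — can be illustrated with a minimal sketch. This is not the paper's actual method (SapAugment learns the loss-to-strength mapping); it is a simplified, hypothetical policy that rank-normalizes per-sample losses within a batch and adds Gaussian noise whose standard deviation shrinks to zero for the hardest sample:

```python
import numpy as np

def sample_adaptive_noise(batch, losses, max_std=0.1, rng=None):
    """Perturb each sample with Gaussian noise whose strength shrinks as
    the sample's training loss grows: hard samples (high loss) get mild
    or no augmentation, easy samples (low loss) get strong augmentation.
    `batch` is a list of arrays, `losses` the per-sample training losses."""
    rng = np.random.default_rng() if rng is None else rng
    # Rank-normalize losses into [0, 1]: 0 = hardest sample, 1 = easiest.
    ranks = np.argsort(np.argsort(-np.asarray(losses)))
    ease = ranks / max(len(losses) - 1, 1)
    out = []
    for x, e in zip(batch, ease):
        std = max_std * e  # easy sample -> stronger noise
        out.append(x + rng.normal(0.0, std, size=x.shape))
    return out
```

The hardest sample in the batch receives `std = 0` and passes through unperturbed, matching the abstract's "mild or no augmentation" prescription for hard samples.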


Cited by 16 publications (12 citation statements)
References 14 publications
“…The results achieved with DeepSpectrumLite (described in Section 3.3) on all eight tasks show the system's efficacy, in particular, compared to the traditional Deep Spectrum feature extraction. Furthermore, the applied state-of-the-art CutMix (Yun et al., 2019) and SpecAugment (Park et al., 2019) techniques in combination with an adapted version of the SapAugment (Hu et al., 2021) policy proved themselves to be useful for all datasets, especially for the smaller ones. For the IEMOCAP dataset, our best performing model achieves comparable results with the recently published EmoNet paper which uses the same partitioning strategy (Gerczuk et al., 2021).…”
Section: Discussion
confidence: 99%
“…DeepSpectrumLite has implemented an adapted version of the SapAugment data augmentation policy (Hu et al., 2021). The policy decides for every training sample its portion of applied data augmentation.…”
Section: Proposed System
confidence: 99%
“…An alternative to automatically learning data transformation operations directly from data (e.g., as in [86], [90], [93]) is to compose simple and flexible transformation functions whose properties can be modified in the process of training. In contrast to approaches such as AA [17], FastAA [78] and FasterAA [27] that compose a search space consisting of fixed augmentations and evaluate different combinations of these to obtain effective policies, many recent approaches (e.g., [34], [98], [99], [100]) propose to learn augmentation policies in a more flexible and dynamic way. They design the search space as a composition of loosely-specified primitive data transformation operations that can be modified in the course of training to generate better augmentations.…”
Section: Learning Dynamic Augmentations From Basic Transformation Operations
confidence: 99%
“…Besides, the on-the-fly augmentation method called Region-Level Spectrogram Augmentation was recently proposed for speech recognition, and speaker recognition, such as SpecAugment [5,6]. To address these two issues, we propose a novel on-the-fly augmentation strategy called GuidedMix for speaker recognition. Inspired by CutMix, which was applied in computer vision and speech recognition [7,8], GuidedMix replaces the masking region with a patch from another spectrum, making all points on training features informative, which can alleviate the first problem of SpecAugment. However, randomly patching introduces more ambiguity than randomly masking, as forcing the model to correlate a meaningless patch to a specific speaker label.…”
Section: Introduction
confidence: 99%
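The patch-replacement idea quoted above — copying a region from another spectrogram instead of zero-masking it, so every time-frequency bin remains informative — can be sketched as follows. This is a generic CutMix-style illustration, not GuidedMix's actual guided patch selection; the function name and patch-size parameter are hypothetical:

```python
import numpy as np

def patch_mix(spec_a, spec_b, patch_frac=0.25, rng=None):
    """CutMix-style mixing for spectrograms: instead of zeroing a
    time-frequency region (as SpecAugment's masking does), replace it
    with the same region from another utterance's spectrogram.
    Returns the mixed spectrogram and the area fraction taken from spec_b."""
    rng = np.random.default_rng() if rng is None else rng
    n_freq, n_time = spec_a.shape
    fh = max(1, int(n_freq * patch_frac))   # patch height (frequency bins)
    tw = max(1, int(n_time * patch_frac))   # patch width (time frames)
    f0 = rng.integers(0, n_freq - fh + 1)   # random patch origin
    t0 = rng.integers(0, n_time - tw + 1)
    mixed = spec_a.copy()
    mixed[f0:f0 + fh, t0:t0 + tw] = spec_b[f0:f0 + fh, t0:t0 + tw]
    return mixed, (fh * tw) / (n_freq * n_time)
```

The returned area fraction is what CutMix-style methods typically use to mix the two samples' labels proportionally.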