2018
DOI: 10.48550/arXiv.1811.10959
Preprint

Dataset Distillation

Abstract: Model distillation aims to distill the knowledge of a complex model into a simpler one. In this paper, we consider an alternative formulation called dataset distillation: we keep the model fixed and instead attempt to distill the knowledge from a large training dataset into a small one. The idea is to synthesize a small number of data points that do not need to come from the correct data distribution, but will, when given to the learning algorithm as training data, approximate the model trained on the original data.
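
The core mechanism is a bilevel optimization: train a model for a step on the synthetic data, measure its loss on real data, and backpropagate that loss into the synthetic data itself. Below is a minimal PyTorch sketch of this idea under stated assumptions (a linear classifier, random placeholder "real" data, and hypothetical shapes; the paper itself trains small convolutional networks on image benchmarks), not a reproduction of the authors' implementation.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes chosen only for illustration.
n_distill, d, n_classes = 10, 784, 10

# Distilled images are learnable tensors; labels are fixed (one per class).
x_syn = torch.randn(n_distill, d, requires_grad=True)
y_syn = torch.arange(n_distill) % n_classes

def model(x, w):
    # A linear classifier keeps the sketch self-contained.
    return x @ w

outer_opt = torch.optim.SGD([x_syn], lr=0.1)
inner_lr = 0.01

# Random tensors standing in for a batch of the original training set.
x_real = torch.randn(256, d)
y_real = torch.randint(0, n_classes, (256,))

for step in range(100):
    # Sample a fresh random initialization of the model weights.
    w0 = torch.randn(d, n_classes, requires_grad=True)

    # Inner step: one SGD step on the distilled data, keeping the graph
    # (create_graph=True) so second-order gradients can flow.
    inner_loss = F.cross_entropy(model(x_syn, w0), y_syn)
    (grad_w,) = torch.autograd.grad(inner_loss, w0, create_graph=True)
    w1 = w0 - inner_lr * grad_w

    # Outer step: evaluate the one-step-trained model on real data and
    # backpropagate through the inner update into the distilled images.
    outer_loss = F.cross_entropy(model(x_real, w1), y_real)
    outer_opt.zero_grad()
    outer_loss.backward()
    outer_opt.step()
```

The paper additionally learns the inner learning rate and unrolls more than one gradient step; the single-step loop above is only the smallest version that exercises the gradient path from the real-data loss back to the synthetic examples.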

Cited by 108 publications (231 citation statements)
References 17 publications

Citation statements:

“…In such a case, the proposed method limits the number of trajectories to be imitated by excluding some of the various trajectories as noise, and thus it can be trained by a standard implementation. This can be interpreted as the dataset distillation [43] in the loss function stage implicitly.…”
Section: Other Applications Of the Proposed Methods (mentioning)
confidence: 99%
“…Also, unlike the work [13,40] which violates the rule that clients should never share data to other clients or the server, zero-shot data augmentation synthesize data based on the model information only. Note that using synthesized samples for data augmentation differs from related works like [9], which take an approach similar to dataset distillation [32] to synthesize data for the purpose of compressing model updates for communication efficiency purposes.…”
Section: Zero-shot Data Augmentation (mentioning)
confidence: 99%
“…Even though, it is non-trivial to develop a suitable framework and to design an appropriate training objective that can effectively update the distilled data towards the original data. Inspired by the previous works [1,18], we made efforts to conduct a function where the input is the distilled data and output is a well-trained model, so that the gradients can be backpropagated to the distilled data via this function when using the original data to train the model.…”
Section: Approach (mentioning)
confidence: 99%
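
The last excerpt describes exactly this differentiable map from distilled data to a trained model. Below is a hedged sketch of what such a function can look like, again with a hypothetical linear model and placeholder tensors rather than the cited papers' actual code: unrolling a few SGD steps with the autograd graph kept alive makes the trained weights a differentiable function of the distilled inputs, so a real-data loss computed on those weights sends gradients back to the distilled data.

```python
import torch
import torch.nn.functional as F

def model(x, w):
    # Minimal linear classifier, used only to keep the sketch self-contained.
    return x @ w

def train_on_distilled(x_syn, y_syn, w0, inner_lr=0.01, inner_steps=5):
    # A differentiable "function" from distilled data to a trained model:
    # each unrolled SGD step keeps the computation graph (create_graph=True),
    # so the returned weights remain a function of x_syn.
    w = w0
    for _ in range(inner_steps):
        loss = F.cross_entropy(model(x_syn, w), y_syn)
        (grad_w,) = torch.autograd.grad(loss, w, create_graph=True)
        w = w - inner_lr * grad_w
    return w

# Hypothetical shapes and placeholder data for illustration.
d, n_classes = 784, 10
x_syn = torch.randn(10, d, requires_grad=True)
y_syn = torch.arange(10)
x_real = torch.randn(256, d)
y_real = torch.randint(0, n_classes, (256,))

w0 = torch.randn(d, n_classes, requires_grad=True)
w_trained = train_on_distilled(x_syn, y_syn, w0)

# Loss of the trained model on original data; backward() fills x_syn.grad,
# which an outer optimizer would then use to update the distilled data.
outer_loss = F.cross_entropy(model(x_real, w_trained), y_real)
outer_loss.backward()
```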