2019 | DOI: 10.1007/s11432-018-9944-x
Snapshot boosting: a fast ensemble framework for deep neural networks

Abstract: Boosting has been proven effective at improving the generalization of machine learning models in many fields. It can produce high-diversity base learners and yield an accurate ensemble model by combining a sufficient number of weak learners. However, it is rarely used in deep learning because of the high training budget of neural networks. Another method, snapshot ensemble, can significantly reduce the training budget, but it is hard to balance the tradeoff between training costs and diversity…

Cited by 43 publications (14 citation statements) · References 22 publications
“…The core of snapshot ensemble is an optimization process, which visits multiple local minima before converging to the final minimum. Saving the model parameters derived from these different local minima is equivalent to taking a snapshot of the model at these different local minima [24,25].…”
Section: Principles Of Snapshot Ensemble
Mentioning confidence: 99%
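The statement above describes the snapshot-ensemble training loop: a cyclic learning rate drives the model into several local minima, and the parameters are saved at each of them. Below is a minimal sketch of that idea, assuming a generic PyTorch model, data loader, and loss function; the names `model`, `loader`, `n_cycles`, and the cosine schedule are illustrative choices, not the cited paper's exact code.

```python
# Sketch: cyclic (cosine-annealed) learning rate with a snapshot at the end of each cycle.
import copy
import math
import torch

def snapshot_train(model, loader, loss_fn, epochs=60, n_cycles=6, lr_max=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr_max, momentum=0.9)
    epochs_per_cycle = epochs // n_cycles
    snapshots = []
    for epoch in range(epochs):
        # Cosine-annealed learning rate that restarts at the start of every cycle.
        t = (epoch % epochs_per_cycle) / epochs_per_cycle
        lr = 0.5 * lr_max * (1 + math.cos(math.pi * t))
        for g in opt.param_groups:
            g["lr"] = lr
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        # End of a cycle: the model sits near a local minimum, so take a snapshot.
        if (epoch + 1) % epochs_per_cycle == 0:
            snapshots.append(copy.deepcopy(model.state_dict()))
    return snapshots
```

The saved `snapshots` are the base learners that are later combined into the ensemble.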
“…Such methods are sometimes seen as regularization approaches and can work in coordination with our proposed method. Recently, the checkpoint ensemble has become increasingly popular as it improves predictors "for free" [Huang et al., 2017a, Zhang et al., 2020]. It was termed "Horizontal Voting" in [Xie et al., 2013], where the outputs of the checkpoints are straightforwardly ensembled as the final prediction.…”
Section: Related Work
Mentioning confidence: 99%
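The "Horizontal Voting" described above simply averages the predictions of the saved checkpoints at test time. A minimal sketch follows, assuming checkpoints produced by a loop like the one above; `make_model` (a factory returning a fresh model instance) and `snapshots` are hypothetical names for illustration, not the authors' API.

```python
# Sketch: ensemble prediction by averaging the softmax outputs of saved checkpoints.
import torch

@torch.no_grad()
def ensemble_predict(make_model, snapshots, x):
    probs = None
    for state in snapshots:
        model = make_model()
        model.load_state_dict(state)
        model.eval()
        p = torch.softmax(model(x), dim=1)
        probs = p if probs is None else probs + p
    return probs / len(snapshots)  # averaged class probabilities
```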
“…Similarly, another method, FGE (Fast Geometric Ensembling) [Garipov et al., 2018], copies a trained model and further fine-tunes it with a cyclical learning rate, saving checkpoints and ensembling them with the trained model. More recently, [Zhang et al., 2020] proposed Snapshot Boosting, which modifies the learning-rate restarting rules and sets different sample weights during each training stage to further enhance the diversity of checkpoints. Although Snapshot Boosting uses training-sample weights, these weights are updated only after the learning rate is restarted, and each update begins from the initialization.…”
Section: Related Work
Mentioning confidence: 99%
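The key point in the excerpt is that Snapshot Boosting re-weights training samples only at learning-rate restarts, i.e. once per training stage. Below is a hedged sketch of that structure: the AdaBoost-style exponential re-weighting rule and the full-batch update are placeholders chosen for brevity, not the paper's exact update, and `loss_fn` is assumed to return per-sample losses (e.g. `nn.CrossEntropyLoss(reduction="none")`).

```python
# Sketch: stage-wise training with sample weights updated only after each learning-rate restart.
import copy
import torch

def snapshot_boosting(model, xs, ys, loss_fn, n_stages=5, epochs_per_stage=10, lr=0.1):
    n = xs.size(0)
    weights = torch.full((n,), 1.0 / n)  # uniform sample weights at the start
    snapshots = []
    for stage in range(n_stages):
        opt = torch.optim.SGD(model.parameters(), lr=lr)  # learning-rate restart
        for _ in range(epochs_per_stage):
            opt.zero_grad()
            losses = loss_fn(model(xs), ys)        # per-sample losses
            (weights * losses).sum().backward()    # weighted training objective
            opt.step()
        snapshots.append(copy.deepcopy(model.state_dict()))
        # Weights are updated only here, after the restart, from the current snapshot.
        with torch.no_grad():
            wrong = (model(xs).argmax(dim=1) != ys).float()
        weights = weights * torch.exp(wrong)       # up-weight misclassified samples
        weights = weights / weights.sum()
    return snapshots
```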
“…Meanwhile, to improve the self-supervising quality of the distilled knowledge, we propose to generate a more powerful teacher by ensembling multi-scale knowledge from the students. Diversity, a key factor in constructing an effective ensemble teacher [33,34], is enhanced by optimizing individual reception-aware graph knowledge for each student. For better self-supervising quantity, each student can get sufficient supervision from the reception-aware graph knowledge, the task-specific knowledge, and the rich distilled knowledge (soft labels) from a powerful teacher model.…”
Section: Introduction
Mentioning confidence: 99%
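At a high level, the excerpt builds an ensemble teacher from the students' own outputs and distills each student toward it. The sketch below shows only that generic pattern (average the student logits to form soft labels, then apply a temperature-scaled KL distillation loss); it does not attempt the cited paper's reception-aware graph knowledge, and the function name and temperature are illustrative assumptions.

```python
# Sketch: ensemble-teacher distillation from multiple student outputs.
import torch
import torch.nn.functional as F

def ensemble_distill_loss(student_logits, T=3.0):
    """student_logits: list of [batch, classes] tensors, one per student."""
    teacher = torch.stack(student_logits, dim=0).mean(dim=0).detach()  # ensemble teacher logits
    teacher_prob = F.softmax(teacher / T, dim=1)
    loss = 0.0
    for logits in student_logits:
        log_p = F.log_softmax(logits / T, dim=1)
        loss = loss + F.kl_div(log_p, teacher_prob, reduction="batchmean") * (T * T)
    return loss / len(student_logits)
```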