2021
DOI: 10.1007/978-3-030-86523-8_8
Joslim: Joint Widths and Weights Optimization for Slimmable Neural Networks

Cited by 12 publications (7 citation statements)
References 27 publications
“…Another approach to data augmentation is synthetic data generation using generative models [41]-[43]. In addition, some conditional computation approaches strive to improve model performance with special routing techniques, such as using information gain [15] or knowledge distillation from a larger model [44], [45]. Adding multi-path routing approaches to the CIGT technique can be considered part of the latter group.…”
Section: Related Work, A. Conditional Computing
confidence: 99%
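The statement above names knowledge distillation from a larger model as one of the performance-boosting techniques. As a minimal sketch of that idea (the helper name, temperature, and loss weighting are illustrative assumptions, not taken from the cited papers):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with a soft-target KL term (hypothetical helper)."""
    soft_targets = F.softmax(teacher_logits / T, dim=1)   # teacher probabilities at temperature T
    log_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between student and teacher distributions, scaled by T^2
    # so gradient magnitudes stay comparable across temperatures.
    kd_term = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```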
“…Setting 2: Pruning from scratch. In this setting, the network is trained from scratch [4,6,32,40,58,59]. During each mini-batch iteration, sub-networks in the allowable channel configuration space in Sec.…”
Section: Property
confidence: 99%
“…5. After the training, an optimized search method is used to seek the candidate networks [4,6,58]. A recent work also incorporates the search phase into the training phase by penalizing parameters in the rebuilt network, achieving faster convergence [32].…”
Section: Property
confidence: 99%
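The two statements above describe training a weight-shared network from scratch while sampling sub-networks from the allowable channel-configuration space at every mini-batch, then searching over candidates afterwards. A minimal sketch of the per-iteration sampling step, assuming a slimmable model that exposes a hypothetical set_width(w) switch and an assumed width grid:

```python
import random

WIDTHS = [0.25, 0.5, 0.75, 1.0]  # assumed allowable width multipliers

def train_step(model, optimizer, criterion, images, labels, n_sub=2):
    """One mini-batch update over the full network plus random sub-networks."""
    optimizer.zero_grad()
    for w in [1.0] + random.sample(WIDTHS[:-1], n_sub):
        model.set_width(w)            # activate the sub-network of width w (assumed API)
        loss = criterion(model(images), labels)
        loss.backward()               # gradients accumulate across sampled widths
    optimizer.step()
```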
“…A complexity measure can also be defined in terms of the "model size"; it therefore depends on the target ML algorithm to be optimized (e.g., the number of neurons in one layer (Juang & Hsu, 2014), the number of support vectors in an SVM (Bouraoui et al., 2018), the DNN file size (Shinozaki et al., 2020), or the ensemble size for ensemble algorithms (Garrido & Hernández, 2019)). Alternatively, the number of floating point operations (FLOPs) can be used (Chin et al., 2020; Elsken et al., 2019; Lu et al., 2020; Wang et al., 2019, 2020). This metric is also used to reflect energy consumption (Han, Pool, Tran, & Dally, 2015), and alongside the number of parameters in the network (Smithson et al., 2016).…”
Section: Description References
confidence: 99%
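Since the passage above treats FLOPs as the complexity measure, a short worked example may help: for a 2-D convolution the count is 2 · C_in · K² · C_out · H_out · W_out (one multiply and one add per multiply-accumulate). The layer shapes below are illustrative assumptions:

```python
def conv2d_flops(c_in, c_out, kernel_size, h_out, w_out):
    """FLOPs of one Conv2d forward pass, bias and padding effects ignored."""
    macs = c_in * kernel_size ** 2 * c_out * h_out * w_out
    return 2 * macs  # one multiply + one add per multiply-accumulate

# Example: a 3x3 conv mapping 64 -> 128 channels on a 56x56 output map.
print(conv2d_flops(64, 128, 3, 56, 56))  # 462,422,016 FLOPs, i.e. ~0.46 GFLOPs
```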
“…Clearly, the number of costly function evaluations in a typical metamodel-based optimization is much lower than in a metaheuristics-based algorithm, as usually only a single new solution is evaluated in each iteration. Nevertheless, the metamodel-based algorithms by Chin et al. (2020) (row 7 in Table 6) require surprisingly many evaluations, since the authors dedicate additional evaluations to further improving the solutions found. The metaheuristic-based algorithm by Pathak et al. (2020) (row 9 in Table 6) can be particularly expensive, as it performs a chaotic local search to generate N additional solutions for each solution present in the population of a given iteration.…”
Section: Conclusion and Research Opportunities
confidence: 99%
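To make the evaluation-count comparison above concrete, here is a minimal sketch of a metamodel-based loop that spends one costly evaluation per iteration, using a Gaussian-process surrogate with a simple optimistic acquisition; expensive_eval and the candidate grid are hypothetical placeholders, not the algorithm of Chin et al. (2020):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def surrogate_optimize(expensive_eval, candidates, n_init=5, n_iter=20, seed=0):
    """Maximize expensive_eval over a fixed candidate set with a GP surrogate."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(candidates), size=n_init, replace=False)
    X = candidates[idx]
    y = np.array([expensive_eval(x) for x in X])            # initial costly evaluations
    for _ in range(n_iter):
        gp = GaussianProcessRegressor().fit(X, y)           # cheap surrogate of the objective
        mu, sigma = gp.predict(candidates, return_std=True)
        best = np.argmax(mu + sigma)                        # UCB-like optimistic pick
        X = np.vstack([X, candidates[best]])
        y = np.append(y, expensive_eval(candidates[best]))  # one costly eval per iteration
    return X[np.argmax(y)], y.max()
```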