2018
DOI: 10.48550/arxiv.1802.03268
Preprint

Efficient Neural Architecture Search via Parameter Sharing

Abstract: We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. In ENAS, a controller discovers neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on a validation set. Meanwhile the model corresponding to the selected subgraph is trained to minimize a canonical cross entropy loss. Sharing parameters among ch…
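
The abstract describes an alternation between two training phases: the shared child-model weights are trained with cross entropy on subgraphs sampled by the controller, and the controller is trained with policy gradient (REINFORCE) to maximize a validation reward. The toy sketch below illustrates that alternation under heavy simplifications and is not the paper's implementation: the controller is reduced to a single learned logit vector over three candidate operations (the paper uses an LSTM controller over a much larger search space), and the dataset and operations are invented for illustration.

```python
# Toy sketch of the ENAS-style alternation (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Invented toy data: classify points by the sign of their coordinate sum.
X = torch.randn(512, 8)
y = (X.sum(dim=1) > 0).long()
X_train, y_train = X[:384], y[:384]
X_val, y_val = X[384:], y[384:]

# Shared child-model parameters: one module per candidate operation.
ops = nn.ModuleList([
    nn.Sequential(nn.Linear(8, 16), nn.ReLU()),
    nn.Sequential(nn.Linear(8, 16), nn.Tanh()),
    nn.Linear(8, 16),
])
head = nn.Linear(16, 2)

# "Controller": one learned logit per candidate op (a stand-in for the paper's LSTM).
arch_logits = nn.Parameter(torch.zeros(len(ops)))

w_opt = torch.optim.Adam(list(ops.parameters()) + list(head.parameters()), lr=1e-2)
c_opt = torch.optim.Adam([arch_logits], lr=5e-2)
baseline = 0.0  # moving-average reward baseline for REINFORCE

def child_forward(op_idx, x):
    # The sampled subgraph: one chosen operation followed by the shared classifier head.
    return head(ops[op_idx](x))

for step in range(200):
    # Phase 1: train the shared weights with cross entropy on a sampled subgraph.
    op_idx = torch.distributions.Categorical(logits=arch_logits).sample().item()
    loss = F.cross_entropy(child_forward(op_idx, X_train), y_train)
    w_opt.zero_grad()
    loss.backward()
    w_opt.step()

    # Phase 2: train the controller with REINFORCE on validation accuracy.
    dist = torch.distributions.Categorical(logits=arch_logits)
    sample = dist.sample()
    with torch.no_grad():
        preds = child_forward(sample.item(), X_val).argmax(dim=1)
        reward = (preds == y_val).float().mean().item()
    baseline = 0.9 * baseline + 0.1 * reward
    controller_loss = -(reward - baseline) * dist.log_prob(sample)
    c_opt.zero_grad()
    controller_loss.backward()
    c_opt.step()

print("most likely operation:", arch_logits.argmax().item())
```

Because every candidate operation keeps its own shared weights, each sampled subgraph reuses weights trained in earlier steps instead of being trained from scratch, which is the source of the reduced search cost reported in the paper.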

Cited by 282 publications (593 citation statements)
References 11 publications
“…For example, searching an image model for CIFAR-10 and ImageNet required 2000 GPU days of reinforcement learning (RL) [46] or 3150 GPU days of evolution [28]. ENAS [24] introduced a parameter-sharing strategy to reduce the search time. Recent differentiable NAS (DNAS) methods [20] introduced the softmax-based continuous relaxation of the architecture representation, allowing efficient search using gradient descent.…”
Section: Neural Architecture Search Methods
confidence: 99%
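
The excerpt above mentions the softmax-based continuous relaxation used by differentiable NAS methods [20]. A rough sketch of that idea follows (not the cited implementation; the candidate operations and sizes are illustrative): each edge computes a softmax-weighted sum of all candidate operations, so the architecture parameters become ordinary tensors that gradient descent can optimize alongside the weights.

```python
# Minimal sketch of a softmax-relaxed "mixed" operation (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Illustrative candidate operations for one edge of the search space.
        self.candidates = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()),
            nn.Sequential(nn.Linear(dim, dim), nn.Tanh()),
            nn.Identity(),
        ])
        # Architecture parameters: one logit per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.candidates)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.alpha, dim=0)  # continuous relaxation of the discrete choice
        return sum(w * op(x) for w, op in zip(weights, self.candidates))

edge = MixedOp(dim=16)
out = edge(torch.randn(4, 16))        # differentiable w.r.t. edge.alpha
chosen = edge.alpha.argmax().item()   # discretize after search: keep the top-weighted op
```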
“…Recently, machine-designed architectures by Neural Architecture Search (NAS) have surpassed the human-designed ones for image recognition [28,24,33,44]. For video action recognition, the latest work X3D [9] placed a new milestone in this line: it progressively expanded a hand-crafted 2D architecture into 3D spatial-temporal ones, by expanding along multiple axes, including space, time, width, and depth.…”
Section: Introduction
confidence: 99%
“…In this paper, we seek to answer the following questions in the context of iterative structured pruning with rewinding: (Han, Mao, and Dally 2015; Kadetotad et al. 2016), knowledge distillation (Polino, Pascanu, and Alistarh 2018; Yim et al. 2017), neural architecture search (Zoph and Le 2016; Pham et al. 2018) and pruning (Li et al. 2016; Han, Mao, and Dally 2015; Srinivas and Babu 2015; Molchanov et al. 2016). There has also been substantial work on manually designing new model topologies, like MobileNet (Howard et al. 2017) and EfficientNet (Tan and Le 2019), that are suitable for edge-device deployment but are less accurate than traditional models like ResNet (He et al. 2016).…”
Section: Our Solution
confidence: 99%
“…As deep learning becomes pervasive and moves towards edge devices, DNN deployment becomes harder because of the mismatch between resource-hungry DNNs and resource-constrained edge devices (Li, Zhou, and Chen 2018; Li et al. 2019). Deep learning researchers and practitioners have proposed many techniques to alleviate this resource pressure (Chu, Funderlic, and Plemmons 2003; Han, Mao, and Dally 2015; Polino, Pascanu, and Alistarh 2018; Yim et al. 2017; Pham et al. 2018). Among these efforts, DNN pruning is a promising approach (Li et al. 2016; Han, Mao, and Dally 2015; Molchanov et al. 2016; Theis et al. 2018; Renda, Frankle, and Carbin 2020), which identifies the parameters (or weight elements) that do not contribute significantly to the accuracy and prunes them from the network.…”
Section: Introduction
confidence: 99%
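
The excerpt above describes pruning as identifying and removing weights that contribute little to accuracy. A common concrete instance of that idea is magnitude pruning: zero out the weights with the smallest absolute values and keep binary masks so they stay pruned across later updates. The sketch below is a generic illustration of that heuristic, not the specific algorithm of any cited paper; the model and sparsity level are made up.

```python
# Generic magnitude-pruning sketch (illustrative only).
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float) -> dict:
    """Zero out the `sparsity` fraction of weights with the smallest |w|; return the masks."""
    weight_tensors = [p for name, p in model.named_parameters() if name.endswith("weight")]
    all_magnitudes = torch.cat([p.detach().abs().flatten() for p in weight_tensors])
    k = int(sparsity * all_magnitudes.numel())
    threshold = all_magnitudes.kthvalue(k).values if k > 0 else torch.tensor(0.0)

    masks = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name.endswith("weight"):
                mask = (p.abs() > threshold).float()
                p.mul_(mask)        # prune in place
                masks[name] = mask  # reapply after each optimizer step to keep weights pruned
    return masks

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
masks = magnitude_prune(model, sparsity=0.5)
```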
“…Conducting NAS normally requires a search space, a search algorithm and a set of training data. Current NAS research mainly focuses on improving the search algorithms [20,34,1,35], designing the search space [21,7,30], reducing the search cost [3,11,19,4,15] and integrating direct metrics with the search process [27,8].…”
Section: Related Work
confidence: 99%