2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
DOI: 10.1109/cvpr.2018.00368

Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks

Abstract: We propose to focus on the problem of discovering neural network architectures efficient in terms of both prediction quality and cost. For instance, our approach is able to solve the following tasks: learn a neural network able to predict well in less than 100 milliseconds or learn an efficient model that fits in a 50 Mb memory. Our contribution is a novel family of models called Budgeted Super Networks (BSN). They are learned using gradient descent techniques applied on a budgeted learning objective function …
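To make the idea of a budgeted learning objective concrete, here is a minimal sketch in PyTorch of one common way to express such an objective: a task loss plus a penalty that activates when the expected cost of the selected architecture exceeds the budget. The penalty form, the sigmoid edge-selection probabilities, and all names are illustrative assumptions, not the exact formulation used in the BSN paper.

```python
import torch
import torch.nn.functional as F

def budgeted_loss(logits, targets, edge_logits, edge_costs, budget, lam=1.0):
    """Illustrative budgeted objective: task loss plus a penalty whenever the
    expected architecture cost (e.g. FLOPs, latency, or memory) exceeds the budget.

    edge_logits : learnable per-edge selection scores of the super network
    edge_costs  : fixed cost of keeping each edge (same shape as edge_logits)
    budget      : maximum allowed expected cost
    """
    task_loss = F.cross_entropy(logits, targets)
    # Probability of keeping each edge; expected cost is the cost-weighted sum.
    keep_prob = torch.sigmoid(edge_logits)
    expected_cost = (keep_prob * edge_costs).sum()
    # Hinge-style penalty: only pay when the expected cost exceeds the budget.
    over_budget = torch.clamp(expected_cost - budget, min=0.0)
    return task_loss + lam * over_budget
```

In practice the trade-off weight lam (or a constrained variant of the objective) controls how strictly the budget, e.g. a latency or memory limit, is enforced during training.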

Cited by 75 publications (64 citation statements) · References 8 publications

Citation statements (ordered by relevance):
“…Macro search algorithms aim to directly discover the entire neural networks [5,4,38,46,21]. To search convolutional neural networks (CNNs) [20], typical approaches apply RL to optimize the searching policy to discover architectures [1,5,46,31].…”
Section: Related Work
confidence: 99%
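As a rough illustration of the RL-based search mentioned in the statement above, the sketch below trains a small RNN controller with REINFORCE to sample a sequence of layer choices. The controller layout, the placeholder reward, and all helper names are assumptions made for illustration, not the setup of any cited method.

```python
import torch
import torch.nn as nn

class ControllerRNN(nn.Module):
    """Toy RNN controller that emits one operation choice per layer."""
    def __init__(self, num_ops=5, hidden=64, num_layers=6):
        super().__init__()
        self.num_layers = num_layers
        self.embed = nn.Embedding(num_ops, hidden)
        self.cell = nn.LSTMCell(hidden, hidden)
        self.head = nn.Linear(hidden, num_ops)

    def sample(self):
        h = torch.zeros(1, self.cell.hidden_size)
        c = torch.zeros(1, self.cell.hidden_size)
        x = torch.zeros(1, self.embed.embedding_dim)
        ops, log_probs = [], []
        for _ in range(self.num_layers):
            h, c = self.cell(x, (h, c))
            dist = torch.distributions.Categorical(logits=self.head(h))
            op = dist.sample()
            ops.append(int(op))
            log_probs.append(dist.log_prob(op))
            x = self.embed(op)
        return ops, torch.stack(log_probs).sum()

def evaluate_architecture(ops):
    # Placeholder reward: in practice, train the sampled child network briefly
    # and return its validation accuracy.
    return 1.0 / (1.0 + sum(ops))

controller = ControllerRNN()
optimizer = torch.optim.Adam(controller.parameters(), lr=3e-4)
ops, log_prob = controller.sample()
reward = evaluate_architecture(ops)
baseline = 0.5  # a moving average of past rewards is typical
loss = -(reward - baseline) * log_prob  # REINFORCE: raise log-prob of good samples
optimizer.zero_grad()
loss.backward()
optimizer.step()
```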
“…As discussed, our model allows to dynamically set the trade-off between accuracy and inference time with no additional cost. …different state-of-the-art CNN acceleration strategies [17,19,22,38,56]. We consider methods applying pruning at different levels, such as independent filters (Network slimming [38]), groups of weights (CondenseNet) [19], connections in multi-branch architectures (SuperNet) [56], or a combination of them (SSS) [22].…”
Section: Comparison With the State Of The Art
confidence: 99%
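For intuition, the snippet below shows one simple instance of "pruning at the level of independent filters": ranking a convolution's output filters by L1 norm and masking out the weakest ones. It is a generic illustration with assumed names, not the specific procedure of Network Slimming, CondenseNet, SuperNet, or SSS.

```python
import torch

def prune_conv_filters(conv: torch.nn.Conv2d, keep_ratio: float) -> torch.Tensor:
    """Rank the output filters of a Conv2d layer by the L1 norm of their weights
    and return a boolean mask keeping only the strongest fraction of them."""
    # One L1 score per output filter (dim 0 of the weight tensor).
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    num_keep = max(1, int(keep_ratio * scores.numel()))
    keep_idx = torch.topk(scores, num_keep).indices
    mask = torch.zeros_like(scores, dtype=torch.bool)
    mask[keep_idx] = True
    return mask  # apply by zeroing or physically removing the masked-out filters

# Example: keep the 50% strongest filters of a layer.
layer = torch.nn.Conv2d(64, 128, kernel_size=3, padding=1)
mask = prune_conv_filters(layer, keep_ratio=0.5)
```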
“…Our work is motivated by differentiable architecture search [34,37,26,1], which is based on the continuous relaxation of the architecture representation, allowing efficient search with gradient descent. [34,37] propose a grid-like network as the search space, while [26] relax the search space to be continuous and search the space by solving a bilevel optimization problem. Other works in architecture search employ reinforcement learning [4,49], evolutionary algorithms [31,39,25], and sequential model-based optimization [29,24] to search the discrete space.…”
Section: Architecture Search
confidence: 99%
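As a concrete picture of the "continuous relaxation of the architecture representation" mentioned above, the sketch below implements a DARTS-style mixed operation in PyTorch: each edge outputs a softmax-weighted sum of candidate operations, so the architecture parameters alpha receive gradients like ordinary weights. The candidate operation set and all names are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Continuous relaxation of a discrete edge choice: the edge output is a
    softmax-weighted sum of candidate operations, so the architecture weights
    alpha can be optimized by gradient descent alongside the network weights."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        # One architecture parameter per candidate operation on this edge.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# Example usage; after search, the edge is typically discretized by keeping
# the operation with the largest alpha.
out = MixedOp(16)(torch.randn(2, 16, 8, 8))
```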