UNAS: Differentiable Architecture Search Meets Reinforcement Learning

Vahdat, Arash; Mallya, Arun; Li, Mingyu; Kautz, Jan

doi:10.48550/arxiv.1912.07651

Cited by 1 publication

(1 citation statement)

References 20 publications

“…For example, while EfficientNets vastly outperform ResNets in terms of theoretical training efficiency, they have often been found to underperform when considering practical training efficiency on GPUs (Lee et al, 2020). Some recent work has used NAS to optimise practical efficiency on GPUs (Cai et al, 2018; Vahdat et al, 2019; Lin et al, 2020). For the presented work, we prioritised hand-engineered solutions, but do not rule out NAS methods in future work.…”

Section: Efficient CNNs (mentioning)

confidence: 99%

Making EfficientNet More Efficient: Exploring Batch-Independent Normalization, Group Convolutions and Reduced Resolution Training

Masters, Labatie, Eaton-Rosen et al. 2021

Preprint

Much recent research has been dedicated to improving the efficiency of training and inference for image classification. This effort has commonly focused on explicitly improving theoretical efficiency, often measured as ImageNet validation accuracy per FLOP. These theoretical savings have, however, proven challenging to achieve in practice, particularly on high-performance training accelerators. In this work, we focus on improving the practical efficiency of the state-of-the-art EfficientNet models on a new class of accelerator, the Graphcore IPU. We do this by extending this family of models in the following ways: (i) generalising depthwise convolutions to group convolutions; (ii) adding proxy-normalized activations to match batch normalization performance with batch-independent statistics; (iii) reducing compute by lowering the training resolution and inexpensively fine-tuning at higher resolution. We find that these three methods improve the practical efficiency for both training and inference. Our code will be made available online.
