2021
DOI: 10.1007/978-3-030-86383-8_44
EPE-NAS: Efficient Performance Estimation Without Training for Neural Architecture Search

Abstract: Neural Architecture Search (NAS) has shown excellent results in designing architectures for computer vision problems. NAS alleviates the need for human-defined settings by automating architecture design and engineering. However, NAS methods tend to be slow, as they require large amounts of GPU computation. This bottleneck is mainly due to the performance estimation strategy, which requires the evaluation of the generated architectures, mainly by training them, to update the sampler method. In this paper, we pr…


Cited by 34 publications (21 citation statements) · References 19 publications
“…In this section, we further provide a comparison between our proposed ZiCo and more recently proposed proxies: KNAS, NASWOT (Lopes et al. (2021)), GradSign (Zhang & Jia (2022)), and NTK-based methods (TE-NAS, Chen et al. (2021b); NASI, Shu et al. (2022a)). To compute the correlations, we use the official code released by the authors of the above papers to obtain the values. Table 4: The correlation coefficients between various zero-cost proxies and two naive proxies (#Params and FLOPs) vs. test accuracy on NATS-Bench-SSS and NATS-Bench-TSS (KT and SPR represent Kendall's τ and Spearman's ρ, respectively).…”
Section: E Supplementary Results on NAS Benchmarks, E.1 Comparison With… (mentioning)
confidence: 99%
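The KT and SPR values referenced in the quoted Table 4 are rank correlations between proxy scores and final test accuracy. As a minimal illustration (the proxy scores and accuracies below are made-up stand-ins, not benchmark data), both can be computed with SciPy:

```python
import numpy as np
from scipy import stats

# Hypothetical example: how well a zero-cost proxy ranks a handful of
# sampled architectures relative to their final test accuracy.
proxy_scores = np.array([0.12, 0.48, 0.33, 0.90, 0.57, 0.21])
test_accuracy = np.array([71.2, 88.0, 80.5, 93.1, 85.0, 74.8])

kt, _ = stats.kendalltau(proxy_scores, test_accuracy)   # Kendall's tau
spr, _ = stats.spearmanr(proxy_scores, test_accuracy)   # Spearman's rho
print(f"KT = {kt:.3f}, SPR = {spr:.3f}")
```

A high KT/SPR means the proxy can rank architectures almost as well as full training would, which is the whole premise of zero-cost NAS.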
“…Moreover, the Zen-score approximates the gradient w.r.t. feature maps and measures the complexity of neural networks. Furthermore, Jacob_cov leverages the Jacobian matrix between the loss and multiple input samples to quantify the capacity to model complex functions (Lopes et al. (2021)).…”
Section: Zero-Shot NAS (mentioning)
confidence: 99%
“…Model-based predictors predict the final validation accuracy of an architecture based on its encodings [23,30,61]. Zero-cost proxies look at architectures at the initialization stage and calculate statistics that correlate with the architecture's final validation accuracy [6,39,45]. White et.…”
Section: Related Work (mentioning)
confidence: 99%
“…The validation accuracy is used as an indication of whether the architecture is capable of learning from the small number of examples shown, which can ultimately be used to distinguish architectures that can be trained efficiently from those that cannot. The proposed method then looks at the capability of the untrained architecture, at the initialization stage, to model complex functions through Jacobian analysis [39,45]. For this, one can define a mapping from the input x_i ∈ R^D through the network, w(x_i), where x_i represents an image that belongs to a batch X, and D is the input dimension.…”
Section: Performance Estimation Mechanism (mentioning)
confidence: 99%
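The Jacobian-based mapping described in the quote above can be sketched concretely. This is a minimal illustration under stated assumptions (a toy two-layer ReLU network with random weights stands in for a real architecture; this is not the authors' implementation): the Jacobian of w(x_i) with respect to x_i is computed analytically for each input in a mini-batch, and correlations between the flattened per-input Jacobians are the raw material for EPE-NAS-style scoring.

```python
import numpy as np

# Toy network w(x) = W2 @ relu(W1 @ x); weights and inputs are random stand-ins.
rng = np.random.default_rng(0)
D, H, O, B = 8, 16, 4, 5            # input dim, hidden dim, output dim, batch size
W1 = rng.standard_normal((H, D))
W2 = rng.standard_normal((O, H))
X = rng.standard_normal((B, D))     # a batch of B inputs x_i in R^D

def jacobian(x):
    # For w(x) = W2 @ relu(W1 @ x), dw/dx = W2 @ diag(1[W1 @ x > 0]) @ W1
    mask = (W1 @ x > 0).astype(float)
    return W2 @ (mask[:, None] * W1)          # shape (O, D)

# One flattened Jacobian row per input; scoring methods of this family then
# examine how correlated these rows are across the mini-batch.
J = np.stack([jacobian(x).ravel() for x in X])  # shape (B, O*D)
C = np.corrcoef(J)                              # (B, B) correlation matrix
print(C.shape)  # (5, 5)
```

Intuitively, an architecture whose Jacobian rows are nearly identical across inputs cannot distinguish those inputs well, while more decorrelated rows suggest greater capacity to model complex functions.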
“…One of the main problems is that training a model to evaluate its performance is time-consuming, resulting in huge computation costs that can take days even when using hundreds of GPUs [22]. Therefore, methods exist to bypass this learning phase and evaluate an architecture from metrics based on the distribution of cell activations relative to the different input values gathered in a mini-batch, such as Mellor's metric [22] and the proposal of Lopes et al. [23].…”
Section: Introduction (mentioning)
confidence: 99%
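Mellor's metric [22], mentioned in the quote above, scores an untrained network by how differently a mini-batch of inputs activates its ReLU units: each input induces a binary activation code, and inputs whose codes differ more are easier for the network to tell apart. A rough sketch under stated assumptions (random binary codes stand in for real activation patterns; this is not the official implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
B, N_A = 8, 32                                # batch size, number of ReLU units
codes = rng.standard_normal((B, N_A)) > 0     # stand-in binary activation codes

# Hamming-similarity kernel: K[i, j] = N_A - hamming_distance(code_i, code_j),
# i.e. the number of ReLU units on which inputs i and j agree.
hamming = (codes[:, None, :] != codes[None, :, :]).sum(axis=2)
K = N_A - hamming

# The score is the log-determinant of the kernel: higher means the batch of
# inputs produces more distinguishable activation patterns.
sign, logdet = np.linalg.slogdet(K.astype(float))
score = logdet
print(round(score, 3))
```

Because no gradient steps are taken, this kind of score costs a single forward pass per architecture, which is what makes the days-of-GPU-training phase avoidable.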