2021
DOI: 10.1007/978-3-030-86383-8_44
EPE-NAS: Efficient Performance Estimation Without Training for Neural Architecture Search

Abstract: Neural Architecture Search (NAS) has shown excellent results in designing architectures for computer vision problems. NAS alleviates the need for human-defined settings by automating architecture design and engineering. However, NAS methods tend to be slow, as they require large amounts of GPU computation. This bottleneck is mainly due to the performance estimation strategy, which requires the evaluation of the generated architectures, mainly by training them, to update the sampler method. In this paper, we pr…


Cited by 34 publications (21 citation statements) · References 19 publications
“…In this section, we further provide a comparison between our proposed ZiCo and more recently proposed proxies: KNAS, NASWOT (Lopes et al. (2021)), GradSign (Zhang & Jia (2022)), and NTK-based methods (TE-NAS, Chen et al. (2021b); NASI, Shu et al. (2022a)). To compute the correlations, we use the official code released by the authors of the above papers to obtain the values. Table 4: The correlation coefficients between various zero-cost proxies and two naive proxies (#Params and FLOPs) vs. test accuracy on NATS-Bench-SSS and NATS-Bench-TSS (KT and SPR represent Kendall's τ and Spearman's ρ, respectively).…”
Section: E Supplementary Results on NAS Benchmarks, E.1 Comparison With… (mentioning)
confidence: 99%
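The KT and SPR values referenced in the quoted Table 4 are rank correlations between proxy scores and final test accuracy. As a minimal illustration (the proxy scores and accuracies below are made-up stand-ins, not benchmark data), both can be computed with SciPy:

```python
import numpy as np
from scipy import stats

# Hypothetical example: how well a zero-cost proxy ranks a handful of
# sampled architectures relative to their final test accuracy.
proxy_scores = np.array([0.12, 0.48, 0.33, 0.90, 0.57, 0.21])
test_accuracy = np.array([71.2, 88.0, 80.5, 93.1, 85.0, 74.8])

kt, _ = stats.kendalltau(proxy_scores, test_accuracy)   # Kendall's tau
spr, _ = stats.spearmanr(proxy_scores, test_accuracy)   # Spearman's rho
print(f"KT = {kt:.3f}, SPR = {spr:.3f}")
```

A high KT/SPR means the proxy can rank architectures almost as well as full training would, which is the whole premise of zero-cost NAS.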
“…Moreover, the Zen-score approximates the gradient w.r.t. feature maps and measures the complexity of neural networks. Furthermore, Jacob_cov leverages the Jacobian matrix between the loss and multiple input samples to quantify the capacity to model complex functions (Lopes et al. (2021)).…”
Section: Zero-Shot NAS (mentioning)
confidence: 99%
“…Model-based predictors predict the final validation accuracy of an architecture based on its encodings [23,30,61]. Zero-cost proxies look at architectures at the initialization stage and calculate statistics that correlate with the architecture's final validation accuracy [6,39,45]. White et.…”
Section: Related Work (mentioning)
confidence: 99%
“…The validation accuracy is used as an indication of whether the architecture is capable of learning from the small number of examples shown, which can ultimately be used to distinguish architectures that can be trained efficiently from those that cannot. The proposed method then looks at the capability of the untrained architecture, at the initialization stage, to model complex functions through Jacobian analysis [39,45]. For this, one can define a mapping from the input x_i ∈ R^D through the network, w(x_i), where x_i represents an image that belongs to a batch X, and D is the input dimension.…”
Section: Performance Estimation Mechanism (mentioning)
confidence: 99%
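The Jacobian-based mapping described in the quote above can be sketched concretely. This is a minimal illustration under stated assumptions (a toy two-layer ReLU network with random weights stands in for a real architecture; this is not the authors' implementation): the Jacobian of w(x_i) with respect to x_i is computed analytically for each input in a mini-batch, and correlations between the flattened per-input Jacobians are the raw material for EPE-NAS-style scoring.

```python
import numpy as np

# Toy network w(x) = W2 @ relu(W1 @ x); weights and inputs are random stand-ins.
rng = np.random.default_rng(0)
D, H, O, B = 8, 16, 4, 5            # input dim, hidden dim, output dim, batch size
W1 = rng.standard_normal((H, D))
W2 = rng.standard_normal((O, H))
X = rng.standard_normal((B, D))     # a batch of B inputs x_i in R^D

def jacobian(x):
    # For w(x) = W2 @ relu(W1 @ x), dw/dx = W2 @ diag(1[W1 @ x > 0]) @ W1
    mask = (W1 @ x > 0).astype(float)
    return W2 @ (mask[:, None] * W1)          # shape (O, D)

# One flattened Jacobian row per input; scoring methods of this family then
# examine how correlated these rows are across the mini-batch.
J = np.stack([jacobian(x).ravel() for x in X])  # shape (B, O*D)
C = np.corrcoef(J)                              # (B, B) correlation matrix
print(C.shape)  # (5, 5)
```

Intuitively, an architecture whose Jacobian rows are nearly identical across inputs cannot distinguish those inputs well, while more decorrelated rows suggest greater capacity to model complex functions.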
“…One of the main problems is that training a model to evaluate its performance is time-consuming, resulting in huge computation costs that can take days even when using hundreds of GPUs [22]. Therefore, methods exist to bypass this learning phase and evaluate an architecture from metrics based on the distribution of cell activations relative to the different input values gathered in a mini-batch, such as Mellor's metric [22] and the proposal of Lopes et al. [23].…”
Section: Introduction (mentioning)
confidence: 99%
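Mellor's metric [22], mentioned in the quote above, scores an untrained network by how differently a mini-batch of inputs activates its ReLU units: each input induces a binary activation code, and inputs whose codes differ more are easier for the network to tell apart. A rough sketch under stated assumptions (random binary codes stand in for real activation patterns; this is not the official implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
B, N_A = 8, 32                                # batch size, number of ReLU units
codes = rng.standard_normal((B, N_A)) > 0     # stand-in binary activation codes

# Hamming-similarity kernel: K[i, j] = N_A - hamming_distance(code_i, code_j),
# i.e. the number of ReLU units on which inputs i and j agree.
hamming = (codes[:, None, :] != codes[None, :, :]).sum(axis=2)
K = N_A - hamming

# The score is the log-determinant of the kernel: higher means the batch of
# inputs produces more distinguishable activation patterns.
sign, logdet = np.linalg.slogdet(K.astype(float))
score = logdet
print(round(score, 3))
```

Because no gradient steps are taken, this kind of score costs a single forward pass per architecture, which is what makes the days-of-GPU-training phase avoidable.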