2022
DOI: 10.1007/s10489-021-02802-8

Graph pruning for model compression

Abstract: Previous AutoML pruning works utilized individual layer features to automatically prune filters. We analyze the correlation between two layers from different blocks that have a short-cut structure. It is found that, within a block, the deeper layer has many redundant filters that can be represented by filters in the earlier layer, so it is necessary to take information from other layers into consideration when pruning. In this paper, a graph pruning approach is proposed, which views any deep model as a topology g…
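
The redundancy observation in the abstract can be made concrete with a small sketch. The helper below is a hypothetical illustration, not the paper's method: it checks how well each filter of the deeper convolution in a block can be linearly reconstructed from the filters of an earlier convolution, so a small relative residual for a filter suggests the kind of cross-layer redundancy the abstract describes. The layer sizes and the least-squares criterion are assumptions made for illustration.

import torch
import torch.nn as nn

def filter_redundancy(earlier: nn.Conv2d, deeper: nn.Conv2d) -> torch.Tensor:
    """Per-filter relative reconstruction error of `deeper` from `earlier`."""
    # Flatten each output filter into a row vector: (out_channels, in_channels * k * k).
    A = earlier.weight.detach().flatten(1)   # basis: earlier layer's filters
    B = deeper.weight.detach().flatten(1)    # targets: deeper layer's filters
    # Least squares: find coefficients X so that A.T @ X approximates B.T.
    X = torch.linalg.lstsq(A.T, B.T).solution
    residual = (A.T @ X - B.T).norm(dim=0) / B.norm(dim=1).clamp_min(1e-12)
    return residual  # small entries -> filter is well represented by the earlier layer

# Example with two same-width 3x3 convolutions, as in a typical residual block.
conv_a = nn.Conv2d(64, 64, kernel_size=3, padding=1)
conv_b = nn.Conv2d(64, 64, kernel_size=3, padding=1)
print(filter_redundancy(conv_a, conv_b))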

Cited by 7 publications (6 citation statements)
References: 48 publications
“…We refer to [54] for a more comprehensive NAS review. Block-wise weight-sharing NAS [36,46,83,84] approaches factorize the supernet into independently optimized blocks and thus reduce the weight-sharing space, resolving the issue of inaccurate architecture ratings caused by weight-sharing. DNA [36] first introduced the block-wisely supervised architecture rating scheme with knowledge distillation.…”
Section: Related Work
confidence: 99%
“…Based on this scheme, DONNA [46] further proposes to predict an architecture rating using a linear combination of its block-wise ratings rather than a simplistic sum. SP [83] was the first to apply this scheme to network pruning. However, all of the aforementioned methods rely on a supervised distillation scheme, which inevitably introduces architectural bias from the teacher.…”
Section: Related Work
confidence: 99%
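
The "linear combination rather than a simplistic sum" idea quoted above can be illustrated with a toy sketch. This is not DONNA's code, and every number below is made up: the idea is to fit per-block weights by least squares on a few candidates whose end-to-end accuracy has been measured, then rate new candidates with the fitted weights.

import numpy as np

# Rows: candidate architectures; columns: block-wise ratings (made-up values).
block_ratings = np.array([[0.61, 0.72, 0.55],
                          [0.58, 0.80, 0.60],
                          [0.70, 0.65, 0.62],
                          [0.52, 0.77, 0.58]])
measured_acc = np.array([0.741, 0.763, 0.755, 0.739])  # hypothetical accuracies

# Fit per-block weights plus a bias by least squares on the measured candidates.
X = np.hstack([block_ratings, np.ones((len(block_ratings), 1))])
coef, *_ = np.linalg.lstsq(X, measured_acc, rcond=None)

simplistic_sum = block_ratings.sum(axis=1)  # the baseline the quote contrasts with
linear_rating = X @ coef                    # learned linear-combination predictor
print(linear_rating)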
“…H. Zhu and Suyog Gupta compared the performance of large-sparse models and small-dense models on a wide variety of datasets and showed that the large-sparse models outperform the small-dense ones. Zhang et al., in their paper Graph Pruning for Model Compression, discuss filter pruning used in conjunction with graph convolution [3]. Filter pruning is a method in which selected filters are removed and a narrower model is rebuilt.…”
Section: Related Work
confidence: 99%
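
As a concrete reading of "selected filters are removed and a narrower model is rebuilt", here is a minimal sketch; it is an assumed PyTorch example, not code from the cited paper or from [3]. It keeps a chosen subset of output filters in one convolution and rebuilds both that layer and the following one with matching, narrower channel counts; the L1-norm selection criterion in the example is only one common choice.

import torch
import torch.nn as nn

def prune_filters(conv: nn.Conv2d, next_conv: nn.Conv2d, keep: torch.Tensor):
    """Rebuild `conv` with only the output filters indexed by `keep`,
    and narrow the input channels of `next_conv` to match."""
    pruned = nn.Conv2d(conv.in_channels, len(keep), conv.kernel_size,
                       conv.stride, conv.padding, bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()

    narrowed = nn.Conv2d(len(keep), next_conv.out_channels, next_conv.kernel_size,
                         next_conv.stride, next_conv.padding,
                         bias=next_conv.bias is not None)
    narrowed.weight.data = next_conv.weight.data[:, keep].clone()
    if next_conv.bias is not None:
        narrowed.bias.data = next_conv.bias.data.clone()
    return pruned, narrowed

# Example: keep the 32 filters with the largest L1 norm out of 64.
conv1, conv2 = nn.Conv2d(3, 64, 3, padding=1), nn.Conv2d(64, 128, 3, padding=1)
keep = conv1.weight.detach().abs().sum(dim=(1, 2, 3)).topk(32).indices
conv1_pruned, conv2_narrowed = prune_filters(conv1, conv2, keep)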

Model Compression. Ishtiaq, Mahmood, Anees, et al. 2021. Preprint.