Jiangwei Shang scite author profile

Jiangwei Shang

5 Publications

12 Citation Statements Received

40 Citation Statements Given

Affiliations

Harbin Institute of Technology, Southeast University

Publications

LACS: A High-Computational-Efficiency Accelerator for CNNs

Shang, Qian², Zhang et al. 2020

IEEE Access

Convolutional neural networks (CNNs) continue to grow deeper. As their depth increases, the invalid calculations caused by padding-zero operations, filling-zero operations, and stride lengths greater than 1 account for a growing proportion of all calculations. To adapt to different CNNs and to eliminate the influence of padding-zero operations, filling-zero operations, and stride length on the computational efficiency of the accelerator, we draw on the computation pattern of CPUs to design an efficient and versatile CNN accelerator, LACS (Loading-Addressing-Computing-Storing). A bypass buffer mechanism reduces the amount of data movement between registers and the on-chip buffer from O(k × k) to O(k). We deploy LACS on a field-programmable gate array (FPGA) chip, analyze the factors that affect its computational efficiency, and run popular CNNs on it. The results show that LACS achieves a computational efficiency of 98.51% when executing AlexNet and 99.66% when executing VGG-16, significantly exceeding state-of-the-art accelerators.

Index Terms: Accelerator, convolutional neural networks (CNNs), field-programmable gate array (FPGA), buffer mechanism.
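
The abstract's claim that padding-related invalid calculations grow as feature maps shrink can be illustrated with a rough count. The sketch below is not taken from the paper: it assumes a plain 2D convolution whose multiply-accumulates are issued for every kernel tap, including taps that land on zero padding, and the function names `valid_taps_1d` and `padding_mac_fraction` are hypothetical. It does not model filling-zero operations or the LACS design itself.

```python
def valid_taps_1d(n, k, pad, stride):
    """Return (output length, number of kernel taps that hit real input)
    along one axis of size n, for kernel size k, zero padding pad, and stride."""
    out = (n + 2 * pad - k) // stride + 1
    valid = 0
    for o in range(out):
        start = o * stride - pad          # first input index covered by the window
        lo = max(start, 0)                # clip the window to the real input range
        hi = min(start + k, n)
        valid += max(hi - lo, 0)
    return out, valid


def padding_mac_fraction(h, w, k, pad, stride=1):
    """Fraction of multiply-accumulates that touch zero padding
    (assumption: every kernel tap is computed, padded or not)."""
    out_h, valid_h = valid_taps_1d(h, k, pad, stride)
    out_w, valid_w = valid_taps_1d(w, k, pad, stride)
    total = out_h * out_w * k * k
    valid = valid_h * valid_w             # rows and columns are independent
    return 1.0 - valid / total


if __name__ == "__main__":
    # Same 3x3 kernel with padding 1, applied to progressively smaller feature maps.
    for size in (224, 56, 14, 7):
        print(size, round(padding_mac_fraction(size, size, 3, 1), 4))
```

Under these assumptions the wasted fraction is small for large inputs and rises for the small feature maps typical of deep layers, which is the trend the abstract describes.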

A high-performance convolution block oriented accelerator for MBConv-Based CNNs

Shang, Zhang², Zhang et al. 2023

Integration

ANNA: Accelerating Neural Network Accelerator through software-hardware co-design for vertical applications in edge systems

Zhang², Li³ et al. 2023

Future Generation Computer Systems

A Software/Hardware Co-design Local Irregular Sparsity Method for Accelerating CNNs on FPGA

Shang, Zhang et al. 2022

Efficient Object Detection in SAR Images Based on Computation-Aware Neural Architecture Search

et al. 2022

Applied Sciences

Remote sensing techniques are becoming more sophisticated as radar imaging techniques mature. Synthetic aperture radar (SAR) can now provide high-resolution images for day-and-night earth observation. Detecting objects in SAR images is increasingly playing a significant role in a series of applications. In this paper, we address an edge detection problem that applies to scenarios with ship-like objects, where the detection accuracy and efficiency must be considered together. The key to ship detection lies in feature extraction. To efficiently extract features, many existing studies have proposed lightweight neural networks by pruning well-known models in the computer vision field. We found that although different baseline models have been tailored, a large amount of computation is still required. In order to achieve a lighter neural network-based ship detector, we propose Darts_Tiny, a novel differentiable neural architecture search model, to design dedicated convolutional neural networks automatically. Darts_Tiny is customized from Darts. It prunes superfluous operations to simplify the search model and adopts a computation-aware search process to enhance the detection efficiency. The computation-aware search process not only integrates a scheme cutting down the number of channels on purpose but also adopts a synthetic loss function combining the cross-entropy loss and the amount of computation. Comprehensive experiments are conducted to evaluate Darts_Tiny on two open datasets, HRSID and SSDD. Experimental results demonstrate that our neural networks win by at least an order of magnitude in terms of model complexity compared with SOTA lightweight models. 
A representative model obtained from Darts_Tiny (158 KB model volume, 28 K parameters and 0.58 G computations) achieves a detection speed of more than 750 frames per second on 800×800 SAR images when tested on a platform equipped with an Nvidia Tesla V100 and an Intel Xeon Platinum 8260. The lightweight neural networks generated by Darts_Tiny remain competitive in detection accuracy: the F1 score exceeds 83 on HRSID and 90 on SSDD.
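
The abstract describes a synthetic loss that combines the cross-entropy loss with the amount of computation, but it does not give the formula. The following is a minimal sketch under stated assumptions, not the Darts_Tiny implementation: it assumes a weighted sum, with the computation term taken as the softmax-weighted expected cost of the candidate operations on one edge, as is common in differentiable architecture search. The names `computation_aware_loss`, `expected_cost`, `op_costs`, and the weight `lam` are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one example given raw class logits."""
    return -math.log(softmax(logits)[label])


def expected_cost(alpha, op_costs):
    """Expected computation of one edge: candidate-operation costs
    weighted by the softmax of the architecture parameters alpha."""
    weights = softmax(alpha)
    return sum(w * c for w, c in zip(weights, op_costs))


def computation_aware_loss(logits, label, alpha, op_costs, lam=0.1):
    """Assumed form: task loss plus lam times the expected computation."""
    return cross_entropy(logits, label) + lam * expected_cost(alpha, op_costs)


if __name__ == "__main__":
    # Two candidate operations with illustrative (made-up) relative costs.
    loss = computation_aware_loss(
        logits=[2.0, 0.5], label=0, alpha=[0.0, 0.0], op_costs=[1.0, 3.0]
    )
    print(loss)
```

In a formulation like this, a larger `lam` pushes the search toward cheaper operations at the expense of the task loss; how the paper balances the two terms is not stated in the abstract.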
