Convolutional neural networks (CNNs) keep getting deeper. As depth increases, the invalid calculations caused by padding-zero operations, filling-zero operations and stride lengths greater than 1 account for a growing proportion of all calculations. To adapt to different CNNs and to eliminate the influence of padding-zero operations, filling-zero operations and stride length on the accelerator's computational efficiency, we draw upon the computation pattern of CPUs to design an efficient and versatile CNN accelerator, LACS (Loading-Addressing-Computing-Storing). A bypass buffer mechanism reduces the amount of data movement between the registers and the on-chip buffer from O(k × k) to O(k). Finally, we deploy LACS on a field-programmable gate array (FPGA) chip, analyze the factors that affect its computational efficiency, and run popular CNNs on it. The results show that LACS achieves very high computational efficiency, 98.51% when executing AlexNet and 99.66% when executing VGG-16, significantly exceeding state-of-the-art accelerators.
INDEX TERMS: Accelerator, convolutional neural networks (CNNs), field-programmable gate array (FPGA), buffer mechanism.
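To make the notion of "invalid calculations" concrete, the following minimal sketch counts how many multiply-accumulates in a dense sliding-window convolution read a padded zero. It is an illustration only, not the LACS design: the function name `padded_mac_fraction` and its parameters are hypothetical, it assumes a square kernel and symmetric zero padding, and it does not model the filling-zero or stride effects that the abstract also mentions.

```python
def padded_mac_fraction(height, width, kernel, stride=1, padding=0):
    """Fraction of multiply-accumulates (per channel) that read a padded zero.

    Assumes a square kernel and symmetric zero padding on all four sides.
    """
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    total = out_h * out_w * kernel * kernel

    valid = 0
    for oy in range(out_h):
        for ox in range(out_w):
            # Top-left corner of the window in unpadded input coordinates.
            y0 = oy * stride - padding
            x0 = ox * stride - padding
            # Rows and columns of the window that overlap real input data.
            rows = min(y0 + kernel, height) - max(y0, 0)
            cols = min(x0 + kernel, width) - max(x0, 0)
            valid += max(rows, 0) * max(cols, 0)

    return (total - valid) / total


if __name__ == "__main__":
    # Same 3x3 kernel and padding of 1, applied to a large and a small map.
    for size in (224, 13):
        print(size, padded_mac_fraction(size, size, kernel=3, padding=1))
```

Under these assumptions the wasted fraction is below 1% for a 224×224 feature map but roughly 10% for a 13×13 map, which shows how padding overhead grows as feature maps shrink in later layers.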
Remote sensing techniques are becoming more sophisticated as radar imaging techniques mature. Synthetic aperture radar (SAR) can now provide high-resolution images for day-and-night earth observation, and detecting objects in SAR images plays an increasingly significant role in a range of applications. In this paper, we address an edge detection problem that applies to scenarios with ship-like objects, where detection accuracy and efficiency must be considered together. The key to ship detection lies in feature extraction. To extract features efficiently, many existing studies have proposed lightweight neural networks obtained by pruning well-known models from the computer vision field. We found that, although different baseline models have been tailored in this way, a large amount of computation is still required. To achieve a lighter neural network-based ship detector, we propose Darts_Tiny, a novel differentiable neural architecture search model that designs dedicated convolutional neural networks automatically. Darts_Tiny is customized from Darts. It prunes superfluous operations to simplify the search model and adopts a computation-aware search process to enhance detection efficiency. The computation-aware search process integrates a scheme that deliberately reduces the number of channels and adopts a synthetic loss function combining the cross-entropy loss and the amount of computation. Comprehensive experiments evaluate Darts_Tiny on two open datasets, HRSID and SSDD. The results demonstrate that our neural networks are at least an order of magnitude lower in model complexity than SOTA lightweight models. A representative model obtained from Darts_Tiny (158 KB model volume, 28 K parameters and 0.58 G computations) runs at more than 750 frames per second on 800×800 SAR images when tested on a platform equipped with an Nvidia Tesla V100 and an Intel Xeon Platinum 8260. The lightweight neural networks generated by Darts_Tiny remain competitive in detection accuracy: the F1 score exceeds 83 on HRSID and 90 on SSDD.
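The idea of a loss that combines cross-entropy with the amount of computation can be sketched as follows. This is only one plausible form, not necessarily the one used by Darts_Tiny: the abstract does not give the formula, so the weighted sum, the trade-off weight `lam`, the softmax-weighted expected cost per edge, and all function names here are assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of one sample given raw class logits and a target index."""
    return -math.log(softmax(logits)[target])


def expected_cost(alphas, op_costs):
    """Expected computation of one edge: candidate-op costs weighted by softmax(alphas)."""
    return sum(w * c for w, c in zip(softmax(alphas), op_costs))


def computation_aware_loss(logits, target, edge_alphas, op_costs, lam=0.1):
    """Assumed form: cross-entropy + lam * total expected computation over all edges."""
    total_cost = sum(expected_cost(a, op_costs) for a in edge_alphas)
    return cross_entropy(logits, target) + lam * total_cost


if __name__ == "__main__":
    # Two edges, each choosing between a cheap op (cost 1.0) and a costly op (cost 3.0).
    edge_alphas = [[0.0, 0.0], [2.0, -2.0]]
    print(computation_aware_loss([1.5, 0.2], 0, edge_alphas, [1.0, 3.0], lam=0.1))
```

In a sketch of this kind, a larger `lam` pushes the architecture parameters toward cheaper operations at some cost in classification accuracy. The costs in `op_costs` are in arbitrary units and would normally be scaled to be comparable with the cross-entropy term.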