2019
DOI: 10.1007/978-3-030-30709-7_26

Efficient Processing of Convolutional Neural Networks on SW26010

Abstract: Artificial intelligence has developed rapidly in recent years, and deep neural networks are the basis of many artificial intelligence applications, so accelerating their computation is important. To explore the potential for accelerating deep neural network processing on various hardware platforms, we propose a convolutional neural network optimization method based on the weight-stationary dataflow for the SW26010 processor. We reorder the convolution loops and use hybrid DMA transmi…
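As a rough illustration of the weight-stationary idea in the abstract: the filter weights stay resident in fast local memory (on SW26010, the LDM scratchpad of a compute processing element) and are reused across every output position, while input and output tiles are streamed through. The sketch below is a hypothetical plain-C rendering of that loop order, not the paper's kernel; the names and the single-output-channel simplification are assumptions.

```c
/* Illustrative weight-stationary direct convolution (single output channel,
 * unit stride, no padding). The key point: the C x K x K weight block is
 * loaded into fast local memory once and reused for every output pixel,
 * so only input/output tiles move per iteration. Layout is an assumption. */
void conv2d_weight_stationary(const float *in,   /* [C][H][W]      */
                              const float *w,    /* [C][K][K]      */
                              float *out,        /* [H-K+1][W-K+1] */
                              int C, int H, int W, int K)
{
    int OH = H - K + 1, OW = W - K + 1;
    for (int oh = 0; oh < OH; ++oh) {
        for (int ow = 0; ow < OW; ++ow) {
            float acc = 0.0f;
            /* weights stay resident; inputs are re-read per output pixel */
            for (int c = 0; c < C; ++c)
                for (int kh = 0; kh < K; ++kh)
                    for (int kw = 0; kw < K; ++kw)
                        acc += in[(c * H + oh + kh) * W + (ow + kw)]
                             * w[(c * K + kh) * K + kw];
            out[oh * OW + ow] = acc;
        }
    }
}
```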

Cited by 2 publications (8 citation statements)
References 5 publications

Citation statements:
“…To obtain higher peak performance, modern processors not only integrate more and more cores, but also widely apply superscalar technology [29]. Many studies [11], [13], [30], [31] have focused on instruction-level optimization methods based on superscalar technology. The same is true of the SW26010 processor, which has two pipelines (P0 and P1) on one CPE.…”
Section: Register Blocking
confidence: 99%
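The register-blocking context of that statement: with two issue pipelines (P0 and P1) per CPE, a loop needs several independent dependence chains in flight to keep both pipes fed. Below is a minimal hypothetical sketch on a plain dot product; the 4-way unroll factor and all names are illustrative assumptions, not the cited papers' code.

```c
/* Register-blocked dot product: four independent accumulators break the
 * single dependence chain so a dual-pipeline superscalar core can issue
 * two floating-point operations per cycle. Assumes n is a multiple of 4. */
float dot_blocked(const float *a, const float *b, int n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    for (int i = 0; i < n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    return (s0 + s1) + (s2 + s3);
}
```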
“…The execution time of CNNs becomes long and unacceptable as larger data sets and more complex CNNs emerge. Because convolution accounts for more than 90% of the total computation in CNNs [2], highly efficient convolution algorithms on many-core processors have become a popular research direction in academia and industry.…”
confidence: 99%
“…ShuffleFaceNet [11] is adapted from the efficient network ShuffleNetV2 [12]; similar to MobileFaceNet, it uses global depth-wise convolution to output the facial feature vector. Based on the variable group convolution proposed in VarGNet [13], VarGFaceNet [14] is a compact yet highly accurate FR model. Using recursive knowledge distillation, VarGFaceNet [14] achieved 99.85% accuracy on the LFW benchmark and obtained the best results in the ICCV 2019 LFR challenge DeepGlint-Light track [7].…”
Section: Introduction
confidence: 99%
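For context on the global depth-wise convolution mentioned above: it is a depth-wise convolution whose kernel size equals the spatial size of the feature map, so each channel is reduced to a single learned-weighted scalar (a trainable alternative to global average pooling). A minimal sketch under an assumed [C][H][W] layout; the names are illustrative, not code from the cited models.

```c
/* Global depth-wise convolution: one H x W filter per channel, with the
 * kernel covering the whole feature map, so each channel reduces to one
 * scalar. in: [C][H][W], w: [C][H][W], out: [C]. */
void global_depthwise_conv(const float *in, const float *w,
                           float *out, int C, int H, int W)
{
    for (int c = 0; c < C; ++c) {
        float acc = 0.0f;
        for (int i = 0; i < H * W; ++i)
            acc += in[c * H * W + i] * w[c * H * W + i];
        out[c] = acc;
    }
}
```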