2019
DOI: 10.48550/arxiv.1902.07463
Preprint

DNNVM: End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators

Cited by 2 publications (5 citation statements) · References 25 publications
“…However, depthwise separable convolution spends 95% of its computation time in Conv 1 × 1, which causes a large MAdds gap between two consecutive layers (Conv 1 × 1 and Conv DW 3×3) [12]. This gap is unfriendly to embedded systems that load all weights of the network to perform convolution [24]: embedded systems need extra buffers for Conv 1 × 1.…”
Section: Approach, 2.1 Variable Group Convolution
confidence: 99%
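For concreteness, here is a minimal Python sketch of the MAdds split this quote describes; all layer shapes are hypothetical example values, not taken from the cited papers:

```python
# Sketch (hypothetical shapes): why Conv 1x1 dominates the MAdds of a
# depthwise separable block.

def depthwise_separable_madds(h, w, c_in, c_out, k=3):
    """Multiply-adds for a depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    dw = h * w * c_in * k * k   # depthwise: one k x k filter per input channel
    pw = h * w * c_in * c_out   # pointwise: 1 x 1 conv mixing channels
    return dw, pw

dw, pw = depthwise_separable_madds(h=14, w=14, c_in=256, c_out=256)
total = dw + pw
print(f"DW 3x3: {dw:,} MAdds ({dw / total:.1%})")
print(f"PW 1x1: {pw:,} MAdds ({pw / total:.1%})")
# PW 1x1 accounts for ~96.6% of the MAdds at this shape, consistent with
# the ~95% figure quoted above.
```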
“…Communication between off-chip memory and on-chip memory happens only at the start and the end of block computing when a block is grouped and computed together on embedded systems [24]. To limit the communication cost, VarGNet sets the number of output channels to be the same as the number of input channels in the normal block.…”
Section: Blocks of Variable Group Network
confidence: 99%
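A toy cost model (hypothetical sizes, simplified accounting) showing why grouping a block's layers confines off-chip traffic to the block boundary, as the quote states:

```python
# Sketch: off-chip traffic for one block under layer-by-layer vs. grouped
# (fused) execution. feature_bytes[i] is the size of the i-th feature map;
# layer i maps feature_bytes[i] -> feature_bytes[i + 1].

def offchip_traffic(feature_bytes, fused):
    if fused:
        # Grouped execution: read the block input once, write the block output once.
        return feature_bytes[0] + feature_bytes[-1]
    # Per-layer execution: every layer reads its input from and writes its
    # output to off-chip memory.
    return sum(feature_bytes[i] + feature_bytes[i + 1]
               for i in range(len(feature_bytes) - 1))

# Example block: input, two intermediates, output (bytes per feature map).
feats = [100_000, 100_000, 100_000, 100_000]
print("per-layer:", offchip_traffic(feats, fused=False))  # 600,000 bytes
print("grouped:  ", offchip_traffic(feats, fused=True))   # 200,000 bytes
```

Keeping output channels equal to input channels, as VarGNet does, also keeps these on-chip buffer sizes uniform across the block.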
“…Typically, the size of convolutional kernels is much smaller than the size of feature maps, such as k²C for kernels and 2HWC for feature maps in 2D convolutions. In light of the above two properties, an ingenious solution is to load all the data of the kernels first and then perform the convolution, popping in and popping out feature data sequentially [48]. Such a practical solution is the second intuition for our following two guidelines for efficient network design on embedded systems:…”
Section: Designing Efficient Network on Embedded Systems
confidence: 99%
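A quick numeric check of the kernel-vs-feature-map comparison in this quote, using a hypothetical layer shape:

```python
# Sketch: compare the quote's per-filter kernel size (k^2 * C) against the
# combined input/output feature maps (2 * H * W * C) of a 2D convolution.

k, H, W, C = 3, 56, 56, 128            # hypothetical example shape
kernel_elems = k * k * C               # k^2 * C, as in the quote
feature_elems = 2 * H * W * C          # input + output feature maps, 2HWC
print(f"kernel: {kernel_elems:,}  features: {feature_elems:,}  "
      f"ratio: {feature_elems / kernel_elems:.0f}x")
# kernel: 1,152  features: 802,816  ratio: 697x -- feature maps dwarf the
# kernels, so it pays to keep all weights on-chip and stream feature data
# in and out sequentially.
```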
“…Also, in these blocks, residual connections [18] are widely adopted. So, in recent compiler-side optimizations [48], the layers in a block are usually grouped and computed together. In this manner, off-chip memory and on-chip memory communicate only when computation of a block starts or ends.…”
Section: Designing Efficient Network on Embedded Systems
confidence: 99%
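A minimal sketch (hypothetical API, not the cited compiler's actual interface) of grouping a residual block so intermediates never leave on-chip buffers:

```python
# Sketch: fused execution of a residual block. Only the block input and
# output cross the off-chip boundary; intermediates stay on-chip.

import numpy as np

def fused_residual_block(x, layers):
    """x: block input already DMA'd on-chip; layers: list of callables."""
    y = x
    for layer in layers:
        y = layer(y)          # intermediate results stay in on-chip buffers
    return y + x              # residual add [18], still on-chip

# Toy stages standing in for the conv/BN/ReLU layers of a block.
scale = lambda t: 0.5 * t
relu = lambda t: np.maximum(t, 0.0)

x_onchip = np.ones((8, 8, 16), dtype=np.float32)   # DMA in (block start)
out = fused_residual_block(x_onchip, [scale, relu])
# DMA out (block end): the only other off-chip transfer for this block.
```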