2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.01204

Butterfly Transform: An Efficient FFT Based Neural Architecture Design

Abstract: In this paper, we introduce the Butterfly Transform (BFT), a lightweight channel fusion method that reduces the computational complexity of point-wise convolutions from O(n^2) of conventional solutions to O(n log n) with respect to the number of channels, while improving the accuracy of the networks under the same range of FLOPs. The proposed BFT generalizes the Discrete Fourier Transform in a way that its parameters are learned at training time. Our experimental evaluations show that replacing channel fusion…
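The complexity claim in the abstract follows from the butterfly connectivity pattern itself: n channels are fused through log2(n) stages, and in each stage every channel passes through exactly one learned 2x2 mixing unit, giving O(n log n) multiply-adds per spatial position instead of the O(n^2) of a dense point-wise convolution. The snippet below is a minimal NumPy sketch of that pattern, not the authors' released implementation; the weight layout and pairing order are assumptions made for clarity.

# Minimal sketch of butterfly-style channel fusion (illustrative, not the paper's code).
import numpy as np

def butterfly_fusion(x, weights):
    """x: (n, h, w) feature map, n a power of two.
    weights: list of log2(n) arrays, each of shape (n // 2, 2, 2),
    holding one learned 2x2 mixing block per channel pair per stage."""
    n = x.shape[0]
    stages = int(np.log2(n))
    out = x.copy()
    for s in range(stages):
        stride = n >> (s + 1)            # pairing distance halves at each stage
        new = np.empty_like(out)
        pair = 0
        for start in range(0, n, 2 * stride):
            for i in range(start, start + stride):
                j = i + stride
                a, b = out[i], out[j]
                w = weights[s][pair]     # learned 2x2 block for this channel pair
                new[i] = w[0, 0] * a + w[0, 1] * b
                new[j] = w[1, 0] * a + w[1, 1] * b
                pair += 1
        out = new
    return out

# Usage: 8 channels -> 3 stages, 4 channel pairs (4 learned 2x2 blocks) per stage.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))
weights = [rng.standard_normal((4, 2, 2)) for _ in range(3)]
y = butterfly_fusion(x, weights)
print(y.shape)  # (8, 5, 5)

Each stage costs about 2n multiply-adds per spatial position, so the total is roughly 2n log2(n) rather than n^2, which is the source of the FLOP savings the abstract describes.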

Cited by 25 publications (23 citation statements)
References 34 publications
“…Residual connections have been proposed to connect the butterfly factors [Vahid et al, 2020]. We show that residual products of butterfly matrices have a first-order approximation as a sparse matrix with a fixed sparsity.…”
Section: Flat Butterfly Matrices (mentioning; confidence: 98%)
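The first-order claim in the statement above can be checked numerically: each butterfly factor B_s is block-sparse (it only mixes channel i with its stage-s partner), so the residual product over stages of (I + eps*B_s) expands to I + eps * sum_s B_s plus O(eps^2) cross terms, and the leading term is a sparse matrix whose sparsity pattern is fixed by the butterfly wiring. The snippet below is our own hedged illustration of that expansion, not code from either paper.

# Illustrative check of the first-order (fixed-sparsity) approximation.
import numpy as np

def butterfly_factor(n, stride, rng):
    # Block-sparse n x n factor: channel i is mixed only with its partner i XOR stride.
    B = np.zeros((n, n))
    for i in range(n):
        j = i ^ stride
        B[i, i], B[i, j] = rng.standard_normal(2)
    return B

rng = np.random.default_rng(0)
n, eps = 8, 1e-3
factors = [butterfly_factor(n, 1 << s, rng) for s in range(3)]   # strides 1, 2, 4

exact = np.eye(n)
for B in factors:                               # residual product (I + eps*B) applied stage by stage
    exact = (np.eye(n) + eps * B) @ exact

first_order = np.eye(n) + eps * sum(factors)    # sparse approximation with a fixed pattern
print(np.abs(exact - first_order).max())        # on the order of eps^2 (~1e-6)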
“…General compressing methods for DCNNs include: 1) Quantization: Although quantization methods don't reduce the number of operations, they can reduce the DCNN model and the computation cost by altering floating-point to fixed-point operations that use simple circuitry in hardware. DoReFaNet [24] and QKeras [25] are frameworks that allow quantizing both weights and feature maps (fmaps) to any […]…”
[Residue of the citing paper's Fig. 3 caption: … [3], (C) Butterfly Transform [17], (D) ShuffleNet [18], (E) SqueezeNet [19], (F) Low-Rank Expansion [20], (G) PermDNN [21], (H) CSC blocks (this work); Scheme-1, Scheme-2.]
Section: Related Work (mentioning; confidence: 99%)
“…Below each graph, the connectivity matrices, from left to right, represent the topology of the graph from right to left. Compact models such as Low-Rank Expansion [20], MobileNet [3], ShuffleNet [18], SqueezeNet [19], PermDNN [21], PermCNN [9], and Butterfly Transform [17], as illustrated in Fig. 3, introduce alternate pre-sparsified layers with a common intuition: in all of them, each model proposes a pre-defined factorization that can be equated to a standard DCNN layer.…”
Section: Related Work (mentioning; confidence: 99%)
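To make the "pre-defined factorization that can be equated to a standard DCNN layer" concrete, the sketch below uses the simplest of the listed schemes, a low-rank expansion in the spirit of [20]: a dense point-wise convolution W of size n x n is replaced by two thin factors so the layer keeps the same input/output shape while the per-pixel cost drops from n^2 to 2nr multiplies. The shapes and rank here are illustrative assumptions, not values taken from the cited papers.

# Illustrative low-rank factorization of a 1x1 convolution (assumed shapes).
import numpy as np

rng = np.random.default_rng(0)
n, r, h, w = 64, 8, 14, 14
x = rng.standard_normal((n, h * w))           # feature map flattened over spatial positions

W = rng.standard_normal((n, n))               # standard dense point-wise convolution
U, V = rng.standard_normal((n, r)), rng.standard_normal((r, n))

dense_out = W @ x                             # n^2 multiplies per pixel
lowrank_out = U @ (V @ x)                     # 2*n*r multiplies per pixel, same output shape
print(dense_out.shape, lowrank_out.shape)     # (64, 196) (64, 196)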
“…Second, the performance gains of CNNs might come at a high computational cost. While an abundance of computing resources might be available at the training phase of CNNs, the resulting inference engines may be deployed in settings such as the network edge [53] that are constrained in terms of computational resources and energy consumption and favor tight coupling with the RF circuits (sensing component) [54, 55, 56]. Unless addressed, the high computation and energy cost of CNNs might be a significant limiting factor towards broader adoption.…”
Section: Introduction (mentioning; confidence: 99%)