Proceedings of the 17th ACM International Conference on Computing Frontiers 2020
DOI: 10.1145/3387902.3394038

Enabling mixed-precision quantized neural networks in extreme-edge devices

Abstract: The deployment of Quantized Neural Networks (QNN) on advanced microcontrollers requires optimized software to exploit digital signal processing (DSP) extensions of modern instruction set architectures (ISA). As such, recent research proposed optimized libraries for QNNs (from 8-bit to 2-bit) such as CMSIS-NN and PULP-NN. This work presents an extension to the PULP-NN library targeting the acceleration of mixed-precision Deep Neural Networks, an emerging paradigm able to significantly shrink the memory footprin…
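To make the mixed-precision idea in the abstract concrete, the sketch below shows a dot product with 8-bit activations and 4-bit weights packed two per byte. This is an illustrative Python model, not the actual C kernels of PULP-NN or CMSIS-NN; the function names are hypothetical.

```python
def pack_int4(values):
    """Pack signed 4-bit values (range -8..7) two per byte, low nibble first."""
    packed = bytearray()
    for i in range(0, len(values), 2):
        lo = values[i] & 0xF
        hi = (values[i + 1] & 0xF) if i + 1 < len(values) else 0
        packed.append(lo | (hi << 4))
    return bytes(packed)

def unpack_int4(packed, count):
    """Unpack to signed ints, sign-extending each nibble."""
    out = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out[:count]

def mixed_precision_dot(acts_int8, weights_packed, count):
    """8-bit activations x 4-bit weights, wide (Python int) accumulator."""
    weights = unpack_int4(weights_packed, count)
    return sum(a * w for a, w in zip(acts_int8, weights))

# 4-bit weights halve the weight storage relative to int8:
w = [3, -2, 7, -8]
packed = pack_int4(w)           # 2 bytes instead of 4
acts = [10, 20, 30, 40]
print(mixed_precision_dot(acts, packed, 4))  # -> -120
```

The point of the paper's library extension is to perform the unpack and multiply-accumulate steps with DSP ISA instructions rather than element-by-element as modeled here.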


Cited by 23 publications (28 citation statements)
References 6 publications
“…By selecting the position of the fixed point, standard base-2 LNS can represent implicit bases such as 2^(1/8), 2^(1/4), 2^(1/2), 2, 4, 8, and so on. Remarkably, despite a very extensive literature on LNS over several decades, we have been unable to find a statement of this simple identity in the literature.…”
Section: Bases and Base Aliasing in LNS
confidence: 99%
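The base-aliasing identity this excerpt describes can be checked numerically: decoding a base-2 LNS code with f fractional bits, 2^(code / 2^f), is the same as raising the implicit base 2^(1/2^f) to the integer code. A small sketch (my own illustration, not from the cited paper):

```python
import math

def lns_decode(code, frac_bits):
    """Decode a base-2 LNS code with `frac_bits` fractional bits:
    value = 2 ** (code / 2**frac_bits)."""
    return 2.0 ** (code / (1 << frac_bits))

def implicit_base(frac_bits):
    """The base the integer code is implicitly raised to."""
    return 2.0 ** (1.0 / (1 << frac_bits))

# frac_bits = 3 gives the eighth root of 2 as the implicit base
assert math.isclose(implicit_base(3), 2 ** (1 / 8))
# Fixed-point decoding and the implicit-base reading agree
code = 11
assert math.isclose(lns_decode(code, 3), implicit_base(3) ** code)
```

Shifting the fixed point the other way (negative fractional bits) yields the integer bases 2, 4, 8 mentioned in the quote.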
“…Smaller word sizes can reduce the memory footprint of data and the complexity of arithmetic for a variety of number systems [9,14,17,33], which will have a significant impact on the ability to deploy systems in resource-constrained embedded devices at the edge of networks. Further, fixed-point number systems with very short word lengths have been proposed in the literature for a variety of signal processing applications [2,24], while shorter floating- and fixed-point numbers have also been used for neural networks [8,18,19,39,43,44].…”
Section: Introduction
confidence: 99%
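The footprint reduction from shorter word sizes is simple arithmetic; a minimal sketch (hypothetical helper, illustrative tensor shape):

```python
def tensor_bytes(num_elems, bits):
    """Bytes needed to store num_elems values at `bits` bits each,
    densely packed and rounded up to whole bytes."""
    return (num_elems * bits + 7) // 8

# A 3x3x64x64 convolution weight tensor at different precisions:
elems = 3 * 3 * 64 * 64
print(tensor_bytes(elems, 8))  # 36864 bytes at int8
print(tensor_bytes(elems, 4))  # 18432 bytes at int4
print(tensor_bytes(elems, 2))  # 9216 bytes at int2
```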
“…However, the extensions proposed in Garofalo et al. [10] only tackle part of the challenge, lacking support for mixed-precision operations. Mixed-precision execution requires data-conversion and packing/unpacking operations, leading to significant overheads if not natively supported by the underlying hardware [11]. When applied to DNNs, exploiting mixed-precision computations on state-of-the-art processors dramatically reduces the memory footprint, enabling the execution of MobileNets on tiny end-nodes.…”
Section: Introduction
confidence: 99%
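The packing/unpacking overhead this excerpt mentions can be made visible with 2-bit fields: each extraction in plain software takes a shift, a mask, and a sign-extend per element, steps that dedicated ISA extensions can fold into one instruction. A hedged Python model (function names are my own):

```python
def pack_int2(values):
    """Pack signed 2-bit values (-2..1) into one word, 2 bits each,
    lowest index in the least-significant bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0x3) << (2 * i)
    return word

def extract_int2(word, idx):
    """Pull out the idx-th signed 2-bit field: shift, mask,
    sign-extend -- three explicit operations per element."""
    field = (word >> (2 * idx)) & 0x3
    return field - 4 if field >= 2 else field

w = pack_int2([1, -2, 0, -1])
print([extract_int2(w, i) for i in range(4)])  # -> [1, -2, 0, -1]
```

Without native support, a kernel pays this per-element cost on every load, which is the overhead [11] quantifies.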
“…To alleviate the poor-performance problems, a number of studies have been undertaken to accelerate DNN implementations by designing hardware-accelerated intelligent computing architectures for sensing systems. Some studies exploit the properties of DNNs to reduce latency by using the parallelism of specialized acceleration circuit designs, such as [8][9][10][11][12][13][14]. Yet these works ignore that the overall power consumption exceeds the budget.…”
Section: Introduction
confidence: 99%