2021 IEEE International Conference on Omni-Layer Intelligent Systems (COINS)
DOI: 10.1109/coins51742.2021.9524173
A Microcontroller is All You Need: Enabling Transformer Execution on Low-Power IoT Endnodes

Cited by 19 publications (8 citation statements)
References 7 publications
“…We deployed the quantized models on the GAP8 SoC [15] using the optimized library of kernels (i.e., layer implementations) described in [44] for MHSA, and in [45] for convolutions. GAP8 is a commercial SoC from GreenWaves Technologies, which includes a controller unit called the fabric controller (FC), composed of a single RISC-V core that manages the peripherals and orchestrates program execution, and a cluster of 8 identical RISC-V cores (with a shared 64 kB scratchpad memory) that accelerates intensive workloads.…”
Section: Results
confidence: 99%
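The excerpt's memory hierarchy (a small shared scratchpad feeding an 8-core cluster) is the reason kernels on such SoCs are tiled. A minimal sketch of the idea, in plain C: the tile size, function name, and copy-in strategy below are illustrative assumptions, not the cited library's actual API.

```c
#include <stdint.h>

/* Illustrative sketch only: tile an int8 matrix multiply so that each
 * working set fits in a 64 kB cluster scratchpad, as on GAP8-class SoCs.
 * TILE and matmul_tile are hypothetical names, not the library in [44]/[45]. */
#define TILE 32   /* three 32x32 int8/int32 tile buffers fit well under 64 kB */

static void matmul_tile(const int8_t *A, const int8_t *B, int32_t *C, int n)
{
    /* C[n][n] += A[n][n] * B[n][n], processed tile by tile. */
    for (int bi = 0; bi < n; bi += TILE)
        for (int bj = 0; bj < n; bj += TILE)
            for (int bk = 0; bk < n; bk += TILE)
                /* On the real SoC, each tile would first be DMA-copied from
                 * L2 into the shared L1 scratchpad, and the i-loop would be
                 * split across the 8 cluster cores. */
                for (int i = bi; i < bi + TILE && i < n; i++)
                    for (int j = bj; j < bj + TILE && j < n; j++) {
                        int32_t acc = C[i * n + j];
                        for (int k = bk; k < bk + TILE && k < n; k++)
                            acc += A[i * n + k] * B[k * n + j];
                        C[i * n + j] = acc;
                    }
}
```

Keeping all three tile buffers resident in the scratchpad is what lets the inner loops run at core speed instead of stalling on L2 accesses.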
“…Instead of using special attention and transformer blocks, transformer knowledge distillation teaches a small transformer to mimic the behavior of a larger transformer, allowing up to 7.5× smaller and 9.4× faster inference over bidirectional encoder representations from transformers [118]. Customized data layout and loop reordering of each attention kernel, coupled with quantization, have allowed porting transformers onto microcontrollers [119] by minimizing computationally intensive data marshaling operations. The use of depthwise and pointwise convolution has been shown to yield autoencoder architectures as small as 2.7 kB for anomaly detection [120].…”
Section: F. Attention Mechanisms, Transformers, and Autoencoders
confidence: 99%
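The "customized data layout" point above can be made concrete: if K is kept in the same row-major-per-head layout as Q, the score matrix Q·Kᵀ becomes a row-by-row dot product and no explicit transpose (data-marshaling) pass is needed. A hedged sketch in plain C; the function and its int8/int32 arithmetic are illustrative assumptions, not the kernel from [119].

```c
#include <stdint.h>

/* Illustrative sketch, not the paper's kernel: int8 attention scores
 * S = Q * K^T for one head. Because K is stored row-major like Q, entry
 * S[i][j] is a dot product of two contiguous rows, so no transpose
 * (data-marshaling) step is required before the matmul. */
static void attn_scores(const int8_t *Q,  /* [seq][d], row-major */
                        const int8_t *K,  /* [seq][d], same layout as Q */
                        int32_t *S,       /* [seq][seq] raw scores */
                        int seq, int d)
{
    for (int i = 0; i < seq; i++)
        for (int j = 0; j < seq; j++) {
            int32_t acc = 0;
            for (int k = 0; k < d; k++)          /* both operands stride 1 */
                acc += (int32_t)Q[i * d + k] * (int32_t)K[j * d + k];
            S[i * seq + j] = acc;  /* requantization to int8 would follow */
        }
}
```

With int8 operands the inner loop also maps directly onto 8-bit SIMD dot-product instructions where the ISA provides them.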
“…Advances in both signal processing and neural network applications on edge devices are also emerging for new embedded platforms that integrate multiple cores in parallel, such as GreenWaves Technologies GAP-8 and GAP-9 (i.e., nine RISC-V cores) [145], to enable embedded machine learning in battery-operated IoT sensors, mainly focusing on the image-processing domain. Such systems-on-chip (SoCs) are among the most advanced low-power edge nodes available on the market, embodying the PULP architectural paradigm with DSP-enhanced RISC-V cores, while frameworks have been developed that exploit SoC features such as hardware loops, post-modified-access LD/ST, and SIMD instructions down to 8-bit vector operands [146] [147]. Additionally, to provide agility for a variety of different neural network techniques, a novel domain-specific Instruction Set Architecture (ISA) for NN accelerators, called Cambricon, has been proposed [148].…”
Section: B. Hardware-Assisted ML in IoT Devices
confidence: 99%
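To illustrate the "SIMD instructions down to 8-bit vector operands" mentioned above: PULP-style DSP extensions provide packed sum-of-dot-product operations that consume four int8 pairs per instruction. The sketch below emulates one in portable C; `sdotp4_emul` is a hypothetical stand-in for the hardware instruction, not an actual intrinsic name.

```c
#include <stdint.h>

/* Illustrative emulation of a 4-way int8 sum-of-dot-product, the kind of
 * packed-SIMD operation PULP-style DSP extensions expose as one instruction.
 * On the real core, the four multiplies and the accumulation retire together. */
static int32_t sdotp4_emul(const int8_t a[4], const int8_t b[4], int32_t acc)
{
    for (int i = 0; i < 4; i++)
        acc += (int32_t)a[i] * (int32_t)b[i];
    return acc;
}

/* An int8 dot product then advances 4 elements per emulated "instruction";
 * n is assumed to be a multiple of 4 for brevity. */
static int32_t dot_int8(const int8_t *a, const int8_t *b, int n)
{
    int32_t acc = 0;
    for (int i = 0; i + 4 <= n; i += 4)
        acc = sdotp4_emul(&a[i], &b[i], acc);
    return acc;
}
```

Combined with hardware loops (zero-overhead loop counters) and post-modified loads, this is how such cores sustain one 4-MAC operation per cycle on quantized NN layers.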