Hand gesture recognition applications based on surface electromyographic (sEMG) signals can benefit from on-device execution to achieve faster and more predictable response times and higher energy efficiency. However, deploying state-of-the-art deep learning (DL) models for this task on memory-constrained and battery-operated edge devices, such as wearables, requires a careful optimization process, both at design time, with an appropriate tuning of the DL models’ architectures, and at execution time, where the execution of large and computationally complex models should be avoided unless strictly needed. In this work, we pursue both optimization targets, proposing a novel gesture recognition system that improves upon state-of-the-art models in terms of both accuracy and efficiency. At the level of DL model architecture, we apply for the first time tiny transformer models (which we call bioformers) to sEMG-based gesture recognition. Through an extensive architecture exploration, we show that our most accurate bioformer achieves a higher classification accuracy on the popular Non-Invasive Adaptive hand Prosthetics Database 6 (Ninapro DB6) dataset compared to the state-of-the-art convolutional neural network (CNN) TEMPONet (+3.1%). When deployed on the RISC-V-based low-power system-on-chip (SoC) GAP8, bioformers that outperform TEMPONet in accuracy consume 7.8×–44.5× less energy per inference. At runtime, we propose a three-level dynamic inference approach that combines a shallow classifier, i.e., a random forest (RF) implementing a simple “rest detector”, with two bioformers of different accuracy and complexity, which are sequentially applied to each new input, stopping the classification early for “easy” data. With this mechanism, we obtain a flexible inference system, capable of working at many different operating points in terms of accuracy and average energy consumption. On GAP8, we obtain a further 1.03×–1.35× energy reduction compared to static bioformers at iso-accuracy.
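The following is a minimal sketch of the three-level dynamic inference cascade described above, not the paper’s actual implementation: it assumes an sklearn-style `predict` interface for the RF rest detector, callables returning softmax probabilities for the two bioformers, and illustrative values for the confidence threshold and the “rest” class index.

```python
# Hypothetical sketch of the three-level early-exit cascade.
# All names, the threshold, and the rest-class index are illustrative
# assumptions, not taken from the paper.

CONF_THRESHOLD = 0.85  # assumed early-exit confidence threshold
REST_CLASS = 0         # assumed label index for the "rest" gesture


def classify(emg_window, rest_detector, small_bioformer, large_bioformer):
    """Classify one sEMG window, stopping early for 'easy' inputs."""
    # Level 1: cheap RF rest detector filters out "rest" windows,
    # which are common and trivial to recognize.
    if rest_detector.predict(emg_window.reshape(1, -1))[0] == REST_CLASS:
        return REST_CLASS

    # Level 2: small, low-energy bioformer; exit early if it is
    # confident enough about its prediction.
    probs = small_bioformer(emg_window)  # softmax class probabilities
    if probs.max() >= CONF_THRESHOLD:
        return int(probs.argmax())

    # Level 3: the larger, more accurate bioformer handles only the
    # remaining "hard" inputs, so its energy cost is paid rarely.
    return int(large_bioformer(emg_window).argmax())
```

Tuning `CONF_THRESHOLD` would trade accuracy against average energy: a lower threshold stops more inputs at the small model, while a higher one forwards more of them to the large model.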