2020
DOI: 10.1145/3371157
3PXNet

Abstract: As the adoption of Neural Networks continues to proliferate across different classes of applications and systems, edge devices have been left behind. Their strict energy and storage limitations make them unable to cope with the sizes of common network models. While many compression methods, such as precision reduction and sparsity, have been proposed to alleviate this, they do not go far enough. To push size reduction to its absolute limits, we combine binarization with sparsity in Pruned-Permuted-Packed XNOR Netw…

Cited by 7 publications (10 citation statements) | References 41 publications
“…Based on XNOR-Net, Ref. [18] constructed a pruned-permuted-packed network that combines binarization with sparsity to push model size reduction to very low limits. On the Nucleo platforms and Raspberry Pi, 3PXNet achieves a reduction in the model size by up to 38× and an improvement in runtime and energy of 25× compared to already compact conventional binarized implementations with a reduction in accuracy of less than 3%.…”
Section: Algorithmic Techniques For Low-power Edge AI
confidence: 99%
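The pruned-permuted-packed idea in the statement above — storing only the 32-bit packs of binarized weights that survive pruning, and evaluating them with XNOR and popcount — can be sketched as follows. This is an illustrative reconstruction under assumed data layouts, not the 3PXNet authors' code; the function names and pack format are invented for the example.

```python
# Conceptual sketch (not the actual 3PXNet implementation): a pruned,
# packed binarized dot product. Only non-pruned 32-bit weight packs are
# stored, each tagged with its pack index.

PACK = 32
MASK = (1 << PACK) - 1

def pack_bits(bits):
    """Pack a list of +/-1 values into 32-bit words, 1 bit per value."""
    words = []
    for i in range(0, len(bits), PACK):
        w = 0
        for j, b in enumerate(bits[i:i + PACK]):
            if b > 0:
                w |= 1 << j
        words.append(w)
    return words

def sparse_xnor_dot(act_words, weight_packs, total_bits):
    """Binarized dot product over surviving packs only.

    act_words:    packed +/-1 activations, one word per 32 inputs
    weight_packs: list of (pack_index, packed_weight_word) pairs
    total_bits:   number of inputs covered by the surviving packs
    """
    pop = 0
    for idx, w in weight_packs:
        xnor = ~(act_words[idx] ^ w) & MASK  # 1 where signs agree
        pop += bin(xnor).count("1")
    # Each matching bit contributes +1, each mismatch -1.
    return 2 * pop - total_bits

# Example: 64 inputs, and the second weight pack is pruned away entirely,
# so only one (index, word) pair is stored and computed.
acts = pack_bits([1] * 64)
weights = [(0, pack_bits([1] * 32)[0])]
print(sparse_xnor_dot(acts, weights, 32))  # all 32 bits match -> 32
```

Storing whole packs rather than individual weights is what keeps the sparse representation compatible with word-level XNOR/popcount, which is where the reported runtime and energy gains on Nucleo and Raspberry Pi come from.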
“…Dedicated runtime libraries have also been developed, like ARM CMSIS-NN [43], TensorFlow Lite [21], or Microsoft EdgeML [29,41]. Pruning networks to high sparsity levels to save storage and computation has also been extensively explored [45,57].…”
Section: Accessibility Of Edge ML
confidence: 99%
“…Besides storage compression, binarization offers impressive performance improvements by replacing integer multiplication with bitwise XNOR operations [17,35,56]. Due to their efficiency, multiple software [34,50,54,66], and hardware [6,7,9,15,26,37,44,65] implementations of binarized neural networks have been proposed; as well as further optimizations, such as combining binarization with pruning [57], or memoization [48]. However, BNNs come with disadvantages, the most important of which is accuracy degradation [25,27].…”
Section: Accessibility Of Edge ML
confidence: 99%
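The statement above notes that binarization replaces integer multiplication with bitwise XNOR. The core identity is that for values constrained to +/-1, the product of two values is +1 exactly when their sign bits agree, so a dot product reduces to XNOR followed by a popcount. A minimal sketch (symbol names are invented for illustration):

```python
# For +/-1 values encoded as sign bits (1 -> +1, 0 -> -1), a dot product
# becomes XNOR + popcount: matches contribute +1, mismatches -1.

def dot_float(a, b):
    """Ordinary dot product on +/-1 vectors, for comparison."""
    return sum(x * y for x, y in zip(a, b))

def dot_xnor(a_bits, b_bits, n):
    """a_bits/b_bits: integers whose n low bits encode +1 (1) / -1 (0)."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n

a = [1, -1, 1, 1]          # bit i (from LSB) encodes a[i]
b = [1, 1, -1, 1]
a_bits = 0b1101
b_bits = 0b1011
assert dot_float(a, b) == dot_xnor(a_bits, b_bits, 4)
print(dot_xnor(a_bits, b_bits, 4))  # -> 0
```

On hardware, one XNOR-plus-popcount over a machine word replaces 32 or 64 multiply-accumulates, which is the source of the efficiency gains cited here; the accuracy degradation mentioned in the same statement is the price of constraining weights and activations to a single bit.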