Loom (LM), a hardware inference accelerator for Convolutional Neural Networks (CNNs), is presented. In LM every bit of data precision that can be saved translates into proportional performance gains. Specifically, for convolutional layers LM's execution time scales inversely proportionally with the precisions of both weights and activations, while for fully-connected layers it scales inversely proportionally with the precision of the weights. LM targets area- and bandwidth-constrained System-on-a-Chip designs, such as those found in mobile devices, that cannot afford the multi-megabyte buffers that would be needed to store each layer on-chip. Accordingly, given a data bandwidth budget, LM boosts energy efficiency and performance over an equivalent bit-parallel accelerator. For both weights and activations LM can exploit profile-derived per-layer precisions. However, at runtime LM further trims activation precisions at a granularity much finer than a layer. Moreover, it can naturally exploit weight precision variability at a finer granularity than a layer. On average, across several image classification CNNs and for a configuration that can perform the equivalent of 128 16b × 16b multiply-accumulate operations per cycle, LM outperforms a state-of-the-art bit-parallel accelerator [1] by 4.38× without any loss in accuracy while being 3.54× more energy efficient. LM can trade off accuracy for additional improvements in execution performance and energy efficiency, and it compares favorably to an accelerator that targets only activation precisions. We also study 2- and 4-bit LM variants and find that the 2-bit-per-cycle variant is the most energy efficient.
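As a rough illustration of the scaling claimed above (our own sketch, assuming an ideal 16-bit bit-parallel baseline and ignoring bandwidth limits, not a formula taken from the paper), the per-layer speedup would be approximately

    speedup_conv ≈ (16 × 16) / (P_w × P_a)        speedup_fc ≈ 16 / P_w

where P_w and P_a are the weight and activation precisions in bits; for example, P_w = P_a = 8 would give roughly (16 × 16) / (8 × 8) = 4× on a convolutional layer.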
Neural networks have demonstrably achieved state-of-the-art accuracy using low-bitlength integer quantization, yielding both execution time and energy benefits on existing hardware designs that support short bitlengths. However, the question of finding the minimum bitlength for a desired accuracy remains open. We introduce a training method for minimizing inference bitlength at any granularity while maintaining accuracy. Furthermore, we propose a regularizer that penalizes large bitlength representations throughout the architecture and show how it can be modified to minimize other quantifiable criteria, such as the number of operations or the memory footprint. We demonstrate that our method learns thrifty representations while maintaining accuracy. On ImageNet, the method produces average per-layer bitlengths of 4.13 and 3.76 bits on AlexNet and ResNet18 respectively, while remaining within 2.0% and 0.5% of the baseline TOP-1 accuracy.
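A minimal PyTorch-style sketch of the kind of learnable-bitlength regularizer the abstract describes is given below; the module name LearnedBitlengthQuant, the straight-through quantizer, and the exact form of the penalty are our illustrative assumptions, not the paper's formulation.

import torch
import torch.nn as nn

class LearnedBitlengthQuant(nn.Module):
    """Fake-quantizes its input with a learnable bitlength (hypothetical sketch)."""
    def __init__(self, init_bits=8.0):
        super().__init__()
        # Continuous, learnable bitlength; driven down by the regularizer.
        self.bits = nn.Parameter(torch.tensor(init_bits))

    def forward(self, x):
        b = self.bits.clamp(1.0, 16.0)
        levels = 2.0 ** b - 1.0
        # Scale depends on 'bits', so the task loss also shapes the bitlength.
        scale = x.detach().abs().max().clamp(min=1e-8) / levels
        # Straight-through estimator: round in the forward pass,
        # pass gradients through unchanged in the backward pass.
        q = x / scale
        q = q + (torch.round(q) - q).detach()
        return q * scale

def bitlength_penalty(model, weights=None):
    # Sum of learnable bitlengths; 'weights' could reweight layers by
    # operation count or memory footprint, as the abstract suggests.
    quants = [m for m in model.modules() if isinstance(m, LearnedBitlengthQuant)]
    total = torch.zeros(())
    for i, m in enumerate(quants):
        w = 1.0 if weights is None else weights[i]
        total = total + w * m.bits
    return total

# Training objective (sketch): task loss plus the weighted bitlength regularizer.
# loss = criterion(model(x), y) + lam * bitlength_penalty(model)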