2019
DOI: 10.1109/tcsi.2019.2907488
Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays

Abstract: Deep neural networks are a biologically inspired class of algorithms that have recently demonstrated state-of-the-art accuracy in large-scale classification and recognition tasks. Hardware acceleration of deep networks is of paramount importance to ensure their ubiquitous presence in future computing platforms. Indeed, a major landmark that enables efficient hardware accelerators for deep networks is the recent advances from the machine learning community that have demonstrated the viability of aggressively scal…

Cited by 84 publications (39 citation statements)
References 35 publications
“…A majority of current hardware implementations are variants of von-Neumann machines [1]. In such machines, the memory and computation blocks are separate.…”
Section: Efficiency Analysis
Confidence: 99%
“…This work performs 4096 operations per cycle and achieves a throughput of 278.2 GOPS; the 6T bit-cell is susceptible to write disturb. This problem can be addressed by adding more transistors or capacitors to the bit-cell, as in Xcel-RAM [20], XNOR-SRAM [22], and C3SRAM [23]. The Xcel-RAM approach [20] uses a 10T bit-cell and performs a binary MAC operation between a weight and an input stored in two different rows.…”
Section: Introduction
Confidence: 99%
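As a rough sanity check on the figures quoted above (the relationship between throughput, parallelism, and clock rate is an assumption here, not stated in the excerpt), 278.2 GOPS at 4096 operations per cycle implies a clock of roughly 68 MHz:

```python
# Implied clock frequency from the cited throughput figures.
# Assumes throughput = ops_per_cycle * clock (no stalls), which is a
# simplification for illustration only.
ops_per_cycle = 4096
throughput_ops = 278.2e9  # 278.2 GOPS

clock_mhz = throughput_ops / ops_per_cycle / 1e6
print(round(clock_mhz, 1))  # about 67.9 MHz
```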
“…In addition, BNNs replace expensive MACs with bitwise XNOR followed by population count (popcount) computations; XNOR followed by popcount is called XNOR-and-accumulate (XAC). Thus, BNNs are considered suitable for resource- and energy-constrained embedded systems compared to CNNs, reducing computational complexity as well as memory footprint with minimal degradation in accuracy (less than 10% [2]).…”
Section: Introduction
Confidence: 99%
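The XAC operation described above can be emulated in a few lines of Python. This is a minimal sketch, not the paper's circuit: ±1 weights and activations are packed into integer bit masks (bit = 1 encodes +1), and the dot product of two n-element ±1 vectors is recovered as n minus twice the popcount of their XOR (each XOR mismatch contributes −1 instead of +1). The `pack` and `xac` helper names are illustrative, not from the source.

```python
def pack(vec):
    """Pack a list of +/-1 values into an int; element i sits at bit i,
    with bit value 1 encoding +1 and 0 encoding -1."""
    return sum(1 << i for i, v in enumerate(vec) if v == +1)

def xac(w_bits, x_bits, n):
    """XNOR-and-accumulate: dot product of two n-element +/-1 vectors
    packed as ints. XOR marks mismatching positions; each mismatch turns
    a +1 contribution into -1, hence n - 2 * popcount(xor)."""
    mismatches = bin((w_bits ^ x_bits) & ((1 << n) - 1)).count("1")
    return n - 2 * mismatches

# Dot product of [+1, -1, +1, +1] and [+1, +1, -1, +1]:
# 1*1 + (-1)*1 + 1*(-1) + 1*1 = 0
print(xac(pack([+1, -1, +1, +1]), pack([+1, +1, -1, +1]), 4))  # prints 0
```

This bit-level formulation is what makes BNN inference attractive for in-memory SRAM compute: the per-column XNOR reduces to a cheap logic operation, and the accumulation reduces to a popcount.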
“…Thanks to both (1) and (2), CiM_SRAM (M3D_4L) had smaller subarrays, which eventually reduced the length of the routing wires between subarrays.…”
Confidence: 99%