In-Memory Computing (IMC) has emerged as a promising paradigm for energy-, throughput-, and area-efficient machine learning at the edge. However, the differences in hardware architectures, array dimensions, and fabrication technologies among published IMC realizations have made it difficult to grasp their relative strengths. Moreover, previous studies have primarily focused on exploring and benchmarking the peak performance of a single IMC macro rather than full-system performance on real workloads. This paper addresses the lack of a quantitative comparison of Analog In-Memory Computing (AIMC) and Digital In-Memory Computing (DIMC) processor architectures. We propose an analytical IMC performance model that is validated against published implementations and integrated into a system-level exploration framework for comprehensive performance assessments of different workloads with varying IMC configurations. Our experiments show that while DIMC generally has higher computational density than AIMC, AIMC with large macro sizes may achieve better energy efficiency than DIMC on convolutional and pointwise layers, which can exploit high spatial unrolling. Conversely, DIMC with small macro sizes outperforms AIMC on depthwise layers, which offer limited spatial unrolling opportunities inside a macro.

Index Terms—Machine learning, quantitative modeling, analog in-memory computing, digital in-memory computing
I. INTRODUCTION

Recent developments in ultra-low-power machine learning models have enabled the deployment of artificial intelligence on extreme edge devices. However, typical embedded digital accelerators suffer from high data-movement costs and low computational densities, degrading energy efficiency by up to 2×-1000× with respect to the ideal baseline of digital computation. To minimize this data-transfer overhead, in-memory computing (IMC) has recently emerged as a promising alternative to conventional accelerators based on arrays of digital processing elements (PEs). By performing the operations directly near/in the memory cells, these architectures greatly reduce access overheads and enable massive parallelization, with potential orders-of-magnitude improvements in energy efficiency and throughput [1]. Most of the initial IMC designs published in the literature focused on analog IMC (AIMC), where the computation is carried out in the analog domain [2]-[4]. While this approach offers extreme energy efficiency and massive parallelization, the analog nature of the computation and the presence