2017 IEEE International Symposium on High Performance Computer Architecture (HPCA) 2017
DOI: 10.1109/hpca.2017.21
Compute Caches

Cited by 264 publications (233 citation statements)
References 22 publications
“…An obvious limitation of the approach is that it does not offer a systematic way to locate all the possible places in which CiM-enabled macros could be inserted, which inevitably underestimates the benefits of CiM. Unlike most current works, [22] explores CiM at three levels of the SRAM cache hierarchy and completes the control flow inside the cache in the absence of data locality. However, its limitation is the same as that of [19]: it requires customized benchmarks for data locality.…”
Section: Related Work
confidence: 99%
See 1 more Smart Citation
“…An obvious limitation to the approach is that it does not offer a systematic way to locate all the possible places in which CiM-enabled macros could be inserted, which inevitably underestimate the benefits of CiM. Different from most current works, [22] explores CiM in three levels of SRAM cache hierarchy and completes the control flow inside cache in the absence of data locality. However, its limitation is the same as [19], which requires customized benchmarks for data locality.…”
Section: Related Workmentioning
confidence: 99%
“…Recent works (e.g., [4,15,16,17,18,19,20,21,22,23,24,25]) in both CMOS Static Random Access Memory (SRAM) and emerging non-volatile memories (NVMs) have demonstrated various CiM designs at different levels of the memory hierarchy. These designs allow computation to occur exactly where data resides, thereby reducing the energy and performance overheads associated with data movement.…”
Section: Introduction
confidence: 99%
“…While the hardware designs in [10], [19], [24] are specialized to carry out 1-bit MVPs and the designs in [6], [23] to execute multi-bit MVPs for neural network inference, PPAC is programmable to perform not only these operations but also GF(2) MVPs, Hamming-similarity computations, and PLA or CAM functionality, opening up its use in a wide range of applications. In this sense, PPAC is similar to the work in [3], where PIM is used to accelerate multiple applications, such as database query processing, cryptographic kernels, and in-memory checkpointing. A fair comparison to [3] is, however, difficult, as it considers a complete system; PPAC would need to be integrated into a system for a fair comparison.…”
Section: B. Comparison with Existing Accelerators
confidence: 99%
“…We note, however, that if the method in [3] is used to compute MVPs, an element-wise multiplication between two vectors whose entries are L-bit requires L² + 5L − 2 clock cycles [4], which is a total of 34 clock cycles for 4-bit numbers.…”
Section: B. Comparison with Existing Accelerators
confidence: 99%
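The cycle-count claim in the excerpt above can be sanity-checked with a short sketch. The formula L² + 5L − 2 for a bit-serial element-wise multiplication of L-bit operands is quoted directly from the excerpt (attributed there to [4]); the function name below is ours, not from any of the cited works.

```python
def elementwise_mult_cycles(bits: int) -> int:
    """Clock cycles for a bit-serial element-wise multiply of two
    vectors with `bits`-bit entries, per the formula quoted above."""
    return bits ** 2 + 5 * bits - 2

# For 4-bit numbers: 16 + 20 - 2 = 34 cycles, matching the excerpt.
print(elementwise_mult_cycles(4))
```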
“…Some recent research works have started to explore and evaluate the performance of this concept. It has been applied both to volatile memories [7], [8], [9] and to non-volatile memories [10], [11].…”
Section: Related Work, A. In-Memory Computing
confidence: 99%