Technological and architectural improvements have been continually required to sustain the demand for faster and cheaper computers. However, CMOS down-scaling suffers from three technology walls: the leakage wall, the reliability wall, and the cost wall. On top of that, performance gains from architectural improvements are also gradually saturating due to three well-known architecture walls: the memory wall, the power wall, and the instruction-level parallelism (ILP) wall. Hence, much research focuses on proposing and developing new technologies and architectures. In this article, we present a comprehensive classification of memory-centric computing architectures based on three metrics: computation location, level of parallelism, and memory technology used. The classification not only provides an overview of existing architectures with their pros and cons, but also unifies the terminology that uniquely identifies these architectures and highlights potential future architectures that can be further explored. Hence, it sets a direction for future research in the field.
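For illustration only, the following minimal Python sketch shows how an architecture could be identified by the three classification metrics named above; the specific category names and the example instance are assumptions, not the article's exact taxonomy.

```python
from dataclasses import dataclass
from enum import Enum

class ComputationLocation(Enum):
    """Where the computation result is produced (illustrative categories)."""
    IN_MEMORY_ARRAY = "inside the memory array"
    IN_MEMORY_PERIPHERY = "in the array periphery"
    NEAR_MEMORY = "in logic placed near the memory"
    OUTSIDE_MEMORY = "in a conventional processor core"

class MemoryTechnology(Enum):
    """Candidate memory technologies (illustrative, not exhaustive)."""
    SRAM = "SRAM"
    DRAM = "DRAM"
    RERAM = "ReRAM / memristor"
    PCM = "phase-change memory"
    STT_MRAM = "STT-MRAM"

@dataclass(frozen=True)
class ArchitectureClass:
    """One point in the three-axis classification space."""
    computation_location: ComputationLocation
    parallelism_level: str               # e.g. bit-, word-, or array-level
    memory_technology: MemoryTechnology

# Hypothetical example: a memristor-crossbar design computing inside the array.
cim_example = ArchitectureClass(ComputationLocation.IN_MEMORY_ARRAY,
                                "array-level",
                                MemoryTechnology.RERAM)
print(cim_example)
```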
Von Neumann-based architectures suffer from costly communication between the CPU and memory. This communication imposes several orders of magnitude more power and performance overhead than the arithmetic operations performed by the processor. The overhead becomes critical for applications that require processing large amounts of data. Computation-in-Memory (CIM), leveraging memristor devices in a crossbar structure, offers a potential solution to this challenge. However, support for the integer data type is lacking in CIM approaches, as most solutions operate on only a single bit or a few bits. This paper proposes a new organization of the periphery (next to the memristor crossbar) to compute matrix-matrix multiplication (MMM) at the tile level. More precisely, the analog additions performed in the crossbar are complemented with additions performed in the digital periphery. In this mixed analog-digital system, the digital additions are performed in a way that requires only minimum-size adders, so that the latency of the digital periphery is reduced as much as possible. In addition, the design is customized to the number of ADCs as well as the data-type sizes to support different possible scenarios. The results show that our organization reduces energy and latency by up to 50× and 3×, respectively, compared to the reference design.
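As a rough illustration of how a mixed analog-digital tile can produce integer MMM results, the Python sketch below models a bit-sliced flow: the crossbar performs the analog additions (here emulated as integer dot products of 0/1 slices), the ADCs digitize the partial sums, and the digital periphery shift-adds them into full-precision results. The function name, bit widths, and structure are assumptions for illustration, not the paper's actual tile design.

```python
import numpy as np

def crossbar_mmm(A, W, input_bits=8, weight_bits=8):
    """Bit-sliced integer matrix-matrix multiplication, A @ W (sketch)."""
    M, K = A.shape
    K2, N = W.shape
    assert K == K2

    # Decompose both operands into binary slices (LSB first).
    a_slices = [(A >> i) & 1 for i in range(input_bits)]
    w_slices = [(W >> j) & 1 for j in range(weight_bits)]

    result = np.zeros((M, N), dtype=np.int64)
    for i, a_bit in enumerate(a_slices):          # input bit applied on the word lines
        for j, w_bit in enumerate(w_slices):      # weight bit stored in one crossbar slice
            # "Analog" addition along the bit lines: one partial dot product,
            # digitized by the ADCs at the array edge.
            partial = a_bit @ w_bit
            # Digital periphery: shift-add the digitized partial sums. A real
            # design would size each adder stage to the minimum required width.
            result += partial.astype(np.int64) << (i + j)
    return result

# Quick check against a plain integer matmul.
rng = np.random.default_rng(0)
A = rng.integers(0, 256, size=(4, 8))
W = rng.integers(0, 256, size=(8, 3))
assert np.array_equal(crossbar_mmm(A, W), A @ W)
```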