Hierarchical Memory System With STT-MRAM and SRAM to Support Transfer and Real-Time Reinforcement Learning in Autonomous Drones

Yoon, Insik; Anwar, Malik Aqeel; Joshi, Rajiv V.; Rakshit, Titash; Raychowdhury, Arijit

doi:10.1109/jetcas.2019.2932285

Cited by 15 publications

(9 citation statements)

References 33 publications


“…Anwar et al [49] evaluate the robustness of swarm robotic systems under adversaries. Yoon et al [50] present a novel hierarchically memory system with STT-MRAM and SRAM to support realtime learning-based robotic exploration. We believe a holistic benchmarking and simulator infrastructure will uncover more cross-layer research findings of various fields of edge robotics.…”

Section: Benchmarking and Software Infrastructure

mentioning

confidence: 99%

Circuit and System Technologies for Energy-Efficient Edge Robotics: (Invited Paper)

Wan, Lele, Raychowdhury, 2022

2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)

Self Cite


As we march towards the age of ubiquitous intelligence, we note that AI and intelligence are progressively moving from the cloud to the edge. The success of Edge-AI is pivoted on innovative circuits and hardware that can enable inference and limited learning in resource-constrained edge autonomous systems. This paper introduces a series of ultra-low-power accelerator and system designs on enabling the intelligence in edge robotic platforms, including reinforcement learning neuromorphic control, swarm intelligence, and simultaneous mapping and localization. We put an emphasis on the impact of the mixed-signal circuit, neuro-inspired computing system, benchmarking and software infrastructure, as well as algorithm-hardware codesign to realize the most energy-efficient Edge-AI ASICs for the next-generation intelligent and autonomous systems.


“…With respect to the circuit-level implementation of storage blocks, memory subsystems can perform the storage function as well as the associated arithmetic and computing units. IMC and NMC were investigated, and the majority of them underwent silicon verification in SRAM [4, 18–22] and several NVMs, including RRAM [23–25], STT-MRAM [26–31], spin-orbit torque (SOT), and MRAM [32–34].…”

Section: Circuit-level Implementation

mentioning

confidence: 99%

“…[29, 34, 56, 58, 59, 75, 90–93] lists the recent neural network implementations. Several in-MRAM computing studies were implemented and verified using a 2x nm CMOS process.…”

mentioning

confidence: 99%

A survey of in-spin transfer torque MRAM computing

Cai, Liu, Chen, et al. 2021

Sci. China Inf. Sci.


In traditional von Neumann computing architectures, the essential transfer of data between the processor and memory hierarchies limits the computational efficiency of next-generation system-on-a-chip. The emerging in-memory computing (IMC) approach addresses this issue and facilitates the movement of significant data and rapid computations. Among the different memory types, intrinsic energy efficiency is demonstrated by in-magnetic random access memory (MRAM) computing with a low-power spintronic magnetic tunnel junction device and hybrid integration at an advanced complementary metal-oxide semiconductor node. This study reviews state-of-the-art techniques for managing IMC with an emphasis on spin-transfer torque-MRAM computing via design schemes at the bit-cell, circuit, and system levels. In addition, this study presents effective design techniques and potential challenges and demonstrates the existing limitations of in-MRAM computing and potential methods for overcoming these issues. This study also considers the design technology co-optimization from the IMC perspective.


“…In [37] a hybrid of SRAM and 3D-stacked STT-MRAM based AI accelerator was proposed for real-time learning where eMRAM acted as weight storage memory for infrequently accessed and updated layers, such as all convolutional layers and first few fully connected layers for a Transfer Learning followed by Reinforcement Learning algorithm. However, due to the use of typical slow and write-power-hungry STT-MRAM, this study could not completely exploit STT-MRAM to substitute SRAM and eventually used SRAM for storing weights of the last few fully connected layers which are accessed and updated frequently in transfer learning-based reinforcement learning setting.…”

Section: G Accelerator Performance With Imagenet Dataset

mentioning

confidence: 99%

Designing Efficient and High-Performance AI Accelerators With Customized STT-MRAM

Mishty, Sadi, 2021

IEEE Trans. VLSI Syst.


In this paper, we demonstrate the design of efficient and high-performance AI/Deep Learning accelerators with customized STT-MRAM and a reconfigurable core. Based on model-driven detailed design space exploration, we present the design methodology of an innovative scratchpad-assisted on-chip STT-MRAM based buffer system for high-performance accelerators. Using analytically derived expression of memory occupancy time of AI model weights and activation maps, the volatility of STT-MRAM is adjusted with process and temperature variation aware scaling of thermal stability factor to optimize the retention time, energy, read/write latency, and area of STT-MRAM. From the analysis of modern AI workloads and accelerator implementation in 14nm technology, we verify the efficacy of our designed AI accelerator with STT-MRAM (STT-AI). Compared to an SRAM-based implementation, the STT-AI accelerator achieves 75% area and 3% power savings at iso-accuracy. Furthermore, with a relaxed bit error rate and negligible AI accuracy trade-off, the designed STT-AI Ultra accelerator achieves 75.4% and 3.5% savings in area and power, respectively, over regular SRAM-based accelerators.
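The weight placement that the Mishty and Sadi citation statement attributes to the hierarchical memory system (infrequently updated layers held in STT-MRAM, frequently updated final fully connected layers held in SRAM) can be illustrated with a minimal sketch. The `Layer` class, the `place_weights` function, and the example layer names are hypothetical and chosen only for illustration; they do not come from either paper, and the real design presumably weighs capacity, latency, and write energy rather than a single flag.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str             # hypothetical layer label
    kind: str             # "conv" or "fc"
    updated_online: bool  # True if weights change during real-time RL


def place_weights(layers):
    """Assign each layer's weights to a memory tier.

    Assumed rule, paraphrasing the citation statement: layers that are
    rarely written stay in STT-MRAM (costly writes), while layers that
    are updated frequently go to SRAM.
    """
    return {
        layer.name: "SRAM" if layer.updated_online else "STT-MRAM"
        for layer in layers
    }


# Illustrative network: conv layers and early FC layers are frozen after
# transfer learning; only the last FC layers are trained online.
layers = [
    Layer("conv1", "conv", False),
    Layer("conv2", "conv", False),
    Layer("fc1", "fc", False),
    Layer("fc2", "fc", True),
    Layer("fc3", "fc", True),
]

placement = place_weights(layers)
for name, memory in placement.items():
    print(f"{name}: {memory}")
```

The single boolean flag is a deliberate simplification: it captures only the read-mostly versus write-often split described in the statement.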
