Computing-In-Memory (CIM) using Flash memory is a potential solution for heavyweight DNN inference accelerators in edge computing applications. Flash memory provides the best high-density, low-cost non-volatile memory solution for storing the weights, while the CIM capability of Flash memory allows neural-network computations to be performed inside the memory chip. Our analysis indicates that Flash CIM can reduce data movement by ~85% compared with the conventional von Neumann architecture. In this work, we propose detailed device and design co-optimization to realize Flash CIM, based on a novel vertical split-gate Flash device. The device supports low-voltage (<1 V) read on word lines (WLs) and bit lines (BLs), a tight and tunable cell current (Icell) ranging from 150 nA to 1.5 µA, an extremely large Icell ON/OFF ratio of ~7 orders of magnitude, low random telegraph noise (RTN), and negligible read disturb, providing a high-performance, highly reliable CIM solution.
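To make the CIM principle concrete, the following Python sketch models a single bit-line multiply-accumulate: weights are mapped onto the tunable Icell window (150 nA to 1.5 µA) and word-line inputs gate which cells contribute to the summed bit-line current. The linear weight-to-current mapping, binary word-line activation, and all function names are illustrative assumptions for this sketch, not the programming or activation scheme used in this work.

```python
import numpy as np

# Illustrative (assumed) parameters taken from the quoted Icell window.
I_MIN = 150e-9   # minimum programmed cell current, 150 nA
I_MAX = 1.5e-6   # maximum programmed cell current, 1.5 uA

def weights_to_cell_currents(weights):
    """Map normalized weights in [0, 1] onto the tunable Icell window (assumed linear)."""
    w = np.clip(weights, 0.0, 1.0)
    return I_MIN + w * (I_MAX - I_MIN)

def bitline_mac(cell_currents, wl_inputs):
    """Model one bit-line MAC: cells whose word line is asserted contribute
    their programmed current, and the summed BL current represents the dot product."""
    return np.sum(cell_currents * wl_inputs)  # amperes

# Example: a 4-cell column with binary word-line activations.
weights = np.array([0.1, 0.5, 0.9, 0.3])
inputs = np.array([1, 0, 1, 1])  # WL pulses (1 = asserted, 0 = idle)
i_cells = weights_to_cell_currents(weights)
print(f"Summed BL current: {bitline_mac(i_cells, inputs) * 1e6:.3f} uA")
```

Because the dot product is produced where the weights are stored, only the inputs and the accumulated results cross the memory boundary, which is the source of the data-movement savings relative to a von Neumann design.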