Traditional von Neumann computing systems involve separate processing and memory units. However, data movement is costly in terms of time and energy, and this problem is aggravated by the recent explosive growth in highly data-centric applications related to artificial intelligence. This calls for a radical departure from the traditional systems, and one such non-von Neumann computational approach is in-memory computing. In this approach, certain computational tasks are performed in place in the memory itself by exploiting the physical attributes of the memory devices. Both charge-based and resistance-based memory devices are being explored for in-memory computing. In this Review, we provide a broad overview of the key computational primitives enabled by these memory devices as well as their applications spanning scientific computing, signal processing, optimization, machine learning, deep learning and stochastic computing.
Artificial neuromorphic systems based on populations of spiking neurons are an indispensable tool in understanding the human brain and in constructing neuromimetic computational systems. To reach areal and power efficiencies comparable to those seen in biological systems, electroionics-based and phase-change-based memristive devices have been explored as nanoscale counterparts of synapses. However, progress on scalable realizations of neurons has so far been limited. Here, we show that chalcogenide-based phase-change materials can be used to create an artificial neuron in which the membrane potential is represented by the phase configuration of the nanoscale phase-change device. By exploiting the physics of reversible amorphous-to-crystal phase transitions, we show that the temporal integration of postsynaptic potentials can be achieved on a nanosecond timescale. Moreover, we show that this is inherently stochastic because of the melt-quench-induced reconfiguration of the atomic structure occurring when the neuron is reset. We demonstrate the use of these phase-change neurons, and their populations, in the detection of temporal correlations in parallel data streams and in sub-Nyquist representation of high-bandwidth signals.
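The integrate-and-fire behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions and not the device model from the paper: the neuron state is a single number standing in for the phase configuration, each postsynaptic potential adds to it, and the melt-quench reset is approximated by restarting from a small random offset. The class name and the parameters `threshold`, `reset_jitter` and `seed` are hypothetical.

```python
import random


class StochasticIntegrateAndFireNeuron:
    """Toy integrate-and-fire neuron with a randomized reset (illustrative only)."""

    def __init__(self, threshold=1.0, reset_jitter=0.2, seed=None):
        self.threshold = threshold        # state level at which the neuron fires
        self.reset_jitter = reset_jitter  # assumed spread of the post-reset state
        self._rng = random.Random(seed)
        self.state = 0.0
        self._reset()

    def _reset(self):
        # Stand-in for the melt-quench reset: each reset leaves the neuron
        # in a slightly different starting state, which makes firing stochastic.
        self.state = self._rng.uniform(0.0, self.reset_jitter) * self.threshold

    def step(self, psp):
        """Integrate one postsynaptic potential; return True if the neuron fires."""
        self.state += max(psp, 0.0)  # only positive input advances the state
        if self.state >= self.threshold:
            self._reset()
            return True
        return False


if __name__ == "__main__":
    neuron = StochasticIntegrateAndFireNeuron(seed=1)
    spikes = [neuron.step(0.1) for _ in range(100)]
    print("spike count:", sum(spikes))
```

With `reset_jitter=0.0` the model is deterministic and fires at a fixed interval; a non-zero jitter makes the inter-spike interval vary from one firing to the next, which is the qualitative effect the abstract attributes to the reset.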
Dense crossbar arrays of non-volatile memory (NVM) devices represent one possible path for implementing massively-parallel and highly energy-efficient neuromorphic computing systems. We first review recent advances in the application of NVM devices to three computing paradigms: spiking neural networks (SNNs), deep neural networks (DNNs), and 'Memcomputing'. In SNNs, NVM synaptic connections are updated by a local learning rule such as spike-timing-dependent plasticity, a computational approach directly inspired by biology. For DNNs, NVM arrays can represent matrices of synaptic weights, implementing the matrix-vector multiplication needed for algorithms such as backpropagation in an analog yet massively-parallel fashion. This approach could provide significant improvements in power and speed compared to GPU-based DNN training, for applications of commercial significance. We then survey recent research in which different types of NVM devices (including phase change memory, conductive-bridging RAM, filamentary and nonfilamentary RRAM, and other NVMs) have been proposed, either as a synapse or as a neuron, for use within a neuromorphic computing application. The relevant virtues and limitations of these devices are assessed, in terms of properties such as conductance dynamic range, (non)linearity and (a)symmetry of conductance response, retention, endurance, required switching power, and device variability.
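The analog matrix-vector multiplication mentioned above can be sketched numerically. The example below is a minimal idealized model, not a description of any specific array in the survey: each device contributes a current I = G * V (Ohm's law) and the currents on a shared wire add up (Kirchhoff's current law). It assumes that signed weights are encoded as the difference of two non-negative conductances, and it ignores noise, nonlinearity and device variability. The function names and the `g_max` parameter are hypothetical.

```python
def weights_to_conductances(weights, g_max=1.0):
    """Map signed weights onto pairs of non-negative conductances (G+, G-).

    Assumption: the largest |weight| is scaled to the full device range g_max.
    """
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = g_max / w_max
    g_pos = [[max(w, 0.0) * scale for w in row] for row in weights]
    g_neg = [[max(-w, 0.0) * scale for w in row] for row in weights]
    return g_pos, g_neg, scale


def crossbar_mvm(weights, voltages, g_max=1.0):
    """Idealized crossbar computation of weights @ voltages."""
    g_pos, g_neg, scale = weights_to_conductances(weights, g_max)
    outputs = []
    for row_pos, row_neg in zip(g_pos, g_neg):
        # Ohm's law per device, Kirchhoff's current law along the shared wire.
        i_pos = sum(g * v for g, v in zip(row_pos, voltages))
        i_neg = sum(g * v for g, v in zip(row_neg, voltages))
        # The differential current, rescaled, recovers the signed dot product.
        outputs.append((i_pos - i_neg) / scale)
    return outputs


if __name__ == "__main__":
    W = [[1.0, -2.0], [0.5, 0.0]]
    v = [1.0, 2.0]
    print(crossbar_mvm(W, v))
```

In a physical array all output currents appear at once, which is the source of the parallelism; the loop over rows here only emulates that result sequentially.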
With the proliferation of ultra-high-speed mobile networks and internet-connected devices, along with the rise of artificial intelligence, the world is generating exponentially increasing amounts of data, data that needs to be processed in a fast, efficient and 'smart' way. These developments are pushing the limits of existing computing paradigms, and highly parallelized, fast and scalable hardware concepts are becoming progressively more important. Here, we demonstrate a computationally specific integrated photonic tensor core (the optical analog of an ASIC) capable of operating at tera-multiply-accumulate per second (TMAC/s) speeds. The photonic core achieves parallelized photonic in-memory computing using phase-change memory arrays and photonic chip-based optical frequency combs (soliton microcombs). The computation is reduced to measuring the optical transmission of reconfigurable and non-resonant, i.e. broadband, passive components operating at a bandwidth exceeding 14 GHz, limited only by the speed of the modulators and photodetectors. Given recent advances in hybrid integration of soliton microcombs at microwave line rates, ultra-low-loss silicon nitride waveguides, and high-speed on-chip detectors and modulators, our approach provides a path towards full CMOS wafer-scale integration of the photonic tensor core.
While we focus on convolution processing, more generally our results indicate the major potential of integrated photonics for parallel, fast, efficient and wafer-scale manufacturable computational hardware in demanding AI applications such as autonomous driving, live video processing, and next generation cloud computing services.

The increased demand for machine learning on very large datasets [1] and the growing offering of artificial intelligence services on the cloud [2-4] have driven a resurgence in custom hardware designed to accelerate multiply and accumulate (MAC) computations, the fundamental mathematical element needed for matrix-vector multiplication (MVM) operations. Whilst various custom silicon computing hardware (i.e. FPGAs [5], ASICs [6], and GPUs [7]) have been developed to improve computational throughput and efficiency, they still depend on the same underlying electrical components, which are fundamentally limited in both speed and energy by Joule heating, RF crosstalk, and capacitance [8]. The last of these (capacitance) dominates energy consumption and limits the maximum operating speeds in neural network hardware accelerators [9], since the movement of data (e.g. trained network weights), rather than arithmetic operations, requires the charging and discharging of chip-level metal interconnects. Thus, improving the efficiency of logic gates at the device level provides diminishing returns in such applications if the flow of data during computation is not simultaneously addressed [10]. Even recent developments in the use of memristive crossbar arrays [11][12][13] to compute in the analog domain, whilst promising, do not have the potential for parallelizing the MVM operations (save for physically repli...