A reconfigurable analog substrate for highly efficient maximum flow computation

Liu, Gai; Zhang, Zhiru

doi:10.1145/2744769.2744781

Cited by 6 publications

(1 citation statement)

References 29 publications


“…The neuromodulatory processes in the brain that leverage spikes to promote learning remain somewhat shrouded in mystery, which has inspired the development of several research-based neuromorphic processors. Several examples include Loihi developed by Intel Labs [32,33], IBM's TrueNorth [34,35], Neurogrid from Stanford University [36], SpiNNaker initiated at the University of Manchester [37,38], National University of Singapore's Shenjing [39], and memristor based accelerators like RENO [40], Harmonica [41], MNSIM [42], some of which have roused neuromorphic research ecosystems where hardware access is offered both remotely and physically to the broader research community. While such neuromorphic processors remain to be optimized for gradient-based learning, they have incited much interest in how neurobiological processes can be modelled in-silico.…”

Section: Neuromorphic Processors (mentioning)

confidence: 99%

Exploiting deep learning accelerators for neuromorphic workloads

Sun, Titterton, Gopiani et al. 2024

Neuromorph. Comput. Eng.


Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency when performing inference with deep learning workloads. Error backpropagation is presently regarded as the most effective method for training SNNs, but in a twist of irony, when training on modern graphics processing units (GPUs) this becomes more expensive than non-spiking networks. The emergence of Graphcore's Intelligence Processing Units (IPUs) balances the parallelized nature of deep learning workloads with the sequential, reusable, and sparsified nature of operations prevalent when training SNNs. IPUs adopt multi-instruction multi-data (MIMD) parallelism by running individual processing threads on smaller data blocks, which is a natural fit for the sequential, non-vectorized steps required to solve spiking neuron dynamical state equations. We present an IPU-optimized release of our custom SNN Python package, snnTorch, which exploits fine-grained parallelism by utilizing low-level, pre-compiled custom operations to accelerate irregular and sparse data access patterns that are characteristic of training SNN workloads. We provide a rigorous performance assessment across a suite of commonly used spiking neuron models, and propose methods to further reduce training run-time via half-precision training. By amortizing the cost of sequential processing into vectorizable population codes, we ultimately demonstrate the potential for integrating domain-specific accelerators with the next generation of neural networks.
