Conversion of Synchronous Artificial Neural Network to Asynchronous Spiking Neural Network using sigma-delta quantization

Yousefzadeh, Amirreza; Hosseini, Sahar; Holanda, Priscila; Leroux, Sam; Werner, Thilo; Serrano-Gotarredona, Teresa; Linares-Barranco, B.; Dhoedt, Bart; Simoens, Pieter

doi:10.1109/aicas.2019.8771624

Cited by 24 publications

(19 citation statements)

References 23 publications


“…Our goal was to optimize the number of operations for spatio-temporal sparsity while performing asynchronous inference. This paper is a continuation of our previous brief paper [36]. In the present paper, in addition to more detailed explanations, we extend our algorithm to use "valued spikes" which results in a smaller number of required events.…”

Section: * Amirreza Yousefzadeh and Mina A Khoei Contributed Equally (mentioning)

confidence: 81%

Asynchronous Spiking Neurons, the Natural Key to Exploit Temporal Sparsity

Yousefzadeh, Serrano-Gotarredona, Linares-Barranco et al. 2019

IEEE J. Emerg. Sel. Topics Circuits Syst.

Self Cite


Inference of Deep Neural Networks for streaming-signal (video/audio) processing on edge devices is still challenging. Unlike most state-of-the-art inference engines, which are efficient for static signals, our brain is optimized for real-time dynamic signal processing. We believe one important feature of the brain (asynchronous stateful processing) is the key to its excellence in this domain. In this work, we show how asynchronous processing with stateful neurons allows exploitation of the existing sparsity in natural signals. This paper explains three different types of sparsity and proposes an inference algorithm which exploits all of them in the execution of already trained networks. Our experiments in three different applications (handwritten digit recognition, autonomous steering and hand-gesture recognition) show that this model of inference reduces the number of required operations for sparse input data by one to two orders of magnitude. Additionally, because processing is fully asynchronous, this type of inference can be run on fully distributed and scalable neuromorphic hardware platforms.


“…On the contrary, using this low-precision quantization does not harm the final accuracy, but enables a cheap memory budget on many popular neuromorphic systems such as Akopyan et al (2015), Davies et al (2018), and Kuang et al (2021). More specifically, our networks complete simulation for one input sample within only one time step, compared with other conversion methods that need dozens or even hundreds of simulation time steps (Lee et al, 2016, 2020; Bodo et al, 2017; Mostafa et al, 2017; Xu et al, 2017; Rueckauer and Liu, 2018; Wu et al, 2018; Yousefzadeh et al, 2019).…”

Section: Methods (mentioning)

confidence: 99%

“…The MNIST dataset (Lecun and Bottou, 1998) of handwritten digits has been widely applied in the image classification field. In our experiments, we use a ternary-valued {-1,0,1} weight quantization as in Li and Liu (2016), not full precision (16 or 32 bits) like many others (Lee et al, 2016, 2020; Bodo et al, 2017; Mostafa et al, 2017; Rueckauer and Liu, 2018; Wu et al, 2018; Yousefzadeh et al, 2019), to facilitate hardware deployment, because we find that weight quantization with more bit-width contributes very little to final accuracy, which is consistent with (Rastegari et al, 2016; Zhou et al, 2016). All convolutional networks are trained using the standard ADAM rule (Kingma and Ba, 2014) with an initial learning rate set to 0.001 and decayed 10 times per 200 epochs, based on TensorLayer (Dong et al, 2017), a customized deep learning library.…”

Section: MNIST Dataset (mentioning)

confidence: 99%
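The ternary {-1,0,1} weight quantization mentioned in this statement can be illustrated with a minimal sketch. It is not taken from the citing paper: the function name `ternarize`, the `ratio=0.7` threshold heuristic, and the example weights are assumptions chosen for illustration only.

```python
def ternarize(weights, ratio=0.7):
    """Map real-valued weights to {-1, 0, 1} plus one shared scale factor.

    Assumption: the threshold is ratio * mean(|w|), a commonly used
    heuristic; the citing paper's exact procedure is not given here.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = ratio * mean_abs
    # Small weights become 0; the rest keep only their sign.
    ternary = [0 if abs(w) <= threshold else (1 if w > 0 else -1)
               for w in weights]
    # The scale is the mean magnitude of the weights that survived.
    kept = [abs(w) for w, t in zip(weights, ternary) if t != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return ternary, scale


# Hypothetical example weights, not data from any paper.
example_weights = [0.8, -0.05, 0.3, -0.9, 0.02, -0.4]
ternary_weights, scale = ternarize(example_weights)
```

Each weight then needs only two bits of storage, and a multiplication by a weight reduces to an addition, a subtraction, or a skipped operation, which is the property that makes such networks attractive for hardware deployment.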


A Scatter-and-Gather Spiking Convolutional Neural Network on a Reconfigurable Neuromorphic Hardware

et al. 2021


Artificial neural networks (ANNs), like convolutional neural networks (CNNs), have achieved state-of-the-art results for many machine learning tasks. However, inference with large-scale full-precision CNNs causes substantial energy consumption and memory occupation, which seriously hinders their deployment on mobile and embedded systems. Inspired by the biological brain, spiking neural networks (SNNs) are emerging as new solutions because of their natural superiority in brain-like learning and their great energy efficiency with event-driven communication and computation. Nevertheless, training a deep SNN remains a main challenge and there is usually a big accuracy gap between ANNs and SNNs. In this paper, we introduce a hardware-friendly conversion algorithm called “scatter-and-gather” to convert quantized ANNs to lossless SNNs, where neurons are connected with ternary {−1,0,1} synaptic weights. Each spiking neuron is stateless and more like the original McCulloch and Pitts model, because it fires at most one spike and needs to be reset at each time step. Furthermore, we develop an incremental mapping framework to demonstrate efficient network deployments on a reconfigurable neuromorphic chip. Experimental results show our spiking LeNet on MNIST and VGG-Net on CIFAR-10 datasets obtain 99.37% and 91.91% classification accuracy, respectively. Besides, the presented mapping algorithm manages network deployment on our neuromorphic chip with maximum resource efficiency and excellent flexibility. Our four-spike LeNet and VGG-Net on chip can achieve real-time inference speeds of 0.38 ms/image and 3.24 ms/image, respectively, and an average power consumption of 0.28 mJ/image and 2.3 mJ/image at 0.9 V, 252 MHz, which is nearly two orders of magnitude more efficient than traditional GPUs.


“…Sigma-delta encoding with discretized deltas has been shown to result in a significant reduction in operation count [18]. Other work has expanded upon this by enabling conversion of regular, pre-trained neural networks to spiking neural networks [7,21]. The addition of thresholding logic to suppress propagation of small deltas reduces computation counts even further [11].…”

Section: Related Work (mentioning)

confidence: 99%
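The idea in this statement (sigma-delta encoding with discretized deltas, where small deltas are not propagated) can be sketched as follows. This is a generic illustration, not the algorithm of the cited paper: the names `sigma_delta_encode` and `sigma_delta_decode`, the `step` parameter, and the example signal are assumptions.

```python
def sigma_delta_encode(signal, step=0.25):
    """Turn a sampled signal into integer-valued events.

    Each event is the number of whole quantization steps by which the
    input has drifted from what the receiver has reconstructed so far.
    Drifts smaller than one step produce a 0 event (nothing is sent).
    """
    reconstructed = 0.0
    events = []
    for x in signal:
        error = x - reconstructed
        n = int(error / step)        # truncate toward zero
        events.append(n)
        reconstructed += n * step    # track what the receiver now holds
    return events


def sigma_delta_decode(events, step=0.25):
    """Rebuild the signal by accumulating the received events."""
    total = 0.0
    decoded = []
    for n in events:
        total += n * step
        decoded.append(total)
    return decoded


# Hypothetical slowly changing input, not data from any paper.
signal = [0.0, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.2]
events = sigma_delta_encode(signal)
decoded = sigma_delta_decode(events)
```

For this input only two of the eight samples produce a nonzero event, so a downstream layer that computes only on nonzero events would do work on two time steps instead of eight, while the reconstruction error stays below one quantization step.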

How to exploit sparsity in RNNs on event-driven architectures

Brils, Waeijen, Pourtaherian 2021

Proceedings of the 24th International Workshop on Software and Compilers for Embedded Systems


Event-driven architectures have been shown to provide low-power, low-latency artificial neural network (ANN) inference. This is especially beneficial on Edge devices, particularly when combined with sparse execution. Recurrent neural networks (RNNs) are ANNs that emulate memory. Their recurrent connection enables the reuse of previous output for the generation of new output. However, when trying to use RNNs in a sparse context on event-driven architectures, novel challenges in synchronization and the usage of sparse data are encountered. In this work, these challenges are systematically analyzed, and mechanisms to overcome them are proposed. Experimental results of a monocular depth estimation use case on the NeuronFlow architecture show that sparsity in RNNs can be exploited effectively on event-driven architectures.

CCS Concepts: Hardware → Asynchronous circuits; Computer systems organization → Embedded hardware; Embedded software; Neural networks.
