2012
DOI: 10.3109/0954898x.2012.739292

Comparison of GPU- and CPU-implementations of mean-firing rate neural networks on parallel hardware

Abstract: Modern parallel hardware such as multi-core processors (CPUs) and graphics processing units (GPUs) have a high computational power which can be greatly beneficial to the simulation of large-scale neural networks. Over the past years, a number of efforts have focused on developing parallel algorithms and simulators best suited for the simulation of spiking neural models. In this article, we aim at investigating the advantages and drawbacks of the CPU and GPU parallelization of mean-firing rate neurons, widely u…

Cited by 20 publications (25 citation statements); references 27 publications.
“…On graphics cards with CUDA, the connectivity is stored in the compressed sparse row (CSR) format, where the values of each attribute are flattened into a single vector and a list of row pointers assigns portions of this array to a single post-synaptic neuron (see Brette and Goodman, 2011, for a review). These different data structures lead to better parallel performance: the CSR representation ensures coalesced access to the attributes (i.e., the data is contiguous in memory), which is a strong condition for GPU computations to be efficient (Brette and Goodman, 2012), while the LIL structure allows a faster distribution of the data to the different OpenMP threads (Dinkelbach et al., 2012). LIL and CSR representations have similar memory requirements, but LIL is better suited to the dynamic addition or removal of synapses: structural plasticity is very inefficient on the GPU platform and is currently disabled.…”
Section: Code Generation (mentioning)
confidence: 99%
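The CSR layout described above can be sketched as follows. This is a minimal illustration, not the cited simulator's actual data structures: the struct and function names are hypothetical, but the layout (one flattened value vector plus row pointers delimiting each post-synaptic neuron's synapses) follows the quoted description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal CSR container for synaptic attributes.
// row_ptr[i] .. row_ptr[i+1] delimits the synapses of post-synaptic
// neuron i within the flattened, contiguous arrays.
struct CSRMatrix {
    std::vector<int> row_ptr;   // size: num_post_neurons + 1
    std::vector<int> col_idx;   // pre-synaptic neuron index per synapse
    std::vector<double> values; // synaptic weight per synapse, contiguous
};

// Weighted sum of pre-synaptic rates for one post-synaptic neuron:
// a contiguous scan over values[], which is what makes coalesced
// memory access possible on the GPU.
double weighted_sum(const CSRMatrix& m,
                    const std::vector<double>& rates,
                    int post) {
    double sum = 0.0;
    for (int s = m.row_ptr[post]; s < m.row_ptr[post + 1]; ++s)
        sum += m.values[s] * rates[m.col_idx[s]];
    return sum;
}
```

Adding or removing a synapse in this layout requires shifting every entry after the insertion point and rebuilding the row pointers, which illustrates why structural plasticity is costly with CSR on the GPU.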
“…The weighted sum of inputs is, for example, executed in parallel over blocks of post-synaptic neurons with OpenMP. In contrast, the CUDA implementation uses a parallel reduction, as it leads to better performance (Dinkelbach et al., 2012). The main advantage of this code-generation approach is that only the required steps are generated: spike-only mechanisms are skipped for rate-coded networks, as are mechanisms for synaptic delays or structural plasticity if the network does not define them.…”
Section: Code Generation (mentioning)
confidence: 99%
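The OpenMP scheme mentioned in the quote, distributing post-synaptic neurons over threads while each per-neuron sum stays sequential, can be sketched as below. This is an illustrative sketch, not code from the cited simulator; the function and variable names are assumptions. On CUDA, by contrast, one would typically parallelize within a neuron, reducing the products w*r with a tree reduction in shared memory, which is the "parallel reduction" the quote refers to.

```cpp
#include <cassert>
#include <vector>

// Sketch of OpenMP parallelization over post-synaptic neurons:
// each thread computes the full weighted input sum for its block
// of neurons (dense weight matrix w[post][pre] for simplicity).
void update_inputs(const std::vector<std::vector<double>>& w,
                   const std::vector<double>& pre_rates,
                   std::vector<double>& post_input) {
    #pragma omp parallel for
    for (int post = 0; post < static_cast<int>(w.size()); ++post) {
        double sum = 0.0; // private per iteration, so no race condition
        for (std::size_t pre = 0; pre < w[post].size(); ++pre)
            sum += w[post][pre] * pre_rates[pre];
        post_input[post] = sum;
    }
}
```

Without OpenMP enabled (e.g., no `-fopenmp` flag), the pragma is ignored and the loop runs sequentially with identical results, which makes the scheme easy to test.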
“…The efficiency of GPUs in accelerating simulations in the field of bioinformatics has already been recognized [54]. The suitability of the network topology to the hardware appears to be a relevant issue, which is discussed in [55], along with additional important parameters such as the network size and its connectivity, memory alignment, and floating-point precision. Deep unsupervised learning in large-scale networks is presented in [56], evidencing a clear benefit of GPUs over CPUs.…”
Section: Architecture-level Realizations (mentioning)
confidence: 99%