2012
DOI: 10.3109/0954898x.2012.739292

Comparison of GPU- and CPU-implementations of mean-firing rate neural networks on parallel hardware

Abstract: Modern parallel hardware such as multi-core processors (CPUs) and graphics processing units (GPUs) have a high computational power which can be greatly beneficial to the simulation of large-scale neural networks. Over the past years, a number of efforts have focused on developing parallel algorithms and simulators best suited for the simulation of spiking neural models. In this article, we aim at investigating the advantages and drawbacks of the CPU and GPU parallelization of mean-firing rate neurons, widely u…

Cited by 20 publications (25 citation statements); references 27 publications.
“…On graphics cards with CUDA, the connectivity is stored in the compressed sparse row (CSR) format, where the values of each attribute are flattened into a single vector and a list of row pointers assigns portions of this array to a single post-synaptic neuron (see Brette and Goodman, 2011, for a review). These different data structures lead to better parallel performance: the CSR representation ensures coalesced access to the attributes (i.e., the data is contiguous in memory), which is a strong condition for GPU computations to be efficient (Brette and Goodman, 2012), while the LIL structure allows a faster distribution of the data to the different OpenMP threads (Dinkelbach et al., 2012). LIL and CSR representations have similar memory requirements, but LIL is better suited to the dynamic addition or removal of synapses: structural plasticity is very inefficient on the GPU platform and is currently disabled.…”
Section: Code Generation (mentioning)
confidence: 99%
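The CSR layout described above can be sketched as follows. This is a minimal illustration, not the cited simulator's actual data structures: the struct and function names are hypothetical, but the layout (one flattened value vector plus row pointers delimiting each post-synaptic neuron's synapses) follows the quoted description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal CSR container for synaptic attributes.
// row_ptr[i] .. row_ptr[i+1] delimits the synapses of post-synaptic
// neuron i within the flattened, contiguous arrays.
struct CSRMatrix {
    std::vector<int> row_ptr;   // size: num_post_neurons + 1
    std::vector<int> col_idx;   // pre-synaptic neuron index per synapse
    std::vector<double> values; // synaptic weight per synapse, contiguous
};

// Weighted sum of pre-synaptic rates for one post-synaptic neuron:
// a contiguous scan over values[], which is what makes coalesced
// memory access possible on the GPU.
double weighted_sum(const CSRMatrix& m,
                    const std::vector<double>& rates,
                    int post) {
    double sum = 0.0;
    for (int s = m.row_ptr[post]; s < m.row_ptr[post + 1]; ++s)
        sum += m.values[s] * rates[m.col_idx[s]];
    return sum;
}
```

Adding or removing a synapse in this layout requires shifting every entry after the insertion point and rebuilding the row pointers, which illustrates why structural plasticity is costly with CSR on the GPU.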
“…The weighted sum of inputs is, for example, executed in parallel over blocks of post-synaptic neurons with OpenMP. In contrast, the CUDA implementation uses a parallel reduction, as it leads to better performance (Dinkelbach et al., 2012). The main advantage of this code-generation approach is that only the required steps are generated: spike-only mechanisms are skipped for rate-coded networks, as are mechanisms for synaptic delays or structural plasticity if the network does not define them.…”
Section: Code Generation (mentioning)
confidence: 99%
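The OpenMP scheme mentioned in the quote, distributing post-synaptic neurons over threads while each per-neuron sum stays sequential, can be sketched as below. This is an illustrative sketch, not code from the cited simulator; the function and variable names are assumptions. On CUDA, by contrast, one would typically parallelize within a neuron, reducing the products w*r with a tree reduction in shared memory, which is the "parallel reduction" the quote refers to.

```cpp
#include <cassert>
#include <vector>

// Sketch of OpenMP parallelization over post-synaptic neurons:
// each thread computes the full weighted input sum for its block
// of neurons (dense weight matrix w[post][pre] for simplicity).
void update_inputs(const std::vector<std::vector<double>>& w,
                   const std::vector<double>& pre_rates,
                   std::vector<double>& post_input) {
    #pragma omp parallel for
    for (int post = 0; post < static_cast<int>(w.size()); ++post) {
        double sum = 0.0; // private per iteration, so no race condition
        for (std::size_t pre = 0; pre < w[post].size(); ++pre)
            sum += w[post][pre] * pre_rates[pre];
        post_input[post] = sum;
    }
}
```

Without OpenMP enabled (e.g., no `-fopenmp` flag), the pragma is ignored and the loop runs sequentially with identical results, which makes the scheme easy to test.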
“…The efficiency of GPUs in accelerating simulations in the field of bioinformatics has already been recognized [54]. The suitability of the network topology to the hardware appears to be a relevant issue, which is discussed in [55], along with additional important parameters such as the network size and its connectivity, memory alignment, and floating-point precision. Deep unsupervised learning in large-scale networks is presented in [56], evidencing a clear benefit of GPUs over CPUs.…”
Section: Architecture-level Realizations (mentioning)
confidence: 99%