1995
DOI: 10.1109/72.363436

Fast neural net simulation with a DSP processor array

Abstract: This paper describes the implementation of a fast neural net simulator on a novel parallel distributed-memory computer. A 60-processor system, named MUSIC (multiprocessor system with intelligent communication), is operational and runs the backpropagation algorithm at a speed of 330 million connection updates per second (continuous weight update) using 32-b floating-point precision. This is equal to 1.4 Gflops sustained performance. The complete system with 3.8 Gflops peak performance consumes less than 800 W o…
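Read together, the abstract's figures admit a quick back-of-the-envelope cross-check (these ratios are derived here from the abstract's own numbers and are not stated in the paper):

1.4 Gflops sustained / 330 million connection updates per second ≈ 4.2 floating-point operations per connection update
1.4 Gflops sustained / 3.8 Gflops peak ≈ 37% of peak performance
1.4 Gflops sustained / 800 W ≈ 1.75 Mflops per watt, a lower bound since the power draw is quoted as under 800 W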


Cited by 40 publications (9 citation statements)
References 15 publications
“…In the stochastic version the gradient estimates are noisy, but the parameters are updated much more often than with the batch version. An empirical result of considerable practical importance is that on tasks with large, redundant data sets, the stochastic version is considerably faster than the batch version, sometimes by orders of magnitude [117]. Although the reasons for this are not totally understood theoretically, an intuitive explanation can be found in the following extreme example.…”
Section: Appendix B Stochastic Gradient Versus Batch Gradient
confidence: 99%
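The quoted passage contrasts per-example (stochastic/online) weight updates with batch updates. Below is a minimal sketch of the two update rules on a deliberately redundant linear-regression problem; the data, learning rate, and function names are assumptions made purely for illustration and are not taken from the cited papers.

import numpy as np

# Toy, highly redundant data set: a small base set repeated 100 times.
# Everything below is illustrative, not from the cited papers.
rng = np.random.default_rng(0)
X_base = rng.normal(size=(10, 3))
w_true = np.array([1.5, -2.0, 0.5])
y_base = X_base @ w_true
X = np.tile(X_base, (100, 1))   # 1000 examples, only 10 distinct ones
y = np.tile(y_base, 100)

lr = 0.01  # fixed learning rate for both variants

def batch_epoch(w):
    # Batch gradient: average over all examples, one weight update per pass.
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def stochastic_epoch(w):
    # Stochastic gradient: noisy per-example gradient, one update per example,
    # so the weights move 1000 times per pass instead of once.
    for xi, yi in zip(X, y):
        w = w - lr * (xi @ w - yi) * xi
    return w

w_batch = np.zeros(3)
w_sgd = np.zeros(3)
for _ in range(5):  # same number of passes over the data for both
    w_batch = batch_epoch(w_batch)
    w_sgd = stochastic_epoch(w_sgd)

print("batch  error:", np.linalg.norm(w_batch - w_true))
print("online error:", np.linalg.norm(w_sgd - w_true))

On redundant data like this, both variants see the same information per pass, but the online version extracts a thousand updates from it while the batch version extracts one, which is the intuition behind the speed difference the citing papers describe.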
“…While the previous millennium saw several attempts at creating fast NN-specific hardware (e.g., Jackel et al, 1990; Faggin, 1992; Ramacher et al, 1993; Widrow et al, 1994; Heemskerk, 1995; Korkin et al, 1997; Urlbe, 1999), and at exploiting standard hardware (e.g., Anguita et al, 1994; Muller et al, 1995; Anguita and Gomes, 1996), the new millennium brought a DL breakthrough in form of cheap, multiprocessor graphics cards or GPUs. GPUs are widely used for video games, a huge and competitive market that has driven down hardware prices.…”
Section: Fast Graphics Processing Units (GPUs) for DL in NNs
confidence: 99%
“…On large, redundant data sets, the online version converges much faster than the batch version, sometimes by orders of magnitude (Müller, Gunzinger and Guggenbühl, 1995). An intuitive explanation can be found in the following extreme example.…”
Section: Multi-layer Network
confidence: 97%
“…In these early days, the algorithmic simplicity of online algorithms was a requirement. This is still the case when it comes to handling large, real-life training sets (Müller, Gunzinger and Guggenbühl, 1995).…”
Section: Introduction
confidence: 99%