1995
DOI: 10.1109/72.363436

Fast neural net simulation with a DSP processor array

Abstract: This paper describes the implementation of a fast neural net simulator on a novel parallel distributed-memory computer. A 60-processor system, named MUSIC (multiprocessor system with intelligent communication), is operational and runs the backpropagation algorithm at a speed of 330 million connection updates per second (continuous weight update) using 32-b floating-point precision. This is equal to 1.4 Gflops sustained performance. The complete system with 3.8 Gflops peak performance consumes less than 800 W o…
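Read together, the abstract's figures admit a quick back-of-the-envelope cross-check (these ratios are derived here from the abstract's own numbers and are not stated in the paper):

1.4 Gflops sustained / 330 million connection updates per second ≈ 4.2 floating-point operations per connection update
1.4 Gflops sustained / 3.8 Gflops peak ≈ 37% of peak performance
1.4 Gflops sustained / 800 W ≈ 1.75 Mflops per watt, a lower bound since the power draw is quoted as under 800 W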


Cited by 40 publications (9 citation statements)
References 15 publications
“…In the stochastic version the gradient estimates are noisy, but the parameters are updated much more often than with the batch version. An empirical result of considerable practical importance is that on tasks with large, redundant data sets, the stochastic version is considerably faster than the batch version, sometimes by orders of magnitude [117]. Although the reasons for this are not totally understood theoretically, an intuitive explanation can be found in the following extreme example.…”
Section: Appendix B Stochastic Gradient Versus Batch Gradient
confidence: 99%
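The quoted passage contrasts per-example (stochastic/online) weight updates with batch updates. Below is a minimal sketch of the two update rules on a deliberately redundant linear-regression problem; the data, learning rate, and function names are assumptions made purely for illustration and are not taken from the cited papers.

import numpy as np

# Toy, highly redundant data set: a small base set repeated 100 times.
# Everything below is illustrative, not from the cited papers.
rng = np.random.default_rng(0)
X_base = rng.normal(size=(10, 3))
w_true = np.array([1.5, -2.0, 0.5])
y_base = X_base @ w_true
X = np.tile(X_base, (100, 1))   # 1000 examples, only 10 distinct ones
y = np.tile(y_base, 100)

lr = 0.01  # fixed learning rate for both variants

def batch_epoch(w):
    # Batch gradient: average over all examples, one weight update per pass.
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def stochastic_epoch(w):
    # Stochastic gradient: noisy per-example gradient, one update per example,
    # so the weights move 1000 times per pass instead of once.
    for xi, yi in zip(X, y):
        w = w - lr * (xi @ w - yi) * xi
    return w

w_batch = np.zeros(3)
w_sgd = np.zeros(3)
for _ in range(5):  # same number of passes over the data for both
    w_batch = batch_epoch(w_batch)
    w_sgd = stochastic_epoch(w_sgd)

print("batch  error:", np.linalg.norm(w_batch - w_true))
print("online error:", np.linalg.norm(w_sgd - w_true))

On redundant data like this, both variants see the same information per pass, but the online version extracts a thousand updates from it while the batch version extracts one, which is the intuition behind the speed difference the citing papers describe.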
“…While the previous millennium saw several attempts at creating fast NN-specific hardware (e.g., Jackel et al, 1990; Faggin, 1992; Ramacher et al, 1993; Widrow et al, 1994; Heemskerk, 1995; Korkin et al, 1997; Urlbe, 1999), and at exploiting standard hardware (e.g., Anguita et al, 1994; Muller et al, 1995; Anguita and Gomes, 1996), the new millennium brought a DL breakthrough in form of cheap, multiprocessor graphics cards or GPUs. GPUs are widely used for video games, a huge and competitive market that has driven down hardware prices.…”
Section: Fast Graphics Processing Units (GPUs) for DL in NNs
confidence: 99%
“…On large, redundant data sets, the online version converges much faster than the batch version, sometimes by orders of magnitude (Müller, Gunzinger and Guggenbühl, 1995). An intuitive explanation can be found in the following extreme example.…”
Section: Multi-layer Network
confidence: 97%
“…In these early days, the algorithmic simplicity of online algorithms was a requirement. This is still the case when it comes to handling large, real-life training sets (Müller, Gunzinger and Guggenbühl, 1995).…”
Section: Introduction
confidence: 99%