2017
DOI: 10.3389/fnins.2017.00496

Hardware-Efficient On-line Learning through Pipelined Truncated-Error Backpropagation in Binary-State Networks

Abstract: Artificial neural networks (ANNs) trained using backpropagation are powerful learning architectures that have achieved state-of-the-art performance in various benchmarks. Significant effort has been devoted to developing custom silicon devices to accelerate inference in ANNs. Accelerating the training phase, however, has attracted relatively little attention. In this paper, we describe a hardware-efficient on-line learning technique for feedforward multi-layer ANNs that is based on pipelined backpropagation. …

Cited by 9 publications (10 citation statements)
References 47 publications

“…Uniformly distributed pseudo-random numbers can be generated cheaply using a linear feedback shift register (LFSR) (Klein, 2013), as implemented in Mostafa et al. (2017) on FPGAs. Generating new random numbers from LFSRs is computationally very cheap, as it involves only a few bit-wise XOR operations and no MAC operations.…”
Section: Methods
Mentioning confidence: 99%
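
As an aside on the LFSR technique described in the statement above: a Fibonacci LFSR produces uniformly distributed pseudo-random bits using only shifts and XORs, which is why no MAC operations are needed. Below is a minimal Python sketch; the 16-bit width, seed, and tap positions are illustrative assumptions and are not the configuration used in Mostafa et al. (2017).

```python
def lfsr_bits(seed=0xACE1, taps=(16, 14, 13, 11), width=16):
    """Yield one pseudo-random bit per step from a Fibonacci LFSR.

    The width, seed, and taps here are illustrative; any maximal-length
    tap set for the chosen width works the same way.
    """
    state = seed & ((1 << width) - 1)
    assert state != 0, "LFSR state must be non-zero"
    while True:
        out = state & 1                        # bit shifted out this cycle
        fb = 0
        for t in taps:                         # XOR the tapped bits -> feedback
            fb ^= (state >> (width - t)) & 1
        state = (state >> 1) | (fb << (width - 1))
        yield out

gen = lfsr_bits()
print([next(gen) for _ in range(32)])          # first 32 pseudo-random bits
```

In hardware the same structure is just a shift register plus a handful of XOR gates, consistent with the "only a few bit-wise XOR operations" point in the quoted statement.
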
“…For example, in the scheme used in Cauwenberghs (1996), the number of flip flops needed to produce N_b random bits per cycle grows with . This scheme was adopted in Mostafa et al. (2017), which used 60 flip flops and 650 XOR gates to generate 320 random bits every clock cycle. In the training experiments reported in this paper, we did not use an explicit LFSR, but instead used a standard software-based random number generator.…”
Section: Methods
Mentioning confidence: 99%
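
The last sentence of the statement above mentions replacing the explicit LFSR with a standard software random number generator during training experiments. A minimal sketch of that alternative, assuming NumPy's Generator API (the 320 bits per step simply mirrors the figure quoted above and is otherwise arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # standard software PRNG instead of an LFSR
N_b = 320                             # bits per step; mirrors the quoted figure

def random_bits(n=N_b):
    """Return n i.i.d. uniform 0/1 bits as an int8 array."""
    return rng.integers(0, 2, size=n, dtype=np.int8)

bits = random_bits()
print(bits[:16], "fraction of ones:", bits.mean())
```
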
“…A proposed solution to reduce the computational complexity and optimize memory resources is the use of pipelined backpropagation [31] and binary-state networks [32]. In a binary-state network, the output of a neuron is a unipolar (0/1) or bipolar (−1/+1) binary value.…”
Section: B. Backpropagation and Variants
Mentioning confidence: 99%
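
The unipolar (0/1) and bipolar (−1/+1) binary neuron outputs mentioned in this statement are simple thresholding functions. The sketch below also includes a straight-through-style surrogate gradient, one common way to backpropagate through such activations; the threshold-at-zero convention and the surrogate choice are assumptions for illustration, not details taken from the cited works.

```python
import numpy as np

def binary_unipolar(x):
    """Unipolar binary neuron: output in {0, 1}."""
    return (x >= 0).astype(np.float32)

def binary_bipolar(x):
    """Bipolar binary neuron: output in {-1, +1}."""
    return np.where(x >= 0, 1.0, -1.0).astype(np.float32)

def straight_through_grad(x, clip=1.0):
    """Surrogate derivative: pass gradients where |x| <= clip, block elsewhere.
    (A common convention, assumed here for illustration.)"""
    return (np.abs(x) <= clip).astype(np.float32)

pre_act = np.array([-1.5, -0.2, 0.0, 0.7, 2.3])
print(binary_unipolar(pre_act))        # [0. 0. 1. 1. 1.]
print(binary_bipolar(pre_act))         # [-1. -1.  1.  1.  1.]
print(straight_through_grad(pre_act))  # [0. 1. 1. 1. 0.]
```
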
“…However, they realized the idea only for a 3-layer perceptron on a torus of 16 processors. Mostafa et al. [13] implemented a proof-of-concept validation of pipelined backpropagation training for a 3-layer, fully connected binary-state neural network with truncated error on an FPGA. However, that implementation does not have coarse-grained, layer-wise pipelined parallelization.…”
Section: Literature Review
Mentioning confidence: 99%
“…Existing pipelined training approaches either avoid the use of stale weights (e.g., with the use of microbatches [8]), constrain the training to ensure the consistency of the weights within an accelerator (e.g., using weight stashing [9]), utilize weight adjustments (e.g., weight prediction [11]), or limit the use of pipelining to very small networks (e.g., [13]). However, these approaches underutilize accelerators [8], inflate memory usage to stash multiple copies of weights [9], or are unable to handle large networks [13].…”
Section: Introduction
Mentioning confidence: 99%
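
To make the stale-weight issue in pipelined training concrete, here is a small toy: delayed-gradient SGD on a linear model, where each update consumes an error term computed D steps earlier, as happens when forward and backward passes of different samples overlap in a pipeline. The model, delay, and learning rate are arbitrary illustrative choices, not the setup of any of the cited approaches.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 2                        # pipeline delay, in samples (illustrative)
lr = 0.05
w_true = np.array([1.5, -2.0])
w = np.zeros(2)
pending = []                 # (input, error) pairs awaiting their delayed update

for t in range(500):
    x = rng.normal(size=2)
    y = w_true @ x
    err = (w @ x) - y                  # error computed with the current weights
    pending.append((x, err))
    if len(pending) > D:               # the matching update arrives D steps later
        x_old, err_old = pending.pop(0)
        w -= lr * err_old * x_old      # gradient is stale by D steps

print("learned:", w, "target:", w_true)
```

With a small learning rate and a short delay, the toy still converges, which is the intuition behind accepting some staleness in exchange for pipeline utilization.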