2018
DOI: 10.48550/arxiv.1804.10574
Preprint
Decoupled Parallel Backpropagation with Convergence Guarantee

Cited by 12 publications (22 citation statements)
References 0 publications
“…Remark 1 We remark that a different definition of the speedup ratio is used in (Huo et al., 2018b). There, the parallel training process is terminated once its test accuracy is comparable to that of the layer-serial training method, and the speedup is then defined as the ratio of the serial execution time to the parallel execution time.…”
Section: Methods
confidence: 99%
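The speedup definition quoted above can be sketched in a few lines: stop the parallel run as soon as its test accuracy matches the layer-serial baseline, then take the ratio of wall-clock times. The function name and all numbers below are illustrative assumptions, not measurements from the cited papers.

```python
def speedup_at_matching_accuracy(serial_log, parallel_log, target_acc):
    """Each log is a list of (elapsed_seconds, test_accuracy) tuples.

    Returns T_serial / T_parallel, where each T is the first time the
    corresponding run reaches `target_acc`.
    """
    def time_to_reach(log):
        for t, acc in log:
            if acc >= target_acc:
                return t
        raise ValueError("run never reached the target accuracy")

    return time_to_reach(serial_log) / time_to_reach(parallel_log)

# Made-up evaluation logs: (seconds elapsed, test accuracy).
serial = [(100, 0.85), (200, 0.91), (300, 0.93)]
parallel = [(60, 0.84), (120, 0.90), (180, 0.93)]

print(speedup_at_matching_accuracy(serial, parallel, target_acc=0.93))
# 300 s / 180 s, i.e. roughly a 1.67x speedup under this definition
```

Note that under this definition the speedup depends on the chosen target accuracy, which is why the remark above flags it as distinct from a per-iteration timing comparison.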
“…We then compare our joint learning approach with other representative methods from the literature for solving the image classification task (1) on the CIFAR-10 dataset. Recall from Table 2 or Table 1 that the algorithmic locking issues of BP are addressed by neither PipeDream nor GPipe, while the loss function of DGL methods is not consistent with the original learning task; the comparison is therefore made against the straightforward implementation of the penalty method without data augmentation (Gotmare et al., 2018), FR (Huo et al., 2018a), DDG (Huo et al., 2018b), and BP (He et al., 2016a). We follow exactly the same setup as (Huo et al., 2018b,a) and report the experimental results in Table 9.…”
Section: Comparison To Other Methods
confidence: 99%
“…We survey three categories of BP literature: (i) better hardware implementations of BP [15,16,31,11,32,25], (ii) workarounds that approximate BP [33,7,10], and (iii) biologically inspired algorithms. Biologically inspired algorithms can be further divided into four types: (i) works inspired by biological observations [29,7,26,17], which try to approximate BP with the intention of resolving its biological implausibility, (ii) propagation of an alternative to the error [19,21], (iii) leveraging local errors, the power of single-layer networks, and layer-wise pre-training to approximate BP [24,23,3], and (iv) resolving the locking problem using decoupling [14,6,12,1,20] and its variants [27,8,22,4]. We were deeply motivated by (ii), (iii), and (iv) while coming up with the idea of 'front contributions'; specifically, propagating something other than the error, the idea of a single-layer network, and decoupling collectively inspire 'front contributions'.…”
Section: Introduction and Related Work
confidence: 99%