2021
DOI: 10.48550/arXiv.2112.05387
Preprint

Layer-Parallel Training of Residual Networks with Auxiliary-Variable Networks

Abstract: Gradient-based methods for the distributed training of residual networks (ResNets) typically require a forward pass of the input data, followed by back-propagating the error gradient to update model parameters, which becomes time-consuming as the network goes deeper. To break the algorithmic locking and exploit synchronous module parallelism in both the forward and backward modes, auxiliary-variable methods have attracted much interest lately but suffer from significant communication overhead and lack of data …
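
To make the abstract's idea concrete, the sketch below illustrates the general penalty-based auxiliary-variable approach to decoupling residual blocks: each block's input/output activations become trainable auxiliary variables, so block parameters can be updated from local losses without a full end-to-end backward pass. This is only an illustration of that generic idea under stated assumptions (the toy ResBlock, the penalty weight rho, the learning rates, and the alternating update schedule are all invented here); it is not the specific algorithm or the auxiliary-variable networks proposed in the paper.

import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Toy residual block; stands in for one module of a deep ResNet."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.net(x)

dim, depth, n = 16, 4, 32
blocks = [ResBlock(dim) for _ in range(depth)]
head = nn.Linear(dim, 1)
x = torch.randn(n, dim)
y = torch.randn(n, 1)
rho = 1.0  # penalty weight coupling auxiliary variables to block outputs (illustrative)

# Auxiliary variables a[k] stand in for the output of block k on this batch,
# so each block only needs its own (a[k-1], a[k]) pair -- no cross-block backprop.
with torch.no_grad():
    a = [x]
    for blk in blocks:
        a.append(blk(a[-1]))
a = [v.clone().requires_grad_(k > 0) for k, v in enumerate(a)]

block_opts = [torch.optim.SGD(b.parameters(), lr=1e-2) for b in blocks]
head_opt = torch.optim.SGD(head.parameters(), lr=1e-2)
aux_opt = torch.optim.SGD([v for v in a if v.requires_grad], lr=1e-2)

for step in range(100):
    # (1) Block updates: each block fits a[k-1] -> a[k] from detached targets,
    #     so these updates are independent and could run on separate devices.
    for k, (blk, opt) in enumerate(zip(blocks, block_opts)):
        opt.zero_grad()
        loss_k = rho * ((blk(a[k].detach()) - a[k + 1].detach()) ** 2).mean()
        loss_k.backward()
        opt.step()

    # (2) Head and auxiliary-variable updates re-couple the blocks through the
    #     task loss plus the penalty terms.
    head_opt.zero_grad()
    aux_opt.zero_grad()
    loss = ((head(a[-1]) - y) ** 2).mean()
    for k, blk in enumerate(blocks):
        loss = loss + rho * ((blk(a[k]) - a[k + 1]) ** 2).mean()
    loss.backward()
    head_opt.step()
    aux_opt.step()

Because step (1) touches only one block's parameters and its two (detached) auxiliary variables, those updates carry no sequential dependency across blocks, which is the source of the module parallelism the abstract refers to; the communication cost the abstract mentions comes from exchanging the auxiliary variables between blocks.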

Citations: Cited by 0 publications
References: 29 publications (62 reference statements)