2017
DOI: 10.3390/s17102172

A Parameter Communication Optimization Strategy for Distributed Machine Learning in Sensors

Abstract: To exploit the distributed nature of sensors, distributed machine learning has become the mainstream approach, but differences in the computing capabilities of sensors and network delays greatly influence the accuracy and the convergence rate of the machine learning model. Our paper describes a reasonable parameter communication optimization strategy to balance the training overhead and the communication overhead. We extend the fault tolerance of iterative-convergent machine learning algorithms and p…

Cited by 10 publications (2 citation statements)
References 29 publications
“…3) Bulk Synchronous Parallel: In distributed computing systems, each computing node has different computing power from the other nodes because of real-world conditions. For this reason, distributed ML training uses iterations to coordinate synchronization among all computing nodes [12]. In the synchronous update known as bulk synchronous parallel (BSP) [13], the replicas submit their gradients, after the local training process at every iteration or mini-batch, to the global model parameters or to other replicas.…”
Section: A. Artificial Neural Network (citation type: mentioning)
confidence: 99%
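
For readers unfamiliar with BSP, the following is a minimal single-process sketch of the update pattern described in this statement: every worker computes a gradient on its local mini-batch, and the global parameters advance only after all gradients have been collected at the barrier. The compute_gradient function, the synthetic batches, and the in-process list of "workers" are illustrative assumptions; a real deployment would exchange gradients over the network, for example through a parameter server or an all-reduce step.

import numpy as np

def compute_gradient(params, batch):
    # Placeholder local step: least-squares gradient on one mini-batch.
    X, y = batch
    return 2.0 * X.T @ (X @ params - y) / len(y)

def bsp_train(params, worker_batches, lr=0.01, iterations=100):
    for _ in range(iterations):
        # Each worker computes a gradient on its local mini-batch ...
        grads = [compute_gradient(params, batch) for batch in worker_batches]
        # ... then all workers meet at the synchronization barrier: the global
        # parameters are updated only after every gradient has arrived.
        params = params - lr * np.mean(grads, axis=0)
    return params

# Example: three "workers", each holding a small synthetic regression batch.
rng = np.random.default_rng(0)
batches = [(rng.normal(size=(32, 4)), rng.normal(size=32)) for _ in range(3)]
w = bsp_train(np.zeros(4), batches)

Because the slowest worker gates every iteration, BSP keeps all replicas consistent but wastes the idle time of faster nodes, which is the motivation for the relaxed consistency models discussed in the next statement.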
“…If the worker node does not reach the staleness threshold during the training process, the accuracy cannot be guaranteed. Our previous work improved on this and proposed a dynamic synchronous parallel (DSP) model based on dynamic finite fault tolerance [41], [42]. DSP adds an optional condition for entering the synchronization barrier.…”
Section: Consistency Model (citation type: mentioning)
confidence: 99%
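
As a rough illustration of the staleness check this statement refers to (a minimal sketch of a stale-synchronous-style rule, not the cited dynamic finite fault-tolerance condition itself), a worker is allowed to continue only while it stays within a bounded number of iterations of the slowest worker; the may_proceed helper and the staleness value of 3 are hypothetical.

def may_proceed(worker_clock, all_clocks, staleness):
    # A worker may start its next iteration only while it is at most
    # `staleness` iterations ahead of the slowest worker; otherwise it
    # must wait at the synchronization barrier.
    return worker_clock - min(all_clocks) <= staleness

print(may_proceed(10, [6, 9, 10], staleness=3))  # False -> wait at the barrier
print(may_proceed(10, [8, 9, 10], staleness=3))  # True  -> keep training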