2022 14th International Conference on COMmunication Systems & NETworkS (COMSNETS)
DOI: 10.1109/comsnets53615.2022.9668515

DEFER: Distributed Edge Inference for Deep Neural Networks

Abstract: Edge inference is becoming ever more prevalent through its applications, from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet there is no production-ready orchestration system for deploying deep learning models over such edge networks that adopts the robustness and scalability of the cloud. We present SEIFER, a framework utilizing a standalone Kubernetes cluster to partition a given DNN and place these partitions in a distributed manner across an edge …
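The core idea the abstract describes, cutting a chain of layers into partitions that can be placed on separate edge nodes, can be sketched in a few lines. This is a minimal illustration assuming a Keras Sequential model; the function name partition_model and the split indices are ours for illustration, not DEFER's or SEIFER's API.

    import tensorflow as tf

    def partition_model(model, split_points):
        # Cut a Sequential model into len(split_points)+1 sub-models;
        # e.g. split_points=[2, 5] yields layers [0:2], [2:5], [5:],
        # each independently runnable on its own device.
        bounds = [0] + list(split_points) + [len(model.layers)]
        return [tf.keras.Sequential(model.layers[lo:hi])
                for lo, hi in zip(bounds[:-1], bounds[1:])]

    # Running the partitions back-to-back reproduces the full model:
    # for part in partition_model(model, [2, 5]): x = part(x)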

Cited by 15 publications (22 citation statements)
References 13 publications
“…Each worker node is assigned only some of the partitions, which are processed and then reduced back to the central node, thus generating the input for the next layer considered in the Map procedure. The pipelined architecture [77, 79, 85, 91] conceives the workflow as a sequence of n computation stages, corresponding to the n nodes in the hardware infrastructure, and n-1 communication steps for transferring intermediate results between adjacent devices [77]. Both the computation nodes and the execution flow are typically predetermined at configuration time, which simplifies task assignment to a mere sequential ordering and circumscribes the pursued objectives to finding the split points that optimize the performance of the deployed CNN.…”
Section: In Situ Distributed Intelligence · Citation type: mentioning · Confidence: 99%
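The n-stage, (n-1)-link structure this excerpt describes can be emulated compactly: each stage is a worker that consumes from its inbound queue and forwards to the next. This is a single-process sketch, with queues standing in for the device-to-device links and plain callables standing in for the per-node model partitions; none of it is the cited systems' code.

    import threading, queue

    def run_pipeline(stages, inputs):
        # n stages need n+1 queues: the first feeds the pipeline, the
        # last collects results, and the n-1 in between model the links.
        links = [queue.Queue() for _ in range(len(stages) + 1)]

        def worker(fn, inbox, outbox):
            while (x := inbox.get()) is not None:  # None signals shutdown
                outbox.put(fn(x))
            outbox.put(None)                       # propagate shutdown

        threads = [threading.Thread(target=worker,
                                    args=(fn, links[i], links[i + 1]))
                   for i, fn in enumerate(stages)]
        for t in threads:
            t.start()
        for x in inputs:
            links[0].put(x)
        links[0].put(None)

        results = []
        while (y := links[-1].get()) is not None:
            results.append(y)
        for t in threads:
            t.join()
        return results

    # Three toy "partitions" pipelined over three logical nodes:
    # run_pipeline([lambda x: x + 1, lambda x: x * 2, lambda x: x - 3], range(5))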
“…Backed, in terms of communication, by the pipelined architecture introduced in the previous section, pipeline parallelism [77, 79, 82, 85, 88, 90, 91] constitutes the simplest way to distribute the inference workload. It is a parallelism modality inherent to the traditional chain-like architecture of DNNs, which typically consists of a sequence of layers in which each layer’s output depends on the outputs of the preceding layers.…”
Section: DNN Partitioning and Parallelism for Collaborative Inference · Citation type: mentioning · Confidence: 99%
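A quick back-of-envelope model makes the appeal of this modality concrete: a single input still pays every stage and transfer time, but once the pipeline is full, outputs emerge at the pace of the slowest step. The timings below are invented for illustration, not measurements from any of the cited papers.

    # Three stages and two links, illustrative numbers only:
    stage_ms = [12.0, 9.5, 14.0]   # per-partition compute time per input
    link_ms  = [3.0, 3.0]          # n-1 intermediate-tensor transfers

    latency_ms = sum(stage_ms) + sum(link_ms)  # 41.5 ms for one input
    bottleneck_ms = max(stage_ms + link_ms)    # 14.0 ms between outputs
                                               # at steady state
    print(f"latency: {latency_ms} ms; one output every {bottleneck_ms} ms")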
“…Jouhari et al. [14], in order to achieve inference of complex DNN models on unmanned aerial vehicles (UAVs) while avoiding the additional communication delays generated in the air and on the ground, proposed a method for dynamic collaborative DNN model inference over UAV air-to-air communication, which improves real-time DNN inference while effectively utilizing the storage and computational resources of the UAVs. DEFER [15] proposed a distributed edge inference framework that partitions the model and performs distributed inference on resource-constrained devices, effectively reducing device energy consumption.…”
Section: D2D Inference · Citation type: mentioning · Confidence: 99%
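The hand-off both systems rely on, one device shipping its partition's intermediate tensor to the next, can be sketched as a length-prefixed transfer over a socket. The wire format here (pickled NumPy array with an 8-byte length prefix) is our assumption for illustration, not DEFER's actual protocol.

    import pickle, socket, struct
    import numpy as np

    def send_tensor(sock, tensor):
        # Serialize the intermediate activation and prefix its length
        # so the receiver knows how many bytes to expect.
        payload = pickle.dumps(tensor)
        sock.sendall(struct.pack("!Q", len(payload)) + payload)

    def recv_tensor(sock):
        (length,) = struct.unpack("!Q", _recv_exact(sock, 8))
        return pickle.loads(_recv_exact(sock, length))

    def _recv_exact(sock, n):
        # TCP delivers a stream, not messages, so loop until n bytes arrive.
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed mid-message")
            buf += chunk
        return buf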