2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)
DOI: 10.1109/asp-dac52403.2022.9712574
Efficient Computer Vision on Edge Devices with Pipeline-Parallel Hierarchical Neural Networks

Cited by 15 publications (9 citation statements). References 19 publications.
“…Backed, in terms of the communication, by the pipelined architecture introduced in the previous section, pipeline parallelism [ 77 , 79 , 82 , 85 , 88 , 90 , 91 ] constitutes the simplest way to distribute the inference workload. It is a parallelism modality inherent to the traditional chain-like architecture of DNNs, which typically consists of a sequence of layers in which each layer’s output is dependent on the output provided by its previous layers.…”
Section: DNN Partitioning and Parallelism for Collaborative Inference
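The statement above describes pipeline parallelism over the chain-like structure of a DNN: the layer sequence is cut into contiguous stages, each stage runs on its own device, and intermediate results flow from stage to stage so that several inputs are in flight at once. The minimal sketch below illustrates the idea with plain Python threads and queues standing in for devices and communication links; the two-stage split and the toy "layers" are illustrative assumptions, not the partitioning used by any of the cited works.

```python
# Minimal sketch of pipeline parallelism over a chain-like DNN
# (hypothetical stage split; threads and queues stand in for devices and links).
import queue, threading

def make_stage(stage_layers, in_q, out_q):
    """Run a contiguous slice of the layer chain on its own 'device' (thread)."""
    def worker():
        while True:
            x = in_q.get()
            if x is None:                 # shutdown signal, forwarded downstream
                out_q.put(None)
                break
            for layer in stage_layers:    # each layer consumes its predecessor's output
                x = layer(x)
            out_q.put(x)                  # intermediate result sent to the next device
    return threading.Thread(target=worker, daemon=True)

# Toy 4-layer "DNN": each layer is just a function adding a constant.
layers = [lambda x, i=i: x + i for i in range(4)]

# Split the chain into two stages (one per device); queues model the links.
q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
stages = [make_stage(layers[:2], q0, q1), make_stage(layers[2:], q1, q2)]
for s in stages:
    s.start()

# Several frames are in flight at once: while the second device processes
# frame k, the first device is already working on frame k+1.
for frame in range(3):
    q0.put(frame)
q0.put(None)

results = []
while (r := q2.get()) is not None:
    results.append(r)
print(results)   # [6, 7, 8]: each frame passed through all four layers (+0+1+2+3)
```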
“…More specifically, the objective of horizontally distributing a DNN to parallelize the computations required for inference is to improve the performance of the entire system, mainly in terms of time-specific metrics. Essentially it seeks to minimize latency [ 70 , 71 , 75 , 78 , 79 , 80 , 81 , 82 , 87 , 88 , 89 , 90 ] or maximize throughput, i.e., the number of performed inferences per second, when dealing with data streams [ 72 , 76 , 77 , 85 ], and, to a lesser extent and in terms of energy efficiency, to minimize power consumption in a few studies [ 79 , 85 , 86 ]. In particular, as regards the objective of minimizing latency, in nearly all the studies analyzed, this refers to the time cost incurred for the end-to-end execution of the exploited DNN model, being computed as the sum of the computation cost derived from the execution of the DNN partitions in the different devices involved and the transmission cost reflecting the time necessary to communicate intermediate results between nodes.…”
Section: DNN Partitioning and Parallelism for Collaborative Inference
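The latency objective described in the statement above is the sum of the computation cost of each partition on its device and the transmission cost of the intermediate results exchanged between consecutive nodes. The following sketch of that accounting uses hypothetical numbers; the function name, cost figures, and bandwidths are illustrative and not taken from any cited study.

```python
# Illustrative end-to-end latency model for a partitioned DNN:
# total latency = sum of per-device computation time
#               + time to transmit intermediate results between consecutive devices.

def end_to_end_latency(compute_ms, intermediate_mb, link_mbps):
    """compute_ms[i]      : computation time of partition i on its device (ms)
       intermediate_mb[i] : size of the tensor sent from partition i to i+1 (MB)
       link_mbps[i]       : bandwidth of the link between device i and i+1 (Mbit/s)"""
    comp = sum(compute_ms)
    comm = sum(8 * size / bw * 1000            # MB -> Mbit, s -> ms
               for size, bw in zip(intermediate_mb, link_mbps))
    return comp + comm

# Example: three partitions on three edge devices, two inter-device links.
print(end_to_end_latency(compute_ms=[12.0, 18.0, 9.0],
                         intermediate_mb=[0.5, 0.25],
                         link_mbps=[100, 100]))   # 39 ms compute + 60 ms transfer = 99.0 ms
```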
“…Therefore, we need to consider how to minimize the total cost of device-device collaborative inference in application scenarios to improve task performance. Goel et al. [81] verify that the hierarchical DNN architecture is well suited to parallel processing on multiple edge devices, and create a parallel inference system for hierarchical DNNs targeting computer vision problems. The method balances the load between cooperating devices and reduces the communication cost, so that multiple video frames can be processed simultaneously at higher throughput.…”
Section: Total Cost Minimization
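The statement above credits the cited system with balancing load across cooperating edge devices so that several frames can be processed concurrently. The sketch below shows one generic way such a balanced assignment could be computed, a greedy longest-processing-time heuristic; the node names, per-node costs, and the heuristic itself are illustrative assumptions, not the partitioning algorithm of the cited paper.

```python
# Sketch: balance the components of a hierarchical DNN across edge devices
# with a greedy longest-processing-time heuristic (illustrative only).
import heapq

def balance(nodes, num_devices):
    """nodes: dict {node_name: estimated per-frame cost (ms)}.
       Returns device id -> list of assigned hierarchy nodes."""
    heap = [(0.0, d) for d in range(num_devices)]      # (current load, device id)
    heapq.heapify(heap)
    assignment = {d: [] for d in range(num_devices)}
    for name, cost in sorted(nodes.items(), key=lambda kv: -kv[1]):
        load, dev = heapq.heappop(heap)                # pick the least-loaded device
        assignment[dev].append(name)
        heapq.heappush(heap, (load + cost, dev))
    return assignment

# Hypothetical hierarchy: a root classifier plus three specialized children.
nodes = {"root": 20.0, "child_animals": 12.0, "child_vehicles": 9.0, "child_other": 6.0}
print(balance(nodes, num_devices=2))
# {0: ['root', 'child_other'], 1: ['child_animals', 'child_vehicles']}
```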