“…Model parallelism [70, 71, 72, 73, 74, 75, 76, 78, 80, 81, 83, 84, 86, 87, 89] attempts to address both issues by employing finer-grained partitioning strategies that produce less expensive DNN subtasks, i.e., partitions with fewer parameters and lower computation requirements than a layer, and by fostering more adaptable co-inference schemes. The computations required for a single input are distributed across multiple computing entities, reducing the time needed to process the shared input [76]; however, unlike pipeline parallelism, the resulting performance depends heavily on how these computations are distributed across devices.…”
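
To make the idea concrete, the following is a minimal illustrative sketch (not drawn from any of the cited works) of intra-layer model parallelism for a fully-connected layer: the weight matrix is split column-wise so that each "device" (here simulated by a worker thread) holds fewer parameters and performs a cheaper sub-computation on the same shared input, and the partial outputs are then concatenated. The variable names and the uneven partition sizes are purely hypothetical.

```python
# Sketch of intra-layer model parallelism, assuming devices can be simulated
# by worker threads and that the layer is a simple fully-connected layer.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(0)
x = rng.standard_normal(512)            # shared input for a single inference
W = rng.standard_normal((512, 1024))    # full layer weights
b = rng.standard_normal(1024)

# Uneven column split: partition sizes can be matched to each device's capability.
split_points = [256, 640]               # -> partitions of 256, 384, and 384 columns
W_parts = np.split(W, split_points, axis=1)
b_parts = np.split(b, split_points)

def run_on_device(part):
    """Sub-task executed by one device: a column slice of the layer's matmul."""
    W_i, b_i = part
    return x @ W_i + b_i

with ThreadPoolExecutor(max_workers=len(W_parts)) as pool:
    partial_outputs = list(pool.map(run_on_device, zip(W_parts, b_parts)))

y = np.concatenate(partial_outputs)     # identical to the unpartitioned layer output
assert np.allclose(y, x @ W + b)
```

In such a scheme, the wall-clock latency for the shared input is governed by the slowest partition, which is why the performance is so sensitive to how the computation is apportioned across heterogeneous devices.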