Deep learning has dominated the last decade as the go-to technology for data processing. Beyond that, deep learning is also a promising candidate for replacing traditional compression algorithms. Uniting these two capabilities, the study of image compression in scenarios where the data will later be consumed by a deep neural network presents a unique frontier of exploration, offering insights into how neural networks become efficient in both computation and information usage during inference. In this dissertation, we present a collection of three works that explore this frontier in the realm of embedded devices. We first introduce the notion of splitting neural networks as a form of compression for image classification, exploring how the compressibility of representations evolves through the layers of a model. We then study how this approach can be leveraged for object detection, and present a methodology for building flexible models that accommodate fluctuating operational requirements of computation and bandwidth. Finally, we turn special attention to the role of this technology in augmented reality, providing a further improved design, likewise flexible and with strong hardware/software synergy, based on an ensemble of encoders that can be scaled in size at run time. All designs are evaluated on a target device, with extensive comparison to the literature.