AI Benchmark: Running Deep Neural Networks on Android Smartphones

Ignatov, Andrey; Timofte, Radu; Chou, William; Wang, Ke; Wu, Max C.; Hartley, Tim; Gool, Luc Van

doi:10.48550/arxiv.1810.01109

Cited by 8 publications

(12 citation statements)

References 0 publications


“…Table 2 summarizes their specifications. Note we only use the smartphone with DSP rather than that with NPU, since 1) NPUs are only programmable with vendor-provided Software Development Kits (SDKs) which have not been publicly released yet [42], and 2) DSPs in recent mobile SoCs are optimized for DNN inference so that they can act as NPUs [42,77].…”

Section: Real System Measurement Infrastructure (mentioning)

confidence: 99%

“…To address these performance and energy efficiency challenges, modern mobile devices employ more and more accelerators and/or co-processors, such as Graphic Processing Units (GPU), Digital Signal Processors (DSPs), and Neural Processing Units (NPUs) [10,42], scaling up the overall system performance. Furthermore, the mobile system stack support for DNNs has become more mature, allowing DNN inference to leverage the computation and energy efficiency advantages provided by the co-processors.…”

Section: Introduction (mentioning)

confidence: 99%

“…Furthermore, the mobile system stack support for DNNs has become more mature, allowing DNN inference to leverage the computation and energy efficiency advantages provided by the co-processors. For example, modern deep learning compiler and programming stacks, such as TVM [10], SNPE [77], and Android NN API [2,42], enable inference execution on a diverse set of hardware back-ends.…”

Section: Introduction (mentioning)

confidence: 99%


AutoScale: Optimizing Energy Efficiency of End-to-End Edge Inference under Stochastic Variance

Kim,

2020

Preprint


Deep learning inference is increasingly run at the edge. As the programming and system stack support becomes mature, it enables acceleration opportunities within a mobile system, where the system performance envelope is scaled up with a plethora of programmable co-processors. Thus, intelligent services designed for mobile users can choose between running inference on the CPU or any of the co-processors on the mobile system, or exploiting connected systems, such as the cloud or a nearby, locally connected system. By doing so, the services can scale out the performance and increase the energy efficiency of edge mobile systems. This gives rise to a new challenge: deciding when inference should run where. Such execution scaling decisions become more complicated with the stochastic nature of mobile-cloud execution, where signal strength variations of the wireless networks and resource interference can significantly affect real-time inference performance and system energy efficiency. To enable accurate, energy-efficient deep learning inference at the edge, this paper proposes AutoScale. AutoScale is an adaptive and lightweight execution scaling engine built upon a custom-designed reinforcement learning algorithm. It continuously learns and selects the most energy-efficient inference execution target by taking into account characteristics of neural networks and available systems in the collaborative cloud-edge execution environment while adapting to the stochastic runtime variance. Real system implementation and evaluation, considering realistic execution scenarios, demonstrate an average of 9.8 and 1.6 times energy efficiency improvement for DNN edge inference over the baseline mobile CPU and cloud offloading, respectively, while meeting the real-time performance and accuracy requirements.
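The idea of learning which execution target is most energy-efficient can be illustrated with a small sketch. This is not AutoScale's algorithm, which the abstract does not specify; it is an epsilon-greedy tabular value learner (a one-step, contextual-bandit simplification of Q-learning). The target names, the two signal-strength states, the energy and latency numbers, and the latency limit are all made-up assumptions for illustration, not measured data.

```python
import random

# Hypothetical execution targets and runtime states (assumptions, not from the paper).
TARGETS = ["cpu", "gpu", "dsp", "cloud"]
STATES = ["weak_signal", "strong_signal"]

# Made-up (energy in mJ, latency in ms) for each (state, target) pair.
SIM = {
    ("weak_signal", "cpu"): (90.0, 40.0),
    ("weak_signal", "gpu"): (50.0, 25.0),
    ("weak_signal", "dsp"): (30.0, 20.0),
    ("weak_signal", "cloud"): (120.0, 200.0),
    ("strong_signal", "cpu"): (90.0, 40.0),
    ("strong_signal", "gpu"): (50.0, 25.0),
    ("strong_signal", "dsp"): (30.0, 20.0),
    ("strong_signal", "cloud"): (15.0, 35.0),
}
LATENCY_TARGET_MS = 50.0  # hypothetical real-time requirement


def reward(state, target, rng):
    """Negative noisy energy, with a large penalty when the latency limit is missed."""
    energy, latency = SIM[(state, target)]
    energy *= rng.uniform(0.9, 1.1)  # stand-in for stochastic runtime variance
    penalty = 1000.0 if latency > LATENCY_TARGET_MS else 0.0
    return -(energy + penalty)


def train(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Learn one value per (state, target) with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, t): 0.0 for s in STATES for t in TARGETS}
    for _ in range(episodes):
        state = rng.choice(STATES)
        if rng.random() < epsilon:
            target = rng.choice(TARGETS)  # explore
        else:
            target = max(TARGETS, key=lambda t: q[(state, t)])  # exploit
        r = reward(state, target, rng)
        # One-step update: move the estimate toward the observed reward.
        q[(state, target)] += alpha * (r - q[(state, target)])
    return q


def best_target(q, state):
    """Greedy choice: the target with the highest learned value in this state."""
    return max(TARGETS, key=lambda t: q[(state, t)])


if __name__ == "__main__":
    q = train()
    for s in STATES:
        print(s, "->", best_target(q, s))
```

Under these invented numbers, the learner should come to prefer the on-device DSP when the signal is weak and cloud offloading when it is strong, which mirrors the kind of state-dependent decision the abstract describes.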


“…The common approach for running the current AI-based mobile applications is to send the data from the mobile device (front-end, or edge) to the cloud (back-end), run AI models in the cloud, and send the results back. New generations of mobile devices are capable of running deep neural models on board [1], however, this presents a great challenge due to limited power and computing resources available. In [2], it is shown that in many cases, the optimal strategy in terms of energy consumption and computation latency is to split the deep model and distribute the computation between the front-end and the back-end.…”

Section: Introduction (mentioning)

confidence: 99%

Multi-Task Learning with Compressible Features for Collaborative Intelligence

Alvar, Bajić

2019

2019 IEEE International Conference on Image Processing (ICIP)


A promising way to deploy Artificial Intelligence (AI)-based services on mobile devices is to run a part of the AI model (a deep neural network) on the mobile itself, and the rest in the cloud. This is sometimes referred to as collaborative intelligence. In this framework, intermediate features from the deep network need to be transmitted to the cloud for further processing. We study the case where such features are used for multiple purposes in the cloud (multi-tasking) and where they need to be compressible in order to allow efficient transmission to the cloud. To this end, we introduce a new loss function that encourages feature compressibility while improving system performance on multiple tasks. Experimental results show that with the compression-friendly loss, one can achieve around 20% bitrate reduction without sacrificing the performance on several vision-related tasks.


“…We have witnessed the wide spread of deep neural networks and their applications in the last decade. With ever growing computing power of embedded or edge devices, neural networks are being adopted to such devices, further assisted by AI accelerators [2,10,24,25,27,38,44,48]. In mobile phone industry, this trend has already become obvious enough to be adopted by major manufacturers including Samsung and Apple [2,44].…”

Section: Introduction (mentioning)

confidence: 99%

NNStreamer: Stream Processing Paradigm for Neural Networks, Toward Efficient Development and Execution of On-Device AI Applications

Ham, Moon, Lim, et al.

2019

Preprint


We propose nnstreamer, a software system that handles neural networks as filters of stream pipelines, applying the stream processing paradigm to neural network applications. A new trend accompanying the widespread adoption of deep neural network applications is on-device AI, i.e., processing neural networks directly on mobile devices or edge/IoT devices instead of cloud servers. Emerging privacy issues, data transmission costs, and operational costs signify the need for on-device AI, especially when a huge number of devices with real-time data processing are deployed. Nnstreamer efficiently handles neural networks with complex data stream pipelines on devices, improving the overall performance significantly with minimal effort. In addition, nnstreamer simplifies neural network pipeline implementations and allows reusing off-the-shelf multimedia stream filters directly; thus it reduces development costs significantly. Nnstreamer is already being deployed with a product releasing soon and is open source software applicable to a wide range of hardware architectures and software platforms.

