Breakthroughs from the field of deep learning are radically changing how sensor data are interpreted to extract the high-level information needed by mobile apps. It is critical that the gains in inference accuracy that deep models afford become embedded in future generations of mobile apps. In this work, we present the design and implementation of DeepX, a software accelerator for deep learning execution. DeepX significantly lowers the device resources (viz. memory, computation, energy) required by deep learning, which currently act as a severe bottleneck to mobile adoption. The foundation of DeepX is a pair of resource control algorithms, designed for the inference stage of deep learning, that: (1) decompose monolithic deep model network architectures into unit-blocks of various types that are then more efficiently executed by heterogeneous local device processors (e.g., GPUs, CPUs); and (2) perform principled resource scaling that adjusts the architecture of deep models to shape the overhead each unit-block introduces. Experiments show that DeepX can allow even large-scale deep learning models to execute efficiently on modern mobile processors and significantly outperform existing solutions, such as cloud-based offloading.
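The resource scaling described above builds on the general idea of compressing individual layers so they cost less memory and computation. The following is not the DeepX implementation, just a minimal numpy sketch of one such technique, truncated-SVD factorization of a dense layer; the layer sizes, rank, and synthetic weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Trained weight matrices are often approximately low-rank; this
# synthetic W (low-rank structure plus small noise) mimics that.
L = rng.standard_normal((1024, 64)).astype(np.float32)
R = rng.standard_normal((64, 1024)).astype(np.float32)
W = L @ R + 0.1 * rng.standard_normal((1024, 1024)).astype(np.float32)

def factorize_layer(W, rank):
    """Approximate W (m x n) as A (m x rank) @ B (rank x n) via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # fold singular values into the first factor
    B = Vt[:rank, :]
    return A, B

A, B = factorize_layer(W, rank=64)

x = rng.standard_normal(1024).astype(np.float32)
full = W @ x          # original layer: ~1.05M multiply-adds
approx = A @ (B @ x)  # factored layer: ~0.13M multiply-adds

print("params: %d -> %d" % (W.size, A.size + B.size))
print("relative error: %.4f" % (np.linalg.norm(full - approx) / np.linalg.norm(full)))
```

Replacing one matrix-vector product with two smaller ones is what makes the resulting unit-blocks cheap enough to schedule across constrained processors; the accuracy/efficiency trade is controlled by the chosen rank.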
Detecting and reacting to user behavior and ambient context are core elements of many emerging mobile sensing and Internet-of-Things (IoT) applications. However, extracting accurate inferences from raw sensor data is challenging within the noisy and complex environments where these systems are deployed. Deep learning is one of the most promising approaches for overcoming this challenge and achieving more robust and reliable inference. Techniques developed within this rapidly evolving area of machine learning are now state-of-the-art for many inference tasks (such as audio sensing and computer vision) commonly needed by IoT and wearable applications. However, deep learning algorithms are currently seldom used on mobile/IoT-class hardware because they often impose debilitating levels of system overhead (e.g., memory, computation and energy). Efforts to address this barrier to deep learning adoption are slowed by our lack of a systematic understanding of how these algorithms behave at inference time on resource-constrained hardware. In this paper, we present the first, albeit preliminary, measurement study of common deep learning models (such as Convolutional Neural Networks and Deep Neural Networks) on representative mobile and embedded platforms. The aim of this investigation is to begin to build knowledge of the performance characteristics, resource requirements, and execution bottlenecks of deep learning models when used to recognize categories of behavior and context. The results and insights of this study lay an empirical foundation for the development of optimization methods and execution environments that enable deep learning to be more readily integrated into next-generation IoT, smartphone and wearable systems.
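A measurement study of this kind typically profiles inference layer by layer to locate bottlenecks. The sketch below is not the study's instrumentation, only a toy illustration of the approach using a hand-built numpy network; the layer sizes and repetition count are assumptions, and a real measurement would time a trained model inside its framework on the target device.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
# Toy DNN with layer sizes loosely echoing a small audio-sensing model.
layers = [rng.standard_normal((n_in, n_out)).astype(np.float32)
          for n_in, n_out in [(640, 1024), (1024, 1024), (1024, 256)]]

def relu(v):
    return np.maximum(v, 0.0)

x = rng.standard_normal(640).astype(np.float32)
for i, W in enumerate(layers):
    t0 = time.perf_counter()
    for _ in range(100):              # repeat to get a stable estimate
        y = relu(x @ W)
    dt = (time.perf_counter() - t0) / 100
    print(f"layer {i}: {W.shape[0]}x{W.shape[1]}, "
          f"{W.nbytes / 1e6:.1f} MB weights, {dt * 1e6:.0f} us per pass")
    x = y
```

Per-layer timings and weight footprints like these are the raw material for identifying which stages dominate latency, memory, and (by extension) energy on constrained hardware.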
Deep learning is having a transformative effect on how sensor data are processed and interpreted. As a result, it is becoming increasingly feasible to build sensor-based computational models that are much more robust to real-world noise and complexity than previously possible. It is paramount that these innovations reach mobile and embedded devices that often rely on understanding and reacting to sensor data. However, deep models conventionally demand a level of system resources (e.g., memory and computation) that makes them problematic to run directly on constrained devices. In this work, we present the DeepX toolkit (DXTK): an open-source collection of software components for simplifying the execution of deep models on resource-sensitive platforms. DXTK contains a number of pre-trained low-resource deep models that users can quickly adopt and integrate for their particular application needs. It also offers a range of runtime options for executing deep models on a range of devices, including both Android and Linux variants. At the heart of DXTK, however, is a series of optimization techniques (viz. weight/sparse factorization, convolution separation, precision scaling, and parameter cleaning). Each technique offers a complementary approach to shaping system resource requirements, and is compatible with deep and convolutional neural networks. We hope that DXTK proves to be a valuable resource for the community, and accelerates the adoption and study of resource-constrained deep learning.
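Of the techniques listed, precision scaling is the simplest to illustrate. The sketch below is not DXTK's code, just a minimal numpy example of the underlying idea: storing weights as 8-bit integers with a per-tensor scale, cutting their memory footprint 4x versus float32 at a small accuracy cost. The matrix size and symmetric scaling scheme are assumptions for illustration.

```python
import numpy as np

def quantize(W):
    """Map float32 weights to int8 with a symmetric per-tensor scale."""
    scale = np.abs(W).max() / 127.0
    q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from int8 weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)).astype(np.float32)

q, scale = quantize(W)
W_hat = dequantize(q, scale)

print("memory: %d -> %d bytes" % (W.nbytes, q.nbytes))
print("max abs error: %.5f" % np.abs(W - W_hat).max())
```

The other listed techniques shape resources along different axes: factorization and convolution separation reduce multiply-adds, while parameter cleaning removes weights that contribute little to the output, so the techniques can be combined.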