Most research on deep neural networks (DNNs) has so far focused on achieving higher accuracy by building ever larger and deeper architectures. Training and evaluating these models is only feasible when large amounts of resources, such as processing power and memory, are available. However, many applications that could benefit from these models run on resource-constrained devices. Mobile devices such as smartphones already use deep learning techniques, but they often have to offload all processing to a remote cloud. We propose a new architecture, called a Cascading network, that is capable of distributing a deep neural network between a local device and the cloud while keeping the required network traffic to a minimum. The network begins processing on the constrained device and only relies on the remote part when the local part does not provide a sufficiently accurate result. The Cascading network thus allows for an early-stopping mechanism during the recall phase of the network. We evaluated our approach in an Internet of Things (IoT) context, where a deep neural network adds intelligence to a large number of heterogeneous connected devices. This technique enables a wide variety of autonomous systems in which sensors, actuators, and computing nodes can work together. We show that the Cascading architecture allows for a substantial improvement in evaluation speed on constrained devices while keeping the loss in accuracy to a minimum.
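The abstract does not specify the exact early-stopping criterion; a common choice is to threshold the local model's top softmax probability. The following is a minimal sketch of that idea, assuming hypothetical `local_model` and `remote_model` callables that return class logits; the `threshold` value is an illustrative assumption, not a detail from the paper:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def cascade_infer(x, local_model, remote_model, threshold=0.9):
    """Run the cheap on-device part first; only fall back to the
    remote (cloud) part when the local prediction is not confident."""
    probs = softmax(local_model(x))        # fast forward pass on the device
    if probs.max() >= threshold:           # early stop: confident enough
        return int(np.argmax(probs)), "local"
    probs = softmax(remote_model(x))       # costly forward pass in the cloud
    return int(np.argmax(probs)), "remote"
```

In such a scheme the threshold trades speed for accuracy: a higher threshold sends more inputs to the cloud, recovering accuracy at the cost of latency and network traffic.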
Deep neural networks are nowadays widely used to accurately classify input data. An interesting application area is the Internet of Things (IoT), where massive amounts of sensor data have to be classified. The processing power of the cloud is attractive, but its variable latency is a major drawback for neural network applications. To exploit the trade-off between the limited embedded computing power of IoT devices, which offers high speed and stable latency, and the seemingly unlimited computing power of the cloud, which comes at the cost of higher and more variable latency, we propose a Big-Little architecture for deep neural networks. A small neural network trained on a subset of prioritized output classes can be used to calculate an output on the embedded device, while a more specific classification is calculated in the cloud only when required. We show the applicability of this concept in the IoT domain by evaluating our approach on state-of-the-art neural network classification problems running on popular embedded devices such as the Raspberry Pi and the Intel Edison.
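A minimal sketch of the Big-Little split described above, assuming hypothetical `little_model` and `big_model` callables; the class names, threshold, and the convention of a trailing "other" output on the little network are illustrative assumptions, not details from the abstract:

```python
import numpy as np

# Hypothetical prioritized subset; the little network is assumed to have
# one output per priority class plus a final "other" output whose
# selection triggers offloading to the cloud.
PRIORITY_CLASSES = ["alarm", "person", "vehicle"]

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def big_little_classify(x, little_model, big_model, threshold=0.8):
    """Answer on-device when the little network confidently recognizes
    a prioritized class; defer to the big cloud network otherwise."""
    probs = softmax(little_model(x))
    top = int(np.argmax(probs))
    if top < len(PRIORITY_CLASSES) and probs[top] >= threshold:
        return PRIORITY_CLASSES[top]   # handled locally: fast, stable latency
    return big_model(x)                # full, fine-grained cloud classification
```

The design choice here mirrors the abstract: the embedded device answers quickly and deterministically for the classes it was prioritized on, and the higher, variable cloud latency is only paid for inputs outside that subset.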