As mobile devices become more prevalent, users tend to reassess their expectations regarding the personalization of mobile services. The data collected by a mobile device's sensors provide an opportunity to gain insight into the user's profile. Recently, deep learning has gained momentum and has become the method of choice for solving machine learning problems. Interestingly, training a deep neural network on a mobile device is often mistakenly regarded as cumbersome.For instance, several deep learning frameworks only provide a CPU-based implementation for prediction tasks on a mobile device. In contrast to servers, a mobile computing environment imposes many domain-specific constraints that invite us to review the general computing approach used in a deep learning framework implementation. In this paper, we propose a deep learning framework that has been specifically designed for mobile device platforms. Our approach relies on the collaboration of the multicore CPU and the integrated GPU to accelerate deep learning computation on mobile devices. Our work exploits the shared memory architecture of mobile devices to promote CPU-GPU collaboration without any data copying. We analyze our approach with regard to three factors: performance/portability trade-off, power efficiency, and memory management. KEYWORDS deep learning, energy efficient, GPGPU, heterogeneous system, mobile computing, OpenCL
INTRODUCTIONAs mobile devices get more sophisticated, the need for more optimization and personalization of mobile services has become a primary expectation of users. The latest mobile devices provide the opportunity to collect information about the user throughout the day via multiple mobile sensors, wearable sensors, and the Internet of things (IoT). In contrast, desktop and laptop computers are limited in the type of data that can be collected. For instance, even if laptops are suitable for mobile use, most do not have GPS sensors and are therefore not location aware. The aggregation of the collected data provides an opportunity to gain insight into mobile users' profiles and then personalize the mobile experience of a particular user.In recent years, deep learning has become the method of choice for solving complex representation learning problems. 1 The computational model comprises a set of processing layers, each of which successively transforms its input into a slightly more abstract representation than that in the previous layer. One fundamental aspect of deep learning is that each layer relies on a simple module that does not require any particular domain expertise for its design. This stack of processing layers attempts to produce a new set of input data that reduces irrelevant variations and amplifies discriminative information. Using a layer composition, deep learning methods can learn complex nonlinear functions by exploiting a general-purpose learning procedure. Deep learning is now regarded as the state-of-the-art method for solving many complex problems, such as image processing.The authors claim that deep le...