2020
DOI: 10.48550/arxiv.2007.01793
Preprint

CacheNet: A Model Caching Framework for Deep Learning Inference on the Edge

Abstract: The success of deep neural networks (DNN) in machine perception applications such as image classification and speech recognition comes at the cost of high computation and storage complexity. Inference with uncompressed, large-scale DNN models can only run in the cloud, with extra communication latency back and forth between the cloud and end devices, while compressed DNN models achieve real-time inference on end devices at the price of lower predictive accuracy. In order to have the best of both worlds (latency and ac…
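
The abstract describes a split in which compressed models are cached on end devices while the uncompressed, large-scale model stays in the cloud. Below is a minimal, illustrative sketch of that split, assuming a confidence-threshold fallback policy; the function names, threshold value, and stand-in models are placeholders and are not taken from the paper.

import random

CONFIDENCE_THRESHOLD = 0.8  # illustrative threshold, not from the paper

def compressed_model_on_device(x):
    # Stand-in for a small, compressed DNN cached on the end device:
    # fast, but with lower predictive accuracy.
    label = hash(x) % 10
    confidence = random.uniform(0.5, 1.0)
    return label, confidence

def full_model_in_cloud(x):
    # Stand-in for the uncompressed, large-scale DNN served remotely:
    # accurate, but adds round-trip communication latency.
    return hash(x) % 10

def infer(x):
    # Try the cached on-device model first; fall back to the cloud
    # only when the device model is not confident enough.
    label, confidence = compressed_model_on_device(x)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label, "device"
    return full_model_in_cloud(x), "cloud"

if __name__ == "__main__":
    for sample in ["frame-0", "frame-1", "frame-2"]:
        print(sample, infer(sample))

The design point is that most requests are answered locally by the cached compressed model, and only low-confidence inputs pay the round trip to the remote model.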

Cited by 5 publications (6 citation statements). References 19 publications.
“…To achieve the same accuracy level of 90% with traditional training, we had to increase the resolution and scaling, resulting in a configuration of NN⟨r=1, s=2⟩ that was 6.6× bigger (35.6 MB). Furthermore, compared to image classification tasks based on CIFAR-10 deployed on the target device as described in [11] and [12], our approach achieved approximately 1.6× and 1.9× better frame rate, respectively, with comparable (90% vs. 93% for [12]) or better accuracy (90% vs. 82.9% for [11]).…”
Section: Multi-objective Solutions (mentioning)
confidence: 96%
“…Kumar [21] observed that caching intermediate layer outputs can help avoid running all the layers of a DNN for a sizeable fraction of inference requests [22]. They proposed approximate caching (in different domains [23]) at each intermediate layer.…”
Section: Related Work (mentioning)
confidence: 99%
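
The excerpt above describes caching intermediate layer outputs so that the remaining layers can be skipped for a sizeable fraction of requests. Below is a minimal sketch of that idea, assuming a coarse quantization of an intermediate activation serves as the approximate cache key; the quantization step, stand-in layers, and lookup policy are illustrative and not taken from the cited work.

import numpy as np

cache = {}  # quantized-activation key -> cached final prediction

def early_layers(x):
    # Stand-in for the first few DNN layers.
    return np.tanh(x @ np.ones((4, 8)) * 0.1)

def remaining_layers(h):
    # Stand-in for the (expensive) remaining layers.
    return int(np.argmax(h.sum(axis=0)))

def quantize(h, step=0.25):
    # Coarse key: similar activations collide, enabling approximate hits.
    return tuple(np.round(h.ravel() / step).astype(int))

def infer(x):
    h = early_layers(x)
    key = quantize(h)
    if key in cache:                 # approximate hit: skip the later layers
        return cache[key], "cache-hit"
    y = remaining_layers(h)          # miss: run the rest of the network
    cache[key] = y
    return y, "cache-miss"

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 4))
    print(infer(x))          # first request: miss
    print(infer(x + 1e-3))   # near-duplicate request: likely a hit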
“…For device-level caching, researchers investigate the spatio-temporal locality of users within an area to cache repeatedly requested computation results on the edge server [47]–[49], [130]–[133]. In addition, some works propose caching multiple deep learning models on the edge server for specialized missions to improve the quality of service [134]–[137]. Finally, applications based on computation caching, such as eye-gaze tracking [138] and voice assistants [139], [140], are introduced.…”
Section: A. Edge Caching (mentioning)
confidence: 99%
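
One line of work mentioned above caches multiple deep learning models on an edge server for specialized missions. A toy sketch of such a model cache with LRU eviction follows; the capacity, eviction policy, and loading stub are assumptions for illustration, not details from the cited surveys.

from collections import OrderedDict

class EdgeModelCache:
    # Toy LRU cache of loaded models on an edge server (illustrative only).

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._models = OrderedDict()  # model name -> loaded model object

    def _load_from_storage(self, name):
        # Stand-in for loading a specialized model from disk or a registry.
        return f"<weights of {name}>"

    def get(self, name):
        if name in self._models:                # hit: refresh recency
            self._models.move_to_end(name)
            return self._models[name], "hit"
        model = self._load_from_storage(name)   # miss: load, maybe evict
        self._models[name] = model
        if len(self._models) > self.capacity:
            self._models.popitem(last=False)    # evict least recently used
        return model, "miss"

if __name__ == "__main__":
    cache = EdgeModelCache(capacity=2)
    for request in ["detector", "classifier", "detector", "segmenter", "classifier"]:
        _, status = cache.get(request)
        print(request, status)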
“…The aforementioned solutions require users to upload data to the edge server for processing, which leads to relatively high latency (though much lower than cloud-based solutions). Fang et al. propose a caching scheme that jointly considers latency and accuracy [137]. A complex model is partitioned and distributed between edge devices and the cloud server.…”
Section: B. Computation Caching (mentioning)
confidence: 99%
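
The excerpt above describes partitioning a complex model between edge devices and the cloud server. The sketch below illustrates a simple head/tail split, where the device computes a compact intermediate feature and the server finishes the forward pass; the split point, weights, and transfer step are placeholders rather than the actual partitioning scheme of [137].

import numpy as np

rng = np.random.default_rng(1)
W_HEAD = rng.normal(size=(4, 6))   # device-side layers
W_TAIL = rng.normal(size=(6, 3))   # server-side layers

def device_head(x):
    # Runs on the end device: produces a compact intermediate feature.
    return np.maximum(x @ W_HEAD, 0.0)

def send_to_server(features):
    # Placeholder for uplink transfer of the intermediate feature only,
    # instead of the raw input.
    return features

def server_tail(features):
    # Runs on the edge/cloud server: finishes the forward pass.
    logits = features @ W_TAIL
    return int(np.argmax(logits, axis=1)[0])

if __name__ == "__main__":
    x = rng.normal(size=(1, 4))
    prediction = server_tail(send_to_server(device_head(x)))
    print("prediction:", prediction)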