Internet of Things (IoT) devices can apply mobile edge computing (MEC) and energy harvesting (EH) to provide satisfactory quality of experience for computation-intensive applications and to prolong battery lifetime. In this article, we investigate computation offloading for IoT devices with energy harvesting in wireless networks with multiple MEC devices, such as base stations and access points, each with different computation resources and radio communication capabilities. We propose a reinforcement learning based computation offloading framework in which an IoT device chooses the MEC device and determines the offloading rate according to its current battery level, the previous radio bandwidth to each MEC device, and the predicted amount of harvested energy. A "hotbooting" Q-learning based computation offloading scheme is proposed for an IoT device to achieve the optimal offloading performance without knowledge of the MEC model or the energy consumption and computation latency models. We also propose a fast deep Q-network (DQN) based offloading scheme, which combines deep learning and hotbooting techniques to accelerate the learning speed of Q-learning. We show that the proposed schemes can achieve the optimal offloading policy after a sufficiently long learning time, and we provide their performance bounds under two typical MEC scenarios. Simulations are performed for IoT devices that use wireless power transfer to capture ambient radio-frequency signals and charge their batteries. Simulation results show that, compared with the benchmark Q-learning based offloading, the fast DQN-based offloading scheme reduces the energy consumption, the computation delay, and the task drop ratio, and increases the utility of the IoT device in dynamic MEC.
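To illustrate the learning loop the abstract describes, the following is a minimal sketch of hotbooting Q-learning for offloading, with the state built from a quantized battery level, radio bandwidth, and predicted harvested energy, and the action being the pair (MEC device, offloading-rate level). All sizes, hyperparameters, the toy environment, and the reward signal are hypothetical placeholders rather than the paper's model; hotbooting is approximated here by seeding the Q-table from a previously trained one.

```python
import numpy as np

# Hypothetical problem sizes (illustrative only): 3 MEC devices,
# 4 offloading-rate levels, and coarse quantization of the state.
N_MEC, N_RATE = 3, 4
N_BATTERY, N_BANDWIDTH, N_ENERGY = 5, 4, 4
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # placeholder hyperparameters

# State: (battery level, quantized bandwidth, predicted harvested energy).
# Action: (which MEC device to offload to, offloading-rate level).
n_states = N_BATTERY * N_BANDWIDTH * N_ENERGY
n_actions = N_MEC * N_RATE

def hotboot_q_table(pretrained=None):
    """Hotbooting: initialize Q from experiences in similar scenarios
    instead of zeros, shortening the random exploration phase."""
    return pretrained.copy() if pretrained is not None else np.zeros((n_states, n_actions))

def choose_action(Q, s, rng):
    """Epsilon-greedy selection over (MEC device, offloading rate)."""
    if rng.random() < EPSILON:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))

def q_update(Q, s, a, reward, s_next):
    """Standard Q-learning update; the reward stands in for the device
    utility, e.g. penalizing energy use, latency, and dropped tasks."""
    Q[s, a] += ALPHA * (reward + GAMMA * np.max(Q[s_next]) - Q[s, a])

def toy_env_step(s, a, rng):
    """Toy stand-in for the MEC environment: random utility and next state."""
    return rng.normal(), int(rng.integers(n_states))

rng = np.random.default_rng(0)
Q = hotboot_q_table()  # pass a pretrained table here to "hotboot"
s = int(rng.integers(n_states))
for _ in range(10_000):
    a = choose_action(Q, s, rng)
    reward, s_next = toy_env_step(s, a, rng)
    q_update(Q, s, a, reward, s_next)
    s = s_next
```

The DQN variant in the abstract would replace the table `Q` with a neural network trained on replayed transitions, which scales to the larger state spaces where a table becomes impractical.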
Advanced Persistent Threat (APT) attackers apply multiple sophisticated methods to continuously and stealthily steal information from targeted cloud storage systems, and can even induce a storage system to apply a specific defense strategy and then attack it accordingly. In this paper, the interactions between an APT attacker and a defender, each allocating Central Processing Units (CPUs) over multiple storage devices in a cloud storage system, are formulated as a Colonel Blotto game. The Nash equilibria (NEs) of the CPU allocation game are derived for both symmetric and asymmetric CPU resources between the APT attacker and the defender, to evaluate how the limited CPU resources, the data storage size, and the number of storage devices impact the expected data protection level and the utility of the cloud storage system. A CPU allocation scheme based on "hotbooting" policy hill-climbing (PHC), which exploits experiences from similar scenarios to initialize the quality values and thus accelerate learning, is proposed for the defender to achieve the optimal APT defense performance in the dynamic game without being aware of the APT attack model or the data storage model. A hotbooting deep Q-network (DQN) based CPU allocation scheme further improves the APT detection performance for the case with a large number of CPUs and storage devices. Simulation results show that our proposed reinforcement learning based CPU allocation improves both the data protection level and the utility of the cloud storage system compared with Q-learning based CPU allocation against APTs.
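The following minimal sketch shows one way a hotbooting PHC defender could be structured: each action is a discrete split of the defender's CPUs over the storage devices, a Q-table is updated as in Q-learning, and a mixed policy is hill-climbed toward the currently greedy allocation. The sizes, the state definition, the reward, and the toy environment are hypothetical placeholders, not the paper's attack or storage model.

```python
import numpy as np
from itertools import product

# Hypothetical sizes (illustrative only): the defender splits C_CPUS CPUs
# across N_DEVICES storage devices; each feasible split is one action.
C_CPUS, N_DEVICES = 4, 3
DELTA, ALPHA, GAMMA = 0.05, 0.1, 0.9  # placeholder hyperparameters

# Enumerate all allocations of C_CPUS CPUs over N_DEVICES devices.
actions = [a for a in product(range(C_CPUS + 1), repeat=N_DEVICES) if sum(a) == C_CPUS]
n_actions = len(actions)
n_states = 8  # placeholder: quantized previous data-protection level

def hotboot(pretrained=None):
    """Hotbooting: reuse Q-values and a policy learned in similar
    scenarios; otherwise start from zeros and a uniform mixed policy."""
    if pretrained is not None:
        return pretrained[0].copy(), pretrained[1].copy()
    return (np.zeros((n_states, n_actions)),
            np.full((n_states, n_actions), 1.0 / n_actions))

def phc_step(Q, pi, s, a, reward, s_next):
    """Policy hill-climbing: Q-learning update, then nudge the mixed
    policy toward the currently greedy allocation by step DELTA."""
    Q[s, a] += ALPHA * (reward + GAMMA * np.max(Q[s_next]) - Q[s, a])
    greedy = int(np.argmax(Q[s]))
    pi[s] -= DELTA / (n_actions - 1)           # shrink all actions...
    pi[s, greedy] += DELTA + DELTA / (n_actions - 1)  # ...raise the greedy one
    pi[s] = np.clip(pi[s], 0.0, None)
    pi[s] /= pi[s].sum()                       # keep a valid distribution

def toy_env_step(s, a, rng):
    """Toy stand-in for the storage system under APT attack: the reward
    plays the role of the data-protection utility."""
    return rng.normal(), int(rng.integers(n_states))

rng = np.random.default_rng(1)
Q, pi = hotboot()  # pass (Q, pi) from a similar scenario to "hotboot"
s = int(rng.integers(n_states))
for _ in range(10_000):
    a = int(rng.choice(n_actions, p=pi[s]))  # sample a CPU split
    reward, s_next = toy_env_step(s, a, rng)
    phc_step(Q, pi, s, a, reward, s_next)
    s = s_next
```

A mixed (randomized) policy is the natural output here because, in a Colonel Blotto game, a deterministic allocation can be learned and exploited by the attacker, which is exactly the adaptive-attack behavior the abstract warns about.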