Advances in mobile communication capabilities open the door for closer integration of pre-hospital and in-hospital care processes. For example, medical specialists can guide on-site paramedics and, in turn, be supplied with live vitals or visuals. Combining such performance-critical applications with the highly complex workings of mobile communications requires solutions that are reliable and efficient, yet easy to integrate with existing systems. This paper explores the application of Deep Deterministic Policy Gradient (DDPG) methods for learning a communications resource scheduling algorithm with special regard to priority users. Unlike the popular Deep Q-Network (DQN) methods, DDPG produces continuous-valued output. With light post-processing, the resulting scheduler achieves high performance on a flexible sum-utility goal.
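As a hedged illustration of the scheme described in this abstract, the sketch below shows a DDPG-style actor emitting continuous per-user scores that light post-processing converts into a discrete allocation of resource blocks. The network sizes, the `allocate` helper, and all parameter names are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (assumptions, not the paper's code): a DDPG-style actor
# produces continuous per-user scores; simple post-processing maps them to a
# discrete number of resource blocks per user.
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, state_dim: int, n_users: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_users), nn.Sigmoid(),  # continuous scores in (0, 1)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def allocate(scores: torch.Tensor, n_resources: int) -> torch.Tensor:
    """Light post-processing: assign resource blocks roughly proportionally
    to the continuous scores, then hand leftovers to the top-scoring users."""
    weights = scores / scores.sum()
    blocks = torch.floor(weights * n_resources).long()
    remainder = n_resources - int(blocks.sum())
    if remainder > 0:
        top = torch.topk(scores, remainder).indices
        blocks[top] += 1
    return blocks

actor = Actor(state_dim=16, n_users=4)
state = torch.rand(16)
print(allocate(actor(state), n_resources=10))  # e.g. tensor([3, 2, 3, 2])
```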
With the increasing complexity of modern communication systems, Machine Learning (ML) algorithms have become a focal point of research. However, performance demands have tightened in parallel. For some of the key applications targeted by future wireless systems, such as the medical field, strict and reliable performance guarantees are essential, yet vanilla ML methods have been shown to struggle with these requirements. This raises the question of whether such methods can be extended to better deal with the demands imposed by these applications. In this paper, we look at a combinatorial Resource Allocation (RA) challenge with rare, significant events that must be handled properly. We propose to treat this as a multi-task learning problem, select two methods from this domain, Elastic Weight Consolidation (EWC) and Gradient Episodic Memory (GEM), and integrate them into a vanilla actor-critic scheduler. We compare their performance in dealing with Black Swan Events against the state-of-the-art approach of augmenting the training data distribution and find that the multi-task approach is highly effective.
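As an illustrative, hedged sketch (not the authors' implementation), the single-constraint form of the GEM idea referenced above can be written as a gradient projection: whenever the current update would increase the loss on stored priority-event episodes, the gradient is projected back onto the feasible half-space. The helper names in the usage comment are assumptions.

```python
# Hedged sketch of a single-constraint GEM-style projection (illustrative only).
# grad: flattened gradient of the scheduler loss on the current batch.
# memory_grad: flattened gradient on a replay memory of rare priority events.
import torch

def gem_project(grad: torch.Tensor, memory_grad: torch.Tensor) -> torch.Tensor:
    dot = torch.dot(grad, memory_grad)
    if dot < 0:  # update would increase the loss on the stored episodes
        grad = grad - (dot / memory_grad.pow(2).sum()) * memory_grad
    return grad

# Usage per training step (assumed helper names, not a real API):
# g = flatten_gradients(actor)                # gradient on the nominal batch
# g_mem = flatten_gradients_on_memory(actor)  # gradient on priority-event memory
# write_back_gradients(actor, gem_project(g, g_mem))
```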
Questions remain on the robustness of data-driven learning methods when crossing the gap from simulation to reality. We utilize weight anchoring, a method known from continual learning, to cultivate and preserve desired behavior in Neural Networks. Weight anchoring can be used to find a solution to a learning problem that lies near the solution of another learning problem. Thereby, learning can be carried out in optimal environments without neglecting or unlearning desired behavior. We demonstrate this approach using the example of learning mixed QoS-efficient discrete resource scheduling with infrequent priority messages. Results show that this method provides performance comparable to the state-of-the-art approach of augmenting the simulation environment, alongside significantly increased robustness and steerability.
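A minimal sketch of the weight-anchoring idea follows, under the assumption that it is realized as a quadratic pull toward a previously learned solution; the parameter names and the penalty strength are illustrative and not the paper's exact formulation, which may additionally weight parameters by importance.

```python
# Hedged sketch: weight anchoring as a quadratic pull toward an anchored
# solution (illustrative assumption, not the authors' exact formulation).
import torch

def make_anchor(model: torch.nn.Module) -> dict:
    """Snapshot the weights of the network trained on the priority-message task."""
    return {name: p.detach().clone() for name, p in model.named_parameters()}

def anchoring_penalty(model: torch.nn.Module, anchor: dict,
                      strength: float = 0.1) -> torch.Tensor:
    """Quadratic distance to the anchored weights, added to the scheduling loss."""
    penalty = torch.zeros(())
    for name, p in model.named_parameters():
        penalty = penalty + ((p - anchor[name]) ** 2).sum()
    return strength * penalty

# Training in the nominal environment then minimizes, e.g.:
# total_loss = scheduling_loss + anchoring_penalty(actor, anchor)
```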