Model-based power allocation algorithms have been investigated for decades, but they require the underlying mathematical models to be analytically tractable and usually incur high computational complexity. Recently, data-driven, model-free machine learning approaches have been rapidly developed to obtain near-optimal performance with affordable computational complexity, and deep reinforcement learning (DRL) is regarded as holding great potential for future intelligent networks. In this paper, DRL approaches are considered for power control in multi-user wireless cellular networks. Accounting for cross-cell cooperation, off-line/on-line centralized training, and distributed execution, we present a mathematical analysis for the DRL-based top-level design. The concrete DRL design is then developed on this foundation, and the policy-based REINFORCE, value-based deep Q-learning (DQL), and actor-critic deep deterministic policy gradient (DDPG) algorithms are proposed. Simulation results show that the proposed data-driven approaches outperform state-of-the-art model-based methods in sum-rate performance, with good generalization power and faster processing speed. Furthermore, the proposed DDPG outperforms REINFORCE and DQL in terms of both sum-rate performance and robustness, and it can be incorporated into existing resource allocation schemes thanks to its generality.

Index Terms—Deep reinforcement learning, deep deterministic policy gradient, policy-based, interfering multiple-access channel, power control, resource allocation.

I. INTRODUCTION

Wireless data transmission has experienced tremendous growth in past years and will continue to grow in the future. When large numbers of terminals such as mobile phones and wearable devices are connected to the networks, the density of access points (APs) will have to be increased. Dense deployment of small cells, such as pico-cells and femto-cells, has become the most effective solution to accommodate the critical demand for spectrum [1]. With denser APs and smaller cells, the whole network is flooded with wireless signals, and thus the intra-cell and inter-cell interference problems are severe [2]. Therefore, power allocation and interference management are crucial and challenging [3], [4].

Many model-oriented algorithms have been developed to cope with interference management [5]-[9], and the existing studies mainly focus on sub-optimal or heuristic algorithms, whose performance gaps to the optimal solution are typically difficult to quantify. Besides, the mathematical models are usually assumed to be analytically tractable, but these models are not always accurate, because both hardware and channel imperfections can exist in practical communication environments. When specific hardware components and realistic transmission scenarios are considered, such as low-resolution A/D converters, nonlinear amplifiers, and user distributions, signal processing techniques built on model-driven tools are difficult to develop. Moreover, the computational complexity of these algorithms i...
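Since the abstract singles out DDPG for continuous power control under distributed execution, a minimal sketch of the generic DDPG update (deterministic actor, Q-value critic, soft target updates) may help make the actor-critic structure concrete. The state features, layer sizes, and power scaling below are illustrative assumptions, not the paper's exact design.

```python
# Minimal DDPG sketch for continuous power control (illustrative; the
# state features, layer sizes, and reward are assumptions, not the
# paper's exact design).
import copy
import torch
import torch.nn as nn

STATE_DIM, P_MAX = 8, 1.0  # e.g., local CSI/interference features; max transmit power

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, 1), nn.Sigmoid())         # power in (0, 1) * P_MAX
critic = nn.Sequential(nn.Linear(STATE_DIM + 1, 64), nn.ReLU(),
                       nn.Linear(64, 1))                      # Q(s, a)
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)  # target networks
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
GAMMA, TAU = 0.99, 0.005

def ddpg_step(s, a, r, s2):
    """One DDPG update from a replay-buffer minibatch (s, a, r, s2)."""
    # Critic: regress Q(s, a) toward the bootstrapped target.
    with torch.no_grad():
        a2 = actor_t(s2) * P_MAX
        y = r + GAMMA * critic_t(torch.cat([s2, a2], dim=1))
    q = critic(torch.cat([s, a], dim=1))
    critic_loss = nn.functional.mse_loss(q, y)
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    # Actor: ascend the critic's value of the actor's own action.
    actor_loss = -critic(torch.cat([s, actor(s) * P_MAX], dim=1)).mean()
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    # Soft (Polyak) update of the target networks.
    for net, tgt in ((actor, actor_t), (critic, critic_t)):
        for p, pt in zip(net.parameters(), tgt.parameters()):
            pt.data.mul_(1 - TAU).add_(TAU * p.data)
```

In this setting the reward r would be a measured (weighted) sum-rate after applying the chosen powers, and exploration noise would be added to the actor's output when interacting with the environment.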
The problem of beam alignment and tracking in high-mobility scenarios such as high-speed railway (HSR) becomes extremely challenging, since estimating the fast time-varying channel introduces a large overhead cost and significant time delay. To tackle this challenge, we propose a learning-aided beam prediction scheme for HSR networks, which predicts the beam directions and channel amplitudes over a period of future time with fine time granularity, using a group of observations. Concretely, we transform the high-dimensional beam prediction problem into a two-stage task, i.e., a low-dimensional parameter estimation followed by a cascaded hybrid beamforming operation. In the first stage, the location and speed of a terminal are estimated by the maximum likelihood criterion, and a data-driven data fusion module is designed to improve the final estimation accuracy and robustness. Then, the probable future beam directions and channel amplitudes are predicted based on HSR scenario priors, including the deterministic trajectory, motion model, and channel model. Furthermore, we incorporate a learnable non-linear mapping module into the overall beam prediction to accommodate non-linear tracks. Both of the proposed learnable modules are model-based and offer good interpretability. Compared with existing beam management schemes, the proposed beam prediction incurs (near) zero overhead cost and time delay. Simulation results verify the effectiveness of the proposed scheme.
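To make the two-stage idea concrete, the sketch below fits a terminal's initial position and speed from noisy observations (stage 1), then extrapolates along a known straight track and converts the predicted positions into beam steering angles toward the base station (stage 2). The geometry, the Gaussian noise model (under which the ML estimate reduces to least squares), and the toy amplitude model are illustrative assumptions, not the paper's estimator.

```python
# Illustrative two-stage beam prediction along a known straight track
# (assumed geometry and noise model; not the paper's exact estimator).
import numpy as np

BS_POS = np.array([0.0, 50.0])  # base station coordinates (m), assumed
SIGMA = 2.0                     # std of position-observation noise (m), assumed

def ml_estimate(obs_x, t_obs):
    """Stage 1: ML fit of initial position x0 and speed v from noisy
    1-D track-coordinate observations (Gaussian noise => least squares)."""
    A = np.stack([np.ones_like(t_obs), t_obs], axis=1)  # x(t) = x0 + v * t
    (x0, v), *_ = np.linalg.lstsq(A, obs_x, rcond=None)
    return x0, v

def predict_beams(x0, v, t_future):
    """Stage 2: extrapolate along the track and steer toward the BS."""
    x = x0 + v * t_future                      # predicted track coordinate
    pos = np.stack([x, np.zeros_like(x)], 1)   # train moves along y = 0
    d = BS_POS - pos
    angles = np.arctan2(d[:, 1], d[:, 0])      # future beam directions (rad)
    amps = 1.0 / np.linalg.norm(d, axis=1)     # amplitude ~ 1/distance (toy model)
    return angles, amps

# Example: 5 past observations -> beams for the next 2 s at 100 ms granularity.
t_obs = np.arange(5) * 0.1
obs = 10.0 + 80.0 * t_obs + SIGMA * np.random.randn(5)  # true x0 = 10 m, v = 80 m/s
x0_hat, v_hat = ml_estimate(obs, t_obs)
angles, amps = predict_beams(x0_hat, v_hat, np.arange(0.5, 2.5, 0.1))
```

The paper's learnable data fusion and non-linear mapping modules would replace, respectively, the plain least-squares fit and the straight-line extrapolation above.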