Cell-free massive multiple-input multiple-output (CF-mMIMO) is considered one of the potential technologies for beyond-5G and 6G networks to meet the demand for higher data capacity and a uniform service rate for user equipment. However, because pilot resources are limited, several users must reuse the same pilot signals. This causes the so-called pilot contamination problem, which can prevent CF-mMIMO from unlocking its full performance potential. It is challenging to employ classical pilot assignment (PA) methods to serve many users simultaneously with low complexity; therefore, a scalable and distributed PA scheme is required. In this paper, we take a learning-based approach to the pilot contamination problem: we formulate PA as a multi-agent static game, develop a two-level hierarchical learning algorithm to mitigate the effects of pilot contamination, and present an efficient yet scalable PA strategy. We first model the PA problem as a static multi-agent game with P teams (agents), in which each team is represented by a specific pilot. We then define a multi-agent structure that can automatically determine the most appropriate PA policy in a distributed manner. The numerical results demonstrate that the proposed PA algorithm outperforms previous suboptimal algorithms in terms of the per-user spectral efficiency (SE). In particular, the proposed approach increases the average SE and the 95%-likely SE by approximately 2.2% and 3.3%, respectively, compared to the best state-of-the-art solution.
INDEX TERMS Cell-free massive MIMO, deep reinforcement learning, pilot assignment, pilot contamination, spectral efficiency.
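The two figures of merit quoted in this abstract, the average SE and the 95%-likely SE, can be computed from per-user SE samples roughly as in the minimal Python sketch below. It assumes the common reading of the 95%-likely SE as the 5th percentile of the empirical per-user SE distribution (the value that 95% of users reach or exceed). The function names, the interpolation rule, and the sample values are hypothetical and are not taken from the paper.

```python
import math


def average_se(se_values):
    """Arithmetic mean of the per-user spectral efficiencies (bit/s/Hz)."""
    return sum(se_values) / len(se_values)


def likely_se(se_values, likelihood=0.95):
    """SE reached by a fraction `likelihood` of users.

    Assumption: this is the (1 - likelihood) quantile of the empirical
    per-user SE distribution, with linear interpolation between
    neighbouring order statistics.
    """
    ordered = sorted(se_values)
    position = (1.0 - likelihood) * (len(ordered) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical per-user SE samples, for illustration only.
samples = [float(i) for i in range(1, 22)]
mean_value = average_se(samples)
tail_value = likely_se(samples, likelihood=0.95)
```

Comparing two PA schemes then amounts to evaluating both functions on the SE samples each scheme produces and reporting the relative gain.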
A cell-free massive multiple-input multiple-output (MIMO) uplink is investigated in this paper. We address a power allocation design problem that considers two conflicting metrics, namely the sum rate and fairness. Different weights are allocated to the sum rate and fairness of the system, based on the requirements of the mobile operator. Knowledge of the channel statistics is exploited to optimize the power allocation. We propose to employ large-scale fading (LSF) coefficients as the input of a twin delayed deep deterministic policy gradient (TD3) algorithm. This enables us to solve the non-convex sum rate fairness trade-off optimization problem efficiently. Then, we exploit a use-and-then-forget (UatF) technique, which provides a closed-form expression for the achievable rate. The sum rate fairness trade-off optimization problem is subsequently solved through a sequential convex approximation (SCA) technique. Numerical results demonstrate that the proposed algorithms outperform conventional power control algorithms in terms of both the sum rate and the minimum user rate. Furthermore, the TD3-based approach can increase the median of the sum rate by 16%-46% and the median of the minimum user rate by 11%-60% compared to the proposed SCA-based technique. Finally, we investigate the complexity and convergence of the proposed scheme.
Index Terms: Cell-free massive MIMO, deep reinforcement learning, fairness, power control, sequential convex approximation.
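The weighted trade-off between sum rate and fairness described in this abstract can be illustrated with the short Python sketch below. The abstract does not state the exact objective, so the form used here is an assumption: a convex combination of the sum rate and the minimum user rate, with rates taken as log2(1 + SINR). The function names and SINR values are hypothetical.

```python
import math


def user_rates(sinrs):
    """Per-user rate in bit/s/Hz, assuming the form log2(1 + SINR)."""
    return [math.log2(1.0 + sinr) for sinr in sinrs]


def tradeoff_objective(sinrs, weight):
    """Assumed objective: weight * sum rate + (1 - weight) * minimum rate.

    weight = 1 favours the sum rate only; weight = 0 favours fairness only
    (here represented by the minimum user rate, which is an assumption).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    rates = user_rates(sinrs)
    return weight * sum(rates) + (1.0 - weight) * min(rates)


# Hypothetical linear-scale SINR values for three users.
example_sinrs = [1.0, 3.0, 7.0]
balanced = tradeoff_objective(example_sinrs, weight=0.5)
```

In a power control loop, the SINRs would depend on the transmit powers, and an optimizer (TD3 or SCA in the abstract) would search for the powers that maximize this kind of objective.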
The uplink of a cell-free massive multiple-input multiple-output (MIMO) system with maximum-ratio combining (MRC) and zero-forcing (ZF) schemes is investigated. A power allocation optimization problem is considered, where two conflicting metrics, namely the sum rate and fairness, are jointly optimized. As there is no closed-form expression for the achievable rate in terms of the large-scale fading (LSF) components, the sum rate fairness trade-off optimization problem cannot be solved by using known convex optimization methods. To address this problem, we propose two new approaches. In the first approach, a use-and-then-forget scheme is utilized to derive a closed-form expression for the achievable rate. Then, the fairness optimization problem is iteratively solved through the proposed sequential convex approximation (SCA) scheme. In the second approach, we exploit LSF coefficients as inputs of a twin delayed deep deterministic policy gradient (TD3) algorithm, which efficiently solves the non-convex sum rate fairness trade-off optimization problem. Next, the complexity and convergence properties of the proposed schemes are analyzed. Numerical results demonstrate the superiority of the proposed approaches over conventional power control algorithms in terms of the sum rate and minimum user rate for both the ZF and MRC receivers. Moreover, the proposed TD3-based power control achieves better performance than the proposed SCA-based approach as well as the fractional power scheme.
Index Terms: Cell-free massive MIMO, deep reinforcement learning, fairness, power control, sequential convex approximation.
I. Introduction
The rapid development of mobile communication networks and the growing number of supported devices have imposed increasing demands for much higher data rates in mobile communication. Cell-free massive multiple-input multiple-output (mMIMO) is a key enabling wireless network technology as it greatly increases coverage probability and the data rate [2]. In cell-free mMIMO, a large number of