In this paper, we consider device-to-device enabled uplink cell-free communication between external users and the base station. By exploiting channel gain differences, external and cellular users are multiplexed in the transmission power domain and scheduled non-orthogonally on the same spectrum resources. Successive interference cancellation is then applied at the base station to decode the message signals. We introduce an effective deep reinforcement learning (DRL) scheme that optimises the worst-case user rate through dynamic power allocation for both external and cellular users. We also compare the performance of the DRL scheme under zero-forcing beamforming and conjugate beamforming. Simulation results verify the effectiveness of the DRL method in guaranteeing user fairness through worst-case rate maximisation.
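To make the worst-case rate objective concrete, the following is a minimal sketch of how per-user rates under successive interference cancellation could be evaluated for a given power allocation. It is not the system model or the DRL scheme of this paper: it assumes scalar channel gains with no beamforming, perfect cancellation, decoding in descending order of received power, and unit noise power by default. The function names `sic_uplink_rates` and `worst_case_rate` are hypothetical and introduced only for illustration.

```python
import math


def sic_uplink_rates(powers, gains, noise=1.0):
    """Per-user rates (bits/s/Hz) under idealised uplink SIC.

    Assumption: the receiver decodes users in descending order of received
    power and removes each decoded signal perfectly before the next step.
    """
    received = [p * g for p, g in zip(powers, gains)]
    order = sorted(range(len(received)), key=lambda i: received[i], reverse=True)
    rates = [0.0] * len(received)
    remaining = sum(received)
    for i in order:
        # Interference seen by user i comes only from users not yet decoded.
        remaining -= received[i]
        rates[i] = math.log2(1.0 + received[i] / (remaining + noise))
    return rates


def worst_case_rate(powers, gains, noise=1.0):
    """Minimum rate over all users: the quantity a max-min allocation targets."""
    return min(sic_uplink_rates(powers, gains, noise))


# Illustrative values only: two users with equal transmit power.
print(worst_case_rate([1.0, 1.0], [3.0, 1.0]))  # 1.0
```

A power allocation policy, learned or otherwise, would adjust `powers` to raise the value returned by `worst_case_rate`, which is the sense in which maximising the worst-case rate promotes user fairness.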