“…DCA enhances the efficiency and flexibility of wireless networks, ensuring efficient channel utilization and mitigating the impact of interference, leading to improved overall performance and quality of service (QoS). Recent studies on this topic, which involves learning the environment and channel allocation algorithms, have employed frameworks such as multi-armed bandits (MAB) [1]-[9], stable matching [10], [11], game-theoretic optimization and congestion control [1], [8], [12]-[21], and, more recently, deep reinforcement learning (DRL) for multiuser scenarios [22]-[36]. The latter was initially explored in our earlier work (Naparstek and Cohen [22], [23]) within a multi-agent framework, following single-agent DRL research in [37], paving the way for a significant amount of subsequent research in the wireless communications community.…”