We consider a multichannel random access system in which each user accesses a single channel in each time slot to communicate with an access point (AP). Users arrive at the system at random, remain active for a certain number of time slots, and then leave the system. For such a dynamic network environment, we propose a distributed multichannel access protocol based on multi-agent reinforcement learning (RL) that improves both throughput and fairness among active users. Unlike previous approaches, which adjust channel access probabilities in each time slot, the proposed RL algorithm deterministically selects a set of channel access policies for several consecutive time slots. To reduce the complexity of the proposed RL algorithm, we adopt a branching dueling Q-network architecture and propose an efficient training methodology that produces proper Q-values over time-varying user sets. We perform extensive simulations in realistic traffic environments and demonstrate that the proposed online learning improves both throughput and fairness compared with conventional RL approaches and centralized scheduling policies.
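As a rough illustration of how a branching dueling Q-network keeps the output size manageable, the sketch below combines one shared state value with per-branch advantages using the standard dueling aggregation, Q_d(s, a) = V(s) + A_d(s, a) - mean over a' of A_d(s, a'). Mapping each branch to one of the consecutive time slots, and each action to "stay idle" or "access a given channel", is an assumption made here for illustration and is not confirmed by the text; the function names and numbers are likewise hypothetical, and the network that would produce the value and advantages is omitted.

```python
from statistics import fmean


def branch_q_values(state_value, branch_advantages):
    """Combine a shared state value with per-branch advantages.

    Applies Q_d(s, a) = V(s) + A_d(s, a) - mean_a' A_d(s, a')
    independently to every branch d.
    """
    q_values = []
    for advantages in branch_advantages:
        baseline = fmean(advantages)  # mean advantage of this branch
        q_values.append([state_value + a - baseline for a in advantages])
    return q_values


def greedy_actions(q_values):
    """Pick the highest-valued action independently in each branch."""
    return [max(range(len(q)), key=q.__getitem__) for q in q_values]


# Hypothetical numbers: 3 branches (assumed: one per time slot) and
# 3 actions per branch (assumed: 0 = idle, 1 = channel 1, 2 = channel 2).
state_value = 1.0
branch_advantages = [
    [0.0, 2.0, 1.0],
    [3.0, 0.0, 0.0],
    [0.0, 0.0, 3.0],
]

q = branch_q_values(state_value, branch_advantages)
actions = greedy_actions(q)
print(actions)
```

With D branches of N actions each, a branching output holds D x N values, whereas a single output over all joint actions would need N^D; this is the general motivation for branching architectures and may differ in detail from the design used here.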