Efficient use of spectral resources is critical in wireless networks and has been studied extensively in recent years. Dynamic spectrum access (DSA) is one of the key techniques for utilizing spectral resources. Among DSA approaches, reinforcement learning (RL) has attracted great attention because of its excellent performance. However, because of the large state space in RL, obtaining the best solution to the spectrum access problem is often computationally expensive. In addition, it is hard to balance multiple objectives in the RL reward function. To tackle these problems, we explore deep reinforcement learning in a layered framework and propose a hierarchical deep Q-network (h-DQN) model for DSA. The proposed approach divides the original problem into separate subproblems, each of which is solved by its own reinforcement learning agent. This partitioning simplifies each individual problem, enables modularity, and reduces the complexity of the overall optimization process in the multi-objective case. The performance of Q-learning for dynamic sensing (QADS), deep reinforcement learning for dynamic access (DRLDA), and the proposed h-DQN model is evaluated through simulations. The simulation results show that h-DQN converges faster and achieves higher channel utilization than the other two methods.
INDEX TERMS Dynamic multichannel sensing, deep Q-network, hierarchical reinforcement learning, cognitive radio.
Dynamic spectrum access (DSA) is considered a promising technology for addressing spectrum scarcity and improving spectrum utilization. In practice, channels are correlated with one another. Moreover, collisions inevitably occur when multiple primary users (PUs) or multiple secondary users (SUs) communicate in a real DSA environment. Considering these factors, deep multi-user reinforcement learning (DMRL) is proposed by introducing a cooperative strategy into the dueling deep Q-network (DDQN). Without requiring prior information about the system dynamics, DDQN can efficiently learn the correlations between channels and reduce the computational complexity arising from the large state space of the multi-user environment. To reduce conflicts and further maximize network utility, a cooperative channel strategy is explored that uses acknowledgement (ACK) signals without exchanging spectrum information. In each time slot, each user selects a channel and transmits a packet with a certain probability. After transmission, the ACK signal is used to judge whether the transmission was successful. Simulation results show that, compared with other popular models, the proposed DMRL enhances spectrum utilization and reduces the conflict rate more effectively in dynamic cooperative spectrum sensing.
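The per-slot access and ACK feedback described above can be illustrated with a minimal sketch. It is not the authors' simulator: the function names `ack_signals` and `simulate_slot` are hypothetical, and the collision model is an assumption (a user receives an ACK only if it transmitted and no other user transmitted on the same channel in that slot; primary-user occupancy is ignored).

```python
import random
from collections import Counter


def ack_signals(choices, transmitting):
    """Return one ACK flag per user for a single time slot.

    choices[i]      -- channel index selected by user i
    transmitting[i] -- True if user i actually sent a packet this slot

    Assumed collision model (not taken from the abstract): a transmission
    succeeds only when exactly one user transmits on that channel.
    """
    # Count how many users transmitted on each channel.
    load = Counter(ch for ch, tx in zip(choices, transmitting) if tx)
    return [tx and load[ch] == 1 for ch, tx in zip(choices, transmitting)]


def simulate_slot(choices, tx_prob, rng):
    """Each user transmits on its chosen channel with probability tx_prob."""
    transmitting = [rng.random() < tx_prob for _ in choices]
    return transmitting, ack_signals(choices, transmitting)


if __name__ == "__main__":
    rng = random.Random(0)
    # Three users; users 1 and 2 picked the same channel.
    sent, acks = simulate_slot([0, 1, 1], tx_prob=0.8, rng=rng)
    print("transmitted:", sent)
    print("ACK received:", acks)
```

In a learning loop, each user would observe only its own ACK flag, which is consistent with the statement that no spectrum information is exchanged between users.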