Rapid advances in network communication technology and a sharp increase in the number of network terminal devices have made the combination of software-defined networking (SDN), a new network architecture, with reinforcement learning an important research focus in network traffic scheduling. Because network traffic is variable and complex, traditional reinforcement-learning algorithms in SDN suffer from slow convergence and unbalanced loads. These problems seriously degrade network performance, causing link congestion and inefficient inter-stream bandwidth allocation. This paper proposes an automatic load-balancing architecture based on reinforcement learning (ALBRL) in SDN. In this architecture, we design a load-balancing optimization model for high-load traffic scenarios and apply an improved Deep Deterministic Policy Gradient (DDPG) algorithm to find a near-optimal path between network hosts. ALBRL replaces the random sampling strategy of DDPG's experience replay mechanism with a sampling method that maintains the experience pool in a SumTree structure. More meaningful experiences are thereby drawn for network updates with higher probability, which effectively improves the convergence rate. The experimental results show that ALBRL trains faster than existing reinforcement-learning algorithms and significantly improves network throughput.
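To make the SumTree-based sampling concrete, the following is a minimal Python sketch of a generic prioritized experience pool. It is not the ALBRL implementation: the class name `SumTree`, its method names, and the stratified `sample` routine are assumptions chosen for illustration, and the abstract does not specify how priorities are computed (the absolute TD error is a common choice).

```python
import random


class SumTree:
    """Array-backed binary tree: leaves store priorities, parents store sums.

    Hypothetical illustration of prioritized sampling, not the authors' code.
    """

    def __init__(self, capacity):
        self.capacity = capacity                  # maximum number of transitions
        self.tree = [0.0] * (2 * capacity - 1)    # internal nodes, then leaves
        self.data = [None] * capacity             # stored transitions
        self.write = 0                            # next slot to overwrite

    def total(self):
        # The root holds the sum of all leaf priorities.
        return self.tree[0]

    def update(self, leaf, priority):
        # Set a leaf priority and propagate the difference up to the root.
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf > 0:
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def add(self, priority, transition):
        # Store a transition, overwriting the oldest one when the pool is full.
        leaf = self.write + self.capacity - 1
        self.data[self.write] = transition
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity

    def get(self, value):
        # Descend from the root to the leaf whose cumulative range holds value.
        idx = 0
        while 2 * idx + 1 < len(self.tree):
            left = 2 * idx + 1
            if value <= self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]

    def sample(self, batch_size):
        # Stratified sampling: one draw from each equal slice of the total.
        segment = self.total() / batch_size
        return [
            self.get(random.uniform(segment * i, segment * (i + 1)))
            for i in range(batch_size)
        ]
```

In this sketch, a transition with priority 4 is drawn four times as often on average as one with priority 1, and both updating a priority and drawing a sample take time logarithmic in the pool size. That cost profile is the usual reason for preferring a SumTree over rescanning the whole pool.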
Software-defined networking (SDN) has become one of the critical technologies for data center networks, as it can improve network performance from a global perspective using artificial intelligence algorithms. Owing to its strong decision-making and generalization abilities, deep reinforcement learning (DRL) has been used in SDN intelligent routing and scheduling mechanisms. However, traditional DRL algorithms converge slowly and are unstable, resulting in poor network quality of service (QoS) for an extended period before convergence. To address these problems, we propose an automatic QoS architecture based on multistep DRL (AQMDRL) to optimize the QoS performance of SDN. AQMDRL uses a multistep approach to address the overestimation and underestimation problems of the deep deterministic policy gradient (DDPG) algorithm. The multistep approach replaces the one-step Q-value function with the maximum value of the n-step action currently estimated by the neural network, which reduces the likelihood of positive error in the Q-value function and effectively improves convergence stability. In addition, we adopt prioritized experience sampling based on a SumTree binary tree to improve the convergence rate of the multistep DDPG algorithm. Our experiments show that, compared with existing DRL algorithms, the proposed AQMDRL significantly improves convergence performance and effectively reduces the network transmission delay of SDN.
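The multistep idea can be illustrated with the standard n-step bootstrapped target, which sums n discounted rewards before falling back on the critic's estimate. The sketch below is a generic formulation, not the exact AQMDRL update, whose details the abstract does not give. The function name `n_step_target` and its parameters are assumptions, and `bootstrap_q` stands for the target critic's value at the state reached after n steps.

```python
def n_step_target(rewards, bootstrap_q, gamma=0.99, done=False):
    """Return r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_q.

    Hypothetical helper for illustration. `rewards` holds the n observed
    rewards; `bootstrap_q` is the target critic's estimate for the state after
    the n-th step and is ignored when the episode ended (`done=True`).
    """
    target = 0.0 if done else bootstrap_q
    # Fold the rewards from last to first so each step applies one discount.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


# Three observed rewards followed by a bootstrapped critic estimate.
y = n_step_target([1.0, 1.0, 1.0], bootstrap_q=10.0, gamma=0.5)
```

Because n real rewards enter the target before the critic's estimate does, the weight of that estimate shrinks to gamma^n. This is one common explanation for why multistep targets are less sensitive to errors in the learned Q-value.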