2021
DOI: 10.1049/itr2.12046

A centralised training algorithm with D3QN for scalable regular unmanned ground vehicle formation maintenance

Abstract: Unmanned ground vehicles (UGVs) have been widely used to accomplish various missions in civilian and military environments. Formation of a UGV group is an important technique for supporting the broad applications of multi-functional UGVs. This study proposes a scalable regular UGV formation maintenance (SRUFM) algorithm based on deep reinforcement learning (DRL), which aims to use a unified DRL framework to improve the lateral and longitudinal control performance of UGVs in different situations of the fo…

Cited by 5 publications (2 citation statements). References 39 publications.
“…Traditional Q-learning [10] is a value-based reinforcement learning algorithm: Q(s, a) is the expected return of taking action a in state s at a given moment, and the environment feeds back a corresponding reward for the agent's action. The main idea of the algorithm is to store Q values for state-action pairs in a Q-table and then select the action with the maximum expected benefit according to those values. As the environment grows more complex, a Q-table struggles to handle tasks with huge state spaces. In 2013, the DeepMind team [11] proposed the DQN algorithm, which combined deep learning and reinforcement learning for the first time. In 2016, Tom Schaul [12] proposed prioritized experience replay, which uses the temporal-difference error to measure the learning value of each experience: the absolute value of the temporal-difference error ranks the experiences in the replay buffer so that high-error experiences are replayed more frequently, while importance-sampling weights correct the resulting bias, which speeds up training and eases convergence. In 2016, Van Hasselt [13] proposed Double DQN to address DQN's overestimation problem by using different value functions for action selection and action evaluation. Wang [14] further proposed Dueling DQN, in which an advantage function "normalizes" the Q-value against a value baseline, improving learning efficiency and making learning more stable; experience also shows that the advantage function helps to reduce variance, an important factor in overfitting. However, D3QN [15] combined with prioritized experience replay still has shortcomings, and its ability to explore the optimal path remains weak.…”
Section: Introduction to Deep Q Reinforcement Learning
confidence: 99%
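
A minimal PyTorch sketch of the two ideas described in the excerpt above: a dueling head that centres advantages on a value baseline (Dueling DQN) and a target that selects the next action with the online network but evaluates it with the target network (Double DQN), the combination commonly called D3QN. The class and function names (DuelingQNet, double_dqn_target) are illustrative and not taken from the cited papers.

import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Q-network with separate value and advantage streams (Dueling DQN)."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)   # advantage stream A(s, a)

    def forward(self, state):
        h = self.feature(state)
        v, a = self.value(h), self.advantage(h)
        # Centre the advantages on the value baseline, as described in the excerpt
        return v + a - a.mean(dim=1, keepdim=True)

def double_dqn_target(online_net, target_net, reward, next_state, done, gamma=0.99):
    """Double DQN target: the online net picks the next action, the target net scores it."""
    with torch.no_grad():
        next_action = online_net(next_state).argmax(dim=1, keepdim=True)
        next_q = target_net(next_state).gather(1, next_action).squeeze(1)
    return reward + gamma * (1.0 - done) * next_q

The absolute difference between this target and the online network's Q(s, a) is the temporal-difference error that prioritized experience replay [12] uses to rank transitions in the buffer.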
“…Therefore, many researchers have used deep reinforcement learning methods to study vehicle decision-making problems. [12][13][14] Deep reinforcement learning replaces the value function over a continuous state space with a neural network and outputs discrete actions (as in the DQN algorithm) or continuous actions (as in the DDPG algorithm), which reduces memory requirements, improves decision-making accuracy and training efficiency, improves traffic efficiency and safety, and reduces fuel consumption. Studies have shown that deep reinforcement learning methods outperform traditional methods [15][16][17] and can achieve good results in complex environments.…”
Section: Introduction
confidence: 99%
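
A minimal illustration, in assumed PyTorch code not drawn from the cited works, of the contrast mentioned in the excerpt above: a DQN-style head outputs one Q-value per discrete action and acts by argmax, while a DDPG-style actor maps the state directly to a bounded continuous action vector (e.g. steering and throttle commands). The names DiscreteQHead and ContinuousActor are hypothetical.

import torch
import torch.nn as nn

class DiscreteQHead(nn.Module):
    """DQN-style output: one Q-value per discrete action, act by argmax."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_actions))

    def act(self, state):
        return self.net(state).argmax(dim=-1)   # index of the best discrete action

class ContinuousActor(nn.Module):
    """DDPG-style output: a bounded continuous action vector."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, action_dim), nn.Tanh())

    def act(self, state):
        return self.net(state)   # values in [-1, 1], scaled to actuator limits downstream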