The combination of reinforcement learning and platooning control has been widely studied and is generally regarded as beneficial provided that safety is preserved. However, platooning control methods based on reinforcement learning are not yet mature. This paper proposes a platoon sharing deep deterministic policy gradient algorithm (PSDDPG) for a multi‐vehicle network, which overcomes the problem of low efficiency in continuous action space exploration. In addition to the exploration noise of the deep deterministic policy gradient (DDPG) algorithm, platoon noise is added to enhance the diversity of training samples during exploration and to avoid relying on single‐vehicle data, which could weaken model robustness. A method using time‐sequence information as input together with a replay buffer backup is proposed to prevent insufficient exploration and low sampling efficiency from deteriorating the training effect. The robustness of the proposed algorithm is verified through tests in the Carla simulator, and platoon merging, overtaking, cruising, following, and obstacle avoidance control of a vehicle platoon are realized under typical traffic flow conditions. The experimental results show that the PSDDPG algorithm can provide an efficient control strategy and thus has the potential to reduce energy consumption and improve road efficiency.
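The abstract describes adding a platoon-level noise term on top of DDPG's per-vehicle exploration noise to diversify training samples. A minimal sketch of one plausible interpretation is shown below; the function name, the use of simple Gaussian noise (rather than the Ornstein–Uhlenbeck process often paired with DDPG), and all parameter values are illustrative assumptions, not the authors' implementation.

```python
import random

def noisy_platoon_actions(actions, sigma_vehicle=0.1, sigma_platoon=0.05, seed=None):
    """Perturb a list of continuous per-vehicle actions with two noise terms:
    an independent Gaussian sample per vehicle, plus one shared Gaussian
    sample applied to the whole platoon (hypothetical sketch of the idea
    of 'platoon noise' added on top of standard DDPG exploration noise)."""
    rng = random.Random(seed)
    shared = rng.gauss(0.0, sigma_platoon)  # one draw shared by every vehicle
    return [a + rng.gauss(0.0, sigma_vehicle) + shared for a in actions]
```

Because the platoon term is shared, exploration perturbs the formation as a whole as well as each vehicle individually, which is one way the sample diversity described above could be achieved.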
Cooperative adaptive cruise control (CACC) realizes efficient, intelligent control of vehicle acceleration, deceleration, and steering through inter‐vehicle communication and cooperative control. However, the tight spacing of the platoon makes it difficult for other vehicles to cut in, which can lead to severe traffic jams on certain sections of the road. The control effect of CACC depends on the platoon penetration rate, that is, the percentage of connected and autonomous vehicles (CAVs) among the platoon members. There is no quantitative control method for different penetration rates, and it is difficult to quantify the impact of CACC vehicles on traffic. Therefore, this paper proposes an innovative CACC control method based on deep reinforcement learning (DRL). First, altruistic control and quantitative control of the car‐following strategy are realized through a virtual car‐following distance method, reducing the exclusivity of the CACC platoon and improving road utilization efficiency. Second, a more appropriate platoon reward function and collision avoidance method are proposed. Finally, the Car Learning to Act (CARLA) simulator is used for evaluation. The obtained results confirm that CACC control of CAVs based on DRL can absorb speed oscillation and improve fuel economy.
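The virtual car‐following distance idea above can be illustrated with a small sketch: one plausible reading is that the target gap blends a tight CACC time headway with a looser human‐like headway according to the CAV penetration rate, so that low‐penetration platoons leave room for other vehicles to cut in. The function name and every parameter value here are assumptions for illustration, not the paper's actual formulation.

```python
def virtual_following_distance(speed, penetration, t_cacc=0.6, t_human=1.5, standstill=2.0):
    """Hypothetical virtual car-following distance (metres).

    speed       -- ego speed in m/s
    penetration -- fraction of CAVs in the platoon, in [0, 1]
    t_cacc      -- tight CACC time headway in seconds (assumed value)
    t_human     -- looser human-driver time headway in seconds (assumed value)
    standstill  -- minimum gap at zero speed in metres (assumed value)

    The effective headway interpolates linearly between the human and
    CACC headways as penetration rises, so mixed platoons keep larger,
    less exclusive gaps.
    """
    headway = t_human + penetration * (t_cacc - t_human)
    return standstill + headway * speed
```

At full penetration the gap collapses to the tight CACC spacing; at zero penetration it matches the human‐like spacing, which is one simple way to make the car‐following strategy quantitative in the penetration rate.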