With increasing traffic congestion in cities, prioritizing public transit has become a consensus in the development and management of urban transportation. The traffic pre-signal mechanism, which gives buses priority in time and space by optimizing the allocation of road right-of-way, has gained wide attention and application. To broaden the agent's action exploration range and keep the pre-signal decision from falling into a suboptimal or locally optimal strategy, this paper modifies the exploration strategy of the DDQN algorithm: it reduces the probability of directly selecting the locally optimal action and increases the probability of selecting non-greedy actions, based on the principle that “the action with a larger value function is more likely to be selected.” Addressing the problem that existing urban traffic pre-signal mechanisms cannot adaptively adjust the advance time, this paper proposes an adaptive pre-signal timing mechanism based on a Hybrid Exploration Strategy Double Deep Q Network (HES-DDQN), which combines the $\epsilon$-greedy strategy and the Boltzmann strategy. Simulation experiments on an intersection were conducted with the traffic simulation software VISSIM. The experimental results show that, compared with setting no pre-signal and with setting the pre-signal by the formula method, the HES-DDQN pre-signal mechanism significantly reduces the average delay of buses, the waiting queue length, and the number of stops of social vehicles.
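To illustrate how an $\epsilon$-greedy strategy and a Boltzmann strategy might be combined, the following is a minimal sketch of one possible hybrid action-selection rule. It is not necessarily the exact rule used in HES-DDQN: the function names `boltzmann_probs` and `hybrid_select`, the `temperature` parameter, and the choice to replace the uniform random branch of $\epsilon$-greedy with Boltzmann sampling are all assumptions made for illustration.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Softmax over Q-values: a larger Q-value gets a larger probability."""
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_select(q_values, epsilon, temperature, rng=random):
    """Hypothetical hybrid rule (an assumption, not the paper's definition).

    With probability epsilon, sample an action from the Boltzmann
    distribution instead of uniformly at random; otherwise act greedily.
    """
    if rng.random() < epsilon:
        probs = boltzmann_probs(q_values, temperature)
        return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
    # Greedy branch: index of the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    q = [0.2, 1.5, 0.9]  # illustrative Q-values for three candidate actions
    action = hybrid_select(q, epsilon=0.3, temperature=0.5)
    print("selected action index:", action)
```

Under this reading, exploratory steps still favor actions with larger value estimates, while the greedy branch is taken less often as `epsilon` grows; a higher `temperature` flattens the Boltzmann distribution toward uniform exploration.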