Finding the optimal signal timing strategy is a formidable challenge in traffic signal control. While multi-agent reinforcement learning holds promise for addressing this problem, most studies take vehicle-centric approaches and neglect crucial pedestrian factors, particularly the unpredictable behavior of pedestrians when crossing the street. This research aims to bridge this gap by investigating the effectiveness of a novel Double Q-learning method based on multi-agent collaboration in balancing pedestrian and vehicle traffic at multiple intersections. Because intersections act as pivotal nodes in urban transportation systems, the study adopts a macro-level perspective that treats each intersection as an agent capable of collaborative decision-making. The micro-level design uses a state representation that is sensitive to pedestrian behavior, together with a flexible action space, with the objective of optimizing traffic conditions. Results demonstrate that, under non-ideal conditions, our algorithm outperforms traditional Sarsa and Q-Learning algorithms, reducing vehicle waiting time by 32.8% and pedestrian waiting time by 70.85%. These outcomes underscore the algorithm’s effectiveness in balancing traffic efficiency for both vehicles and pedestrians.
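
For readers unfamiliar with Double Q-learning, the following is a minimal sketch of the standard tabular update for a single intersection agent. It is not the method proposed here: the multi-agent collaboration is not modeled, and the class name `DoubleQAgent`, the state tuple (vehicle queue length, pedestrians waiting), the actions `"keep"` and `"switch"`, and all hyperparameter values are hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict


class DoubleQAgent:
    """Tabular Double Q-learning for one intersection (illustrative only)."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.rng = random.Random(seed)
        # Two independent value tables, keyed by (state, action).
        self.q_a = defaultdict(float)
        self.q_b = defaultdict(float)

    def act(self, state):
        """Epsilon-greedy action on the sum of both tables."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(
            self.actions,
            key=lambda a: self.q_a[(state, a)] + self.q_b[(state, a)],
        )

    def update(self, state, action, reward, next_state):
        """Update one randomly chosen table, using the other to evaluate."""
        if self.rng.random() < 0.5:
            select, evaluate = self.q_a, self.q_b
        else:
            select, evaluate = self.q_b, self.q_a
        # The updated table picks the greedy next action ...
        best = max(self.actions, key=lambda a: select[(next_state, a)])
        # ... and the other table supplies its value estimate.
        target = reward + self.gamma * evaluate[(next_state, best)]
        select[(state, action)] += self.alpha * (target - select[(state, action)])


# Hypothetical state: (vehicles queued, pedestrians waiting).
agent = DoubleQAgent(actions=["keep", "switch"])
state, next_state = (4, 2), (3, 0)
action = agent.act(state)
# Hypothetical reward: negative waiting cost for vehicles and pedestrians.
agent.update(state, action, reward=-1.0, next_state=next_state)
```

Separating action selection from action evaluation is what distinguishes Double Q-learning from plain Q-learning, which uses a single table for both and tends to overestimate action values.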