Accurate traffic forecasting is more necessary than ever for transportation departments, given its significant role in traffic planning, management, and control. However, most existing methods struggle with three challenges: complex spatial correlations on road networks, nonlinear temporal dynamics, and long‐term prediction. This article proposes a novel spatial‐temporal graph gated transformer (STGGT) to overcome these challenges. The proposed model differs from Google's Transformer in that, instead of relying solely on attention, it uses a hybrid architecture that integrates graph convolutional networks (GCNs), attention, and gated recurrent units (GRUs). Specifically, STGGT uses GCNs to extract spatial dependencies, and it uses attention and GRUs to extract temporal dependencies and handle long‐term prediction. Experiments on two real‐world traffic datasets indicate that STGGT outperforms state‐of‐the‐art baseline models by 9%–40%. The proposed model thus offers a promising solution for accurate traffic forecasting that simultaneously addresses complex spatial correlations, nonlinear temporal dynamics, and long‐term prediction.
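
To make the hybrid design concrete, the following is a minimal NumPy sketch of one way the three components named above could be combined: a graph convolution per time step for spatial dependencies, a GRU over the resulting sequence, and scaled dot-product attention over the GRU states for temporal dependencies. It is an illustration only, not the STGGT implementation. The wiring order, the single-layer depth, the layer sizes, the random toy graph, and all function and parameter names (`normalized_adjacency`, `gcn_layer`, `gru_step`, `temporal_attention`, `forward`, `W_gcn`, `W_out`) are assumptions introduced here and do not come from the article.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as in a standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, x, w):
    """One graph convolution with ReLU: mix neighbor features, then project."""
    return np.maximum(a_norm @ x @ w, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(h, x, p):
    """One GRU update applied to every node independently (biases omitted)."""
    z = sigmoid(x @ p["Wz"] + h @ p["Uz"])              # update gate
    r = sigmoid(x @ p["Wr"] + h @ p["Ur"])              # reset gate
    h_tilde = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"])  # candidate state
    return (1.0 - z) * h + z * h_tilde

def temporal_attention(states):
    """Scaled dot-product self-attention over time. states: (T, N, H)."""
    s = states.transpose(1, 0, 2)                        # (N, T, H)
    scores = s @ s.transpose(0, 2, 1) / np.sqrt(s.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return (weights @ s).transpose(1, 0, 2), weights     # (T, N, H), (N, T, T)

def forward(adj, x_seq, params):
    """Hypothetical wiring: GCN per step -> GRU over time -> attention -> readout."""
    a_norm = normalized_adjacency(adj)
    hidden = params["gru"]["Uz"].shape[0]
    h = np.zeros((adj.shape[0], hidden))
    states = []
    for x_t in x_seq:                                    # x_t: (N, F)
        spatial = gcn_layer(a_norm, x_t, params["W_gcn"])
        h = gru_step(h, spatial, params["gru"])
        states.append(h)
    attended, weights = temporal_attention(np.stack(states))
    return attended[-1] @ params["W_out"], weights       # (N, 1) forecast

# Toy example: 4 sensors on a ring road, 6 past time steps, 2 features each.
rng = np.random.default_rng(0)
N, T, F, G, H = 4, 6, 2, 8, 16                           # assumed sizes
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
x_seq = rng.normal(size=(T, N, F))
gru = {k: rng.normal(scale=0.1, size=(G if k[0] == "W" else H, H))
       for k in ("Wz", "Wr", "Wh", "Uz", "Ur", "Uh")}
params = {"W_gcn": rng.normal(scale=0.1, size=(F, G)),
          "gru": gru,
          "W_out": rng.normal(scale=0.1, size=(H, 1))}
pred, attn = forward(adj, x_seq, params)
print(pred.shape, attn.shape)
```

In this sketch the weights are random and untrained, so the forecast values carry no meaning; only the data flow is of interest. The attention step lets the final representation draw on every earlier GRU state directly rather than only on the last hidden state, which is one plausible reading of how attention could support long‐term prediction.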