The visual attention of pedestrians has rarely been considered in studies of congestion prevention in long-distance passages. This paper proposes a kinetic theory model of human crowds that accounts for visual attention in order to study congestion in such passages. The population is divided into visual attention-shifting (VAS) pedestrians and non-visual attention-shifting (non-VAS) pedestrians. First, the movement characteristics of both groups are analyzed based on observations and measurements obtained through controlled experiments. Second, a pedestrian flow model accounting for visual attention is built to express these movement characteristics in mathematical form. Finally, the model is validated, and the density and the proportion of VAS pedestrians are selected as congestion warning parameters. Simulations are performed for a subway passage connected to stairs to assess the effect of visual attention, the critical thresholds of the congestion warning parameters, and the effects of implementing mitigation measures immediately after congestion occurs. The experimental results show that the movement characteristics of VAS and non-VAS pedestrians differ, and the simulation results show that the model is effective. Notably, visual attention affects pedestrian movement, and using the density and the proportion of VAS pedestrians as early warning indicators is effective for preventing congestion, as demonstrated by the negative correlation between their critical thresholds. This description of human crowds provides quantitative guidelines for crowd management.