Under the modernization of the national governance system and governance capacity, exploring broader and more inclusive means of public participation is an important way for governments to respond to public demands in urban governance. The participation channels used in past urban planning processes could not fully meet those demands; in the Internet era, by contrast, image-based communication has made public participation across time and space simple and convenient. We Media, represented by participatory video, has had a substantial impact on public participation thanks to the reach of the Internet. Using the political analysis framework of “general will–particular will”, this paper proposes that coordination between the cognitive level and the practical level is key to evaluating the level of public participation when participatory video is introduced into urban planning. The analytic hierarchy process (AHP) and the Delphi method are used to build the index system. Building on a comprehensive evaluation index, a coupled coordination model is introduced to construct a public participation evaluation system for urban planning based on participatory video under the “general will–particular will” framework. An evaluation of 4770 image samples and 200 survey materials from 11 communities in Xi’an shows that the index system has good validity. Finally, the implementation of participatory video intervention in public participation is summarized from the perspective of different stakeholders. The paper has theoretical value and practical guiding significance: it clarifies the impact of participatory video intervention on public participation in urban and rural planning and supports the effective improvement of public participation in urban planning.
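To make the idea of coordination between the cognitive level and the practical level concrete, the following minimal sketch computes a coupling coordination degree for two subsystem scores. It assumes the two-subsystem form that is common in the coupling coordination literature, which may differ from the model this paper actually uses. The function name `coupling_coordination`, the equal weights `alpha` and `beta`, and the input values are hypothetical and not taken from the paper.

```python
import math


def coupling_coordination(u1, u2, alpha=0.5, beta=0.5):
    """Return (C, T, D) for two subsystem scores in [0, 1].

    u1: comprehensive score of the cognitive level (assumed normalized)
    u2: comprehensive score of the practical level (assumed normalized)
    alpha, beta: assumed subsystem weights, expected to sum to 1
    """
    if not (0.0 <= u1 <= 1.0 and 0.0 <= u2 <= 1.0):
        raise ValueError("scores must be normalized to [0, 1]")
    if u1 + u2 == 0:
        # Both subsystems score zero: no coupling and no coordination.
        return 0.0, 0.0, 0.0
    # Coupling degree: equals 1 when the two scores are identical.
    c = 2.0 * math.sqrt(u1 * u2) / (u1 + u2)
    # Comprehensive development index: weighted sum of the two scores.
    t = alpha * u1 + beta * u2
    # Coupling coordination degree: geometric mean of C and T.
    d = math.sqrt(c * t)
    return c, t, d


# Hypothetical scores for one community, for illustration only.
c, t, d = coupling_coordination(0.8, 0.2)
print(f"C={c:.3f}, T={t:.3f}, D={d:.3f}")
```

In this form, a community with high but unbalanced scores gets a lower D than one whose cognitive and practical scores are both moderate and close to each other, which is the kind of coordination the evaluation is meant to capture.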