As mobile video, a major form of multimedia content, has grown in popularity, many studies have examined the geographic features of user viewing behavior, but few have done so across an entire metropolitan city. Different regions of a large city have different intensities of economic activity depending on their distance from downtown, and how this influences video popularity and similarity remains unclear. To quantitatively study the spatial popularity and similarity of video viewing in a large urban environment, we collect a dataset of two months of video view requests from the largest network provider in Shanghai, covering the top six content providers, and study the spatial features of video access in regions of different scales. We find that 1) video popularity and similarity exist at different scales of city division; 2) video popularity becomes more concentrated as a region is closer to downtown; and 3) among regions of the same scale, the similarity of popular videos decreases as a region is farther from downtown. Finally, we relate our findings to cache deployment, advertising, and video recommendation to illustrate their implications.