Abstract. 360° imagery is increasingly used to estimate subjective qualities of urban space, such as the feeling of safety or the liveliness of a place. These spherical panoramas offer an immersive view of the urban scene, close to the experience of a pedestrian. In recent years, Deep Learning approaches have been developed for this estimation task, but they use only flat images because such images are easier to annotate and to process with standard CNNs. Thus, to assess the whole urban space, each panoramic image is divided into four flat sub-images that can be processed by the trained neural networks. Together the sub-images cover the 360° field of view, e.g. front, back, left, and right views. The four resulting scores are averaged to represent the level of the quality at the location of the panorama. However, this split introduces a bias, since some elements of the urban space are cut across two images and the global context is lost. Based on the Place Pulse 2.0 dataset, this paper investigates the impact of splitting 360° panoramas on the perceptual scores predicted by neural networks. For each panorama, we predict the score for thirty-six overlapping sub-images. The scores show high variability and depend strongly on the camera direction of the perspective images. This indicates that four images are not sufficient to capture the complexity of the perceptual qualities of the urban space.
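The comparison between a four-view average and a thirty-six-view sweep can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the thirty-six views are evenly spaced (one every 10°), which the abstract does not state, and it replaces the trained network with a hypothetical `toy_score` function that returns synthetic values, not Place Pulse 2.0 predictions. The names `view_headings` and `four_view_means` are likewise illustrative.

```python
import math
import statistics


def view_headings(n_views, start_deg=0.0):
    """Camera headings in degrees for n_views evenly spaced views (assumed spacing)."""
    step = 360.0 / n_views
    return [(start_deg + i * step) % 360.0 for i in range(n_views)]


def toy_score(heading_deg):
    """Hypothetical stand-in for a trained network: synthetic score per heading."""
    h = math.radians(heading_deg)
    return 5.0 + 2.0 * math.cos(h) + math.cos(4.0 * h)


def four_view_means(scores):
    """Mean of four views 90 degrees apart, for every possible starting offset.

    `scores` holds one score per evenly spaced heading; its length must be a
    multiple of 4 so that views 90 degrees apart fall on sampled headings.
    """
    n = len(scores)
    if n == 0 or n % 4 != 0:
        raise ValueError("number of scores must be a positive multiple of 4")
    stride = n // 4  # index distance between views 90 degrees apart
    return [
        statistics.fmean(scores[(offset + k * stride) % n] for k in range(4))
        for offset in range(stride)
    ]


# Sweep: one synthetic score for each of the 36 assumed headings.
headings = view_headings(36)
scores = [toy_score(h) for h in headings]

# Every front/right/back/left split that the 36 headings allow.
splits = four_view_means(scores)

print("mean over 36 views:", statistics.fmean(scores))
print("std over 36 views:", statistics.pstdev(scores))
print("four-view means by starting heading:")
for heading, mean in zip(headings, splits):
    print(f"  start {heading:5.1f} deg -> {mean:.3f}")
```

In this toy setting the four-view average changes with the starting heading of the split, which is the kind of direction dependence the abstract reports. With real data, `toy_score` would be replaced by the prediction of the trained model on the perspective image rendered at each heading.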