In this study, we compared cognitive map formation for small-scale models of city-like environments presented in visual or tactile/haptic modalities. Previous research has often addressed only a limited number of aspects of cognitive maps. We aimed to combine several of these aspects to obtain a more complete picture. Therefore, we assessed different types of spatial information and considered egocentric as well as allocentric perspectives. Furthermore, we compared haptic map learning with visual map learning. In total, 18 sighted participants (9 in a haptic condition, 9 in a visuo-haptic condition) learned three tactile maps of city-like environments. The maps differed in complexity, and each had five marked locations associated with unique items. After learning each map, participants estimated distances between item pairs, rebuilt the map, recalled locations, and navigated two routes. Overall, all participants performed well on the spatial tasks. Interestingly, participants in the haptic condition performed worse than those in the visuo-haptic condition only on the complex maps, suggesting no distinct advantage of vision on the simple map. These results support the idea of modality-independent representations of space. Although the picture is less clear for the more complex maps, our findings indicate that participants form a fairly accurate cognitive map of a simple tactile city-like map whether they use haptic information alone or a combination of haptic and visual information.