Taking a more quantitative approach to linguistic landscape research, we explore recent techniques for automatic information extraction from images. The recently released Cloud Vision API by Google offers new perspectives on the software-assisted processing and classification of pictures. Its programming interface makes it possible to extract various kinds of information from pictures automatically, among them the written text, labels that describe the picture (e.g. road sign, shop sign, prohibition sign), and the colours used in the picture. Applying this new technique to large-scale image collections will not only enhance analysis but may also reveal hitherto unrecognized structures. The data comes from a large-scale investigation of the Ruhr Metropolis in Germany, where 25,504 photos were taken to document the linguistic landscape of selected neighbourhoods in four cities (Ziegler et al. 2018). These data have been annotated manually in various categories to analyze the occurrence, form and function of visual multilingualism. The pictures are then processed automatically by the Cloud Vision API, and the results are compared to the manual annotation. This comparison shows that the quality of the image recognition depends heavily on the quality of the picture. The textual information extracted from the pictures is stored in a database. Rather than presenting results on the linguistic landscape, this chapter is predominantly concerned with practical tools to facilitate large-scale linguistic landscape research.
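The workflow outlined above, in which each photo is sent to the Cloud Vision API and the detected text, labels and dominant colours are retained and stored in a database, could be sketched roughly as follows. This is a minimal illustration against the public Vision REST endpoint (`v1/images:annotate`); the `ocr_results` table schema and the helper names are assumptions made for illustration, not part of the original study, and actual calls would additionally require authentication credentials.

```python
import base64
import json
import sqlite3

# Public REST endpoint for batch image annotation (authentication omitted here).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_request(image_bytes):
    """Build the JSON body for one images:annotate call, requesting
    written text, descriptive labels and dominant colours."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [
                {"type": "TEXT_DETECTION"},
                {"type": "LABEL_DETECTION", "maxResults": 10},
                {"type": "IMAGE_PROPERTIES"},
            ],
        }]
    }


def parse_response(response):
    """Reduce one element of the API's 'responses' list to the fields
    that would be compared against the manual annotation."""
    full_text = ""
    annotations = response.get("textAnnotations", [])
    if annotations:  # the first entry holds the full detected text
        full_text = annotations[0].get("description", "")
    labels = [lab["description"] for lab in response.get("labelAnnotations", [])]
    colors = [
        c["color"]
        for c in response.get("imagePropertiesAnnotation", {})
                         .get("dominantColors", {})
                         .get("colors", [])
    ]
    return {"text": full_text, "labels": labels, "colors": colors}


def store(conn, photo_id, result):
    """Store the extracted textual information (hypothetical SQLite schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ocr_results "
        "(photo_id TEXT PRIMARY KEY, text TEXT, labels TEXT)")
    conn.execute(
        "INSERT OR REPLACE INTO ocr_results VALUES (?, ?, ?)",
        (photo_id, result["text"], json.dumps(result["labels"])))
```

In practice, each of the 25,504 photos would pass through `build_request`, the HTTP call, `parse_response` and `store` in turn; keeping only the parsed fields in the database makes the later comparison with the manual annotation a matter of straightforward queries.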