Propelling, and propelled by, the "deep learning revolution", recent years have seen the introduction of ever larger corpora of images annotated with natural language expressions. We survey some of these corpora, taking a perspective that reverses the usual directionality, as it were, by viewing the images as semantic annotation of the natural language expressions. We discuss datasets that can be derived from these corpora, and tasks of potential interest to computational semanticists that can be defined on them. In doing so, we make use of relations provided by the corpora (namely, the link between an expression and an image, and that between two expressions linked to the same image) and relations that we can add (similarity relations between expressions, or between images). Specifically, we show that in this way we can create data for learning and evaluating lexical and compositional grounded semantics, and we show that the "linked to same image" relation tracks a semantic implicature relation that is recognisable to annotators even in the absence of the linking image as evidence. Finally, as an example of the possible benefits of this approach, we show that an exemplar-model-based approach to implicature beats a (simple) distributional-space-based one on some of the derived datasets, while lending itself to explainability.

* Work done while the author was at Bielefeld University.

1 There are interesting subtleties here. In everyday language, we are quite good at ignoring the image layer, saying things like "the woman is using a computer" instead of "the image shows a woman using a computer", or "this is a computer" instead of "this is an image of a computer". This also seems to carry over to tense, where we can say "is using" instead of "was using at the time the picture was taken". There are, however, contexts in which talk about the image as an image is relevant, and this can happen in large corpora such as those discussed here. So this is something to keep in mind.
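To make the central derived relation concrete, the following is a minimal sketch of how "linked to same image" caption pairs could be extracted from a generic captioned-image corpus. The input format (a flat list of image-id/caption records) and all names used here are illustrative assumptions for exposition, not the format or API of any specific corpus discussed in this paper.

```python
# Illustrative sketch only: derive "linked to same image" pairs of
# expressions from a generic captioned-image corpus. The record format
# and names are assumptions, not those of any particular corpus.
from collections import defaultdict
from itertools import combinations


def pairs_linked_by_same_image(annotations):
    """Given (image_id, caption) records, yield all unordered pairs of
    captions that are linked to the same image -- the relation argued
    above to track a semantic implicature relation."""
    captions_of = defaultdict(list)
    for image_id, caption in annotations:
        captions_of[image_id].append(caption)
    for image_id, captions in captions_of.items():
        # every unordered pair of captions attached to this image
        for left, right in combinations(captions, 2):
            yield image_id, left, right


# Toy usage with made-up records:
records = [
    ("img1", "a woman is using a computer"),
    ("img1", "a person sits at a laptop"),
    ("img2", "a dog catches a frisbee"),
]
for image_id, a, b in pairs_linked_by_same_image(records):
    print(image_id, "|", a, "<->", b)
```

Similarity relations between expressions or between images (the relations we add) would then be computed over the same records, e.g. by embedding the captions or images and thresholding pairwise distances; that step is model-dependent and is left out of this sketch.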