Drosophila melanogaster are known to live in a social but cryptic world of touch and odours, but the extent to which they can perceive and integrate visual information is a hotly debated topic. Some researchers fixate on the limited resolution of D. melanogaster's optics, others on their seemingly identical appearance; yet there is evidence of individual recognition and surprising visual learning in flies. Here, we apply machine learning and show that individual D. melanogaster are visually distinct. We also use the striking similarity of Drosophila's visual system to current convolutional neural networks to theoretically investigate D. melanogaster's capacity for visual understanding. We find that, despite their limited optical resolution, D. melanogaster's neuronal architecture has the capability to extract and encode a rich feature set that allows flies to re-identify individual conspecifics with surprising accuracy. These experiments provide a proof of principle that Drosophila inhabit a much more complex visual world than previously appreciated.
Author summary

In this paper, we establish a proof of principle for inter-individual recognition in two parts: is there enough information contained in low-resolution pictures for inter-fly discrimination, and if so, does Drosophila's visual system have enough capacity to use it? We show that the information contained in a 29×29 pixel image (the number of ommatidia in a fly eye) is sufficient to achieve 94% accuracy in fly re-identification. Further, we show that the fly eye has the theoretical capacity to identify another fly with about 75% accuracy. Although it is unlikely that flies use the exact algorithm we tested, our results show that, in principle, flies may be using visual perception in ways that are not usually appreciated.

...only ~850 lens units (ommatidia), each capturing a single point in space, Drosophila's...

...non-familiar conspecifics in social situations [3].

How a fly's visual system could extract meaning out of low-resolution images is suggested by the highly structured and layered organization of Drosophila's visual system (Fig. 2C). At the input, the ommatidia are packed one by one, but their individually-tuned photoreceptors are arranged spatially to essentially convolve a 6-unit filter across the receptive field. The output of this photoreceptor filter is, in turn, the input for downstream medulla neurons that connect to several 'columns' of photoreceptor outputs. This filter convolution, and the use of one filter's output as a 'feature map' for another layer, is a hallmark of the engineered architectures of DCNs that dominate computer vision today (one such DCN is illustrated in Fig. 2A). Just as DCNs can take low-level image representations and encode them into a semantic...

...therefore we were specifically limited to 'modular' neurons (with 1 neuron/column) throughout the medulla [17]. The connections between neuronal types were extracted from published connectomes [18]. We...
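The Author summary above describes two computational tests: whether a 29×29 pixel image carries enough information for fly re-identification, and whether the fly eye has the capacity to use it. The snippet below is a minimal, illustrative sketch of the first test only, not the authors' pipeline: it downsamples a grayscale frame to 29×29 (roughly one pixel per ommatidium) and passes it to a small identity classifier. The network layout, channel counts, and number of flies are assumptions made for the example.

```python
# Minimal sketch of the re-identification test: downsample to 29x29 pixels
# (roughly one pixel per ommatidium), then classify fly identity.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_FLIES = 10  # assumed number of individuals to re-identify (placeholder)

class TinyReId(nn.Module):
    def __init__(self, num_flies=NUM_FLIES):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # 29x29 -> 29x29
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # 14x14 -> 14x14
        self.fc = nn.Linear(32 * 7 * 7, num_flies)

    def forward(self, x):                              # x: (batch, 1, 29, 29)
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)     # -> (batch, 16, 14, 14)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)     # -> (batch, 32, 7, 7)
        return self.fc(x.flatten(1))                   # logits over fly identities

def to_ommatidia_resolution(frame):
    """Downsample a full-resolution grayscale frame (H, W) to 29x29."""
    return F.interpolate(frame[None, None], size=(29, 29),
                         mode='bilinear', align_corners=False)

model = TinyReId()
dummy = to_ommatidia_resolution(torch.rand(480, 640))  # synthetic frame
print(model(dummy).shape)                              # torch.Size([1, NUM_FLIES])
```

Training such a classifier on labelled images of individual flies is what the 94% figure above refers to; the 75% figure concerns the separate, fly-eye-constrained model discussed next.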
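The analogy drawn above between the fly's early visual system and a DCN can be made concrete with a loose sketch: a first convolution standing in for the 6-unit photoreceptor filter swept across the retinotopic field, and a second convolution standing in for medulla neurons that each read several neighbouring columns of photoreceptor output. Kernel sizes and channel counts are illustrative assumptions, not measurements of the fly circuit.

```python
# Loose DCN-style rendering of the analogy in the text: the output of one
# filter bank becomes the feature map for the next layer.
import torch
import torch.nn as nn

fly_like_stack = nn.Sequential(
    # "photoreceptor" stage: 6 feature maps, one per photoreceptor-like unit
    nn.Conv2d(1, 6, kernel_size=3, padding=1),
    nn.ReLU(),
    # "medulla" stage: each unit pools a 3x3 neighbourhood of columns
    nn.Conv2d(6, 8, kernel_size=3, padding=1),
    nn.ReLU(),
)

retina = torch.rand(1, 1, 29, 29)   # one 29x29 "ommatidial" image
features = fly_like_stack(retina)
print(features.shape)               # torch.Size([1, 8, 29, 29])
```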
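The final fragment notes that the model was restricted to 'modular' neuron types (1 neuron per column), with type-to-type connections taken from published connectomes [17, 18]. One simple way such a constraint could be imposed, sketched below under the assumption of a per-column activity vector over neuron types and a hypothetical 0/1 adjacency matrix, is to mask a learned type-to-type weight matrix so that connections absent from the connectome stay zero. The adjacency values here are placeholders, not connectome data.

```python
# Connectome-style constraint: mask a per-column type-to-type weight matrix
# with a 0/1 adjacency matrix so disallowed connections remain zero.
import torch
import torch.nn as nn

N_TYPES = 4  # hypothetical number of modular neuron types
# adjacency[i, j] = 1 if presynaptic type j contacts postsynaptic type i
adjacency = torch.tensor([[1., 0., 1., 0.],
                          [1., 1., 0., 0.],
                          [0., 1., 1., 1.],
                          [0., 0., 1., 1.]])

class ColumnwiseTypeLayer(nn.Module):
    """Per-column linear map between neuron types, masked by connectivity."""
    def __init__(self, adjacency):
        super().__init__()
        self.weight = nn.Parameter(torch.randn_like(adjacency) * 0.1)
        self.register_buffer('mask', adjacency)

    def forward(self, x):                        # x: (batch, columns, N_TYPES)
        return x @ (self.weight * self.mask).T   # forbidden connections stay zero

layer = ColumnwiseTypeLayer(adjacency)
columns = torch.rand(1, 29 * 29, N_TYPES)        # one value per type per column
print(layer(columns).shape)                      # torch.Size([1, 841, 4])
```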