Human behavioral experiments with predictive cues and contexts have led to influential conceptualizations of visual attention, such as limited-resource mechanisms or weighting by Bayesian priors. How brains might learn to manifest such cue and context effects with simple neural architectures is not known. We show that a feedforward convolutional neural network (CNN) with a few million neurons, trained on noisy images to detect targets, learns to exploit predictive cues and context and predicts human performance for the three most prominent behavioral signatures of covert attention: Posner cueing, set-size effects in search, and contextual cueing. The CNN also approximates a Bayesian ideal observer that has full prior statistical knowledge of the noise, targets, cues, and context. A larger network pre-trained on natural images, combined with transfer learning, can also account for human performance. These findings clarify the neurobiological requirements, computations, and simple neural architectures that give rise to covert attention's three landmark behavioral effects.
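For a known, additive target in white Gaussian noise, the Bayesian ideal observer referenced above has a standard closed form: the likelihood at each candidate location reduces to a matched filter (cross-correlation of the image with the target template), and predictive cues or context enter as prior probabilities over locations. The sketch below is an illustration of that textbook formulation under stated assumptions, not the authors' implementation; the function name, toy stimulus, and prior values are hypothetical.

```python
import numpy as np

def ideal_observer_posterior(image_patches, template, priors, sigma):
    """Posterior probability that the target occupies each candidate location.

    Assumes a known target added to white Gaussian noise, so the
    log-likelihood ratio at location j is (x_j . t - ||t||^2 / 2) / sigma^2,
    i.e., a matched filter. Cues/context enter only through `priors`.

    image_patches : (M, N) array, one row of N pixels per candidate location
    template      : (N,) array, the known target profile
    priors        : (M,) array, prior probability of the target at each location
    sigma         : standard deviation of the pixel noise
    """
    llr = (image_patches @ template - 0.5 * template @ template) / sigma**2
    log_post = np.log(priors) + llr
    log_post -= log_post.max()        # subtract max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Toy Posner-style example (hypothetical numbers): two locations, and a valid
# cue makes the cued location four times more probable a priori.
rng = np.random.default_rng(0)
template = np.full(16, 0.5)
stimulus = rng.normal(0.0, 1.0, size=(2, 16))
stimulus[0] += template               # target actually at the cued location
posterior = ideal_observer_posterior(stimulus, template,
                                     np.array([0.8, 0.2]), sigma=1.0)
print(posterior)                      # posterior mass shifts toward the cue
```

In this formulation, cueing and contextual effects fall out of the prior weighting rather than any resource limit, which is the Bayesian-prior view of attention that the trained CNN is reported to approximate.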