Previous studies along these lines suggested that lateral interactions of V1 cells are responsible, among other visual effects, for bottom-up visual attention (alternatively named visual salience or saliency). Our objective is to mimic these connections in the visual system with a neurodynamic network of firing-rate neurons. Early subcortical processes (i.e., retinal and thalamic) are functionally simulated. An implementation of the cortical magnification function is included to define the retinotopic projections towards V1, processing neuronal activity for each distinct view during scene observation. Novel computational definitions of top-down inhibition (in terms of inhibition of return and selection mechanisms) are also proposed to predict attention in Free-Viewing and Visual Search conditions. Results show that our model outperforms other biologically-inspired models both at predicting saliency and at predicting visual saccade sequences during free viewing. We also show how the temporal and spatial characteristics of inhibition of return can improve the prediction of saccades, and how distinct search strategies (in terms of feature-selective or category-specific inhibition) predict attention in distinct image contexts.
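To illustrate how a cortical magnification function can define retinotopic projections towards V1, the following minimal Python sketch uses the standard inverse-linear magnification M(E) = k / (E + a) and a complex-log (monopole) retino-cortical mapping. The parameter values, function names, and the monopole form are illustrative assumptions for this sketch, not the exact implementation used in the model.

```python
import numpy as np

def cortical_magnification(ecc_deg, a=0.75, k=17.3):
    """Approximate linear cortical magnification factor M (mm of V1 per degree)
    at eccentricity E, using the common inverse-linear form M(E) = k / (E + a).
    Parameter values are illustrative, not the paper's."""
    return k / (ecc_deg + a)

def retinotopic_projection(x_deg, y_deg, a=0.75, k=17.3):
    """Map a visual-field location (in degrees) to approximate V1 coordinates (mm)
    with a complex-log (monopole) transform, a standard simplification of the
    retino-cortical mapping."""
    z = x_deg + 1j * y_deg          # visual field as a point in the complex plane
    w = k * np.log(z + a)           # log-polar compression: fovea is expanded
    return w.real, w.imag

# Example: foveal locations occupy far more cortex per degree than peripheral ones.
for ecc in (0.5, 2.0, 10.0, 30.0):
    print(f"E = {ecc:5.1f} deg  ->  M ~ {cortical_magnification(ecc):5.2f} mm/deg")
```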
Author summary

Saliency maps are representations of how certain visual regions attract attention in a visual scene, and they can be measured with eye movements. A myriad of computational models with artificial and biological inspiration have achieved outstanding predictions of human fixations. However, most of these models have been built specifically for visual saliency, a specialization that limits their biological plausibility for modeling distinct visual processing mechanisms or other visual processes simultaneously. In addition to saliency, our approach also works efficiently for other tasks (without applying any type of training or optimization and keeping the same parametrization), such as Visual Search, Visual Discomfort [1], Brightness [2] and Color Induction [3]. By simulating human physiology and its mechanisms, we propose to build a unified model that could be extended to predict and understand the distinct perceptual processes for which V1 is responsible.

… discriminate redundant information [4][5][6]. In order to filter or select the information to be processed in higher areas of visual processing in the brain, the human visual system (HVS) guides eye movements towards regions that appear visually conspicuous or distinct in the scene. This phenomenon was observed during visual search tasks [7, 8], where detecting early visual features (such as orientation, color or size) was done either in parallel (pre-attentively) or required a serial "binding" step, depending on scene context. Koch & Ullman [9] came up with the hypothesis that neuronal mechanisms involved in selective visual attention generate a unique "master" map from visual scenes, coined with the term "saliency map". From that, Itti, Koch & Ullman [10] presented a computational i...
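As a point of reference for the "master" map idea, the toy sketch below combines center-surround conspicuity maps of intensity and crude color opponency into a single saliency map in the spirit of Koch & Ullman and the Itti et al. architecture. The Gaussian scales, channel choices, and normalization are assumptions of this illustration; it is not the neurodynamic V1 model proposed in this paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround(feature, sigma_c=2, sigma_s=8):
    """Difference-of-Gaussians approximation of center-surround contrast."""
    return np.abs(gaussian_filter(feature, sigma_c) - gaussian_filter(feature, sigma_s))

def normalize(m, eps=1e-8):
    """Rescale a conspicuity map to the [0, 1] range."""
    m = m - m.min()
    return m / (m.max() + eps)

def master_saliency_map(rgb):
    """Combine intensity and simple color-opponency conspicuity maps into one
    'master' saliency map (toy Koch & Ullman-style illustration)."""
    rgb = rgb.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    rg = r - g                      # crude red-green opponency
    by = b - (r + g) / 2.0          # crude blue-yellow opponency
    maps = [center_surround(f) for f in (intensity, rg, by)]
    return normalize(sum(normalize(m) for m in maps))

# Usage: saliency = master_saliency_map(image_array)  # image_array: H x W x 3 array
```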