When probed with complex stimuli that extend beyond their classical receptive field, neurons in primary visual cortex display complex and non-linear response characteristics. Sparse coding models reproduce some of the observed contextual effects, but still fail to provide a satisfactory explanation in terms of realistic neural structures and cortical mechanisms, since the connection scheme they propose consists only of interactions among neurons with overlapping input fields. Here we propose an extended generative model for visual scenes that includes spatial dependencies among different features. We derive a neurophysiologically realistic inference scheme under the constraint that neurons have direct access only to local image information. The scheme can be interpreted as a network in primary visual cortex where two neural populations are organized in different layers within orientation hypercolumns that are connected by local, short-range and long-range recurrent interactions. When trained with natural images, the model predicts a connectivity structure linking neurons with similar orientation preferences matching the typical patterns found for long-ranging horizontal axons and feedback projections in visual cortex. Subjected to contextual stimuli typically used in empirical studies, our model replicates several hallmark effects of contextual processing and predicts characteristic differences for surround modulation between the two model populations. In summary, our model provides a novel framework for contextual processing in the visual system proposing a well-defined functional role for horizontal axons and feedback projections.
A central goal in visual neuroscience is to understand computational mechanisms and to identify neural structures responsible for integrating local visual features into global representations. When probed with complex stimuli that extend beyond their classical receptive field, neurons display non-linear behaviours indicative of such integration processes already in early stages of visual processing. Recently some progress has been made in explaining these effects from first principles by sparse coding models with a neurophysiologically realistic inference dynamics. They reproduce some of the complex response
Author summaryAn influential hypothesis about how the brain processes visual information posits that each given stimulus should be efficiently encoded using only a small number of cells. This idea led to the development of a class of models that provided a functional explanation for various response properties of visual neurons, including the non-linear modulations observed when localized stimuli are placed in a broader spatial context. However, it remains to be clarified through which anatomical structures and neural connectivities a network in the cortex could perform the computations that these models require. In this paper we propose a model for encoding spatially extended visual scenes. Imposing the constraint that neurons in visual cortex have direct access only to small portions of the visual field we derive a simple yet realistic neural population dynamics. Connectivities optimized for natural scenes conform with anatomical findings and the resulting model reproduces a broad set of physiological observations, while exposing the neural mechanisms relevant for spatio-temporal information integration.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.