Combining sensory inputs over time is fundamental to seeing. Due to temporal integration, we do not perceive the flicker in fluorescent lights nor the discrete sampling of movie frames; instead we see steady illumination and continuous motion. As a result of adaptation, elements of a scene that suddenly change in appearance are more salient than elements that do not. Here we investigated how the human nervous system combines visual information over time, measuring both functional MRI and intracranial EEG (ECoG). We built predictive models based on canonical neural computations that account for temporal integration and adaptation. The models capture systematic differences in how information is combined across visual areas, and they generalize across instruments, subjects, and stimuli.
Abstract
The visual system analyzes image properties across multiple spatial and temporal scales. Population receptive field ("pRF") models have successfully characterized spatial representations across the human visual pathways. Here, we studied temporal representations, measuring fMRI and electrocorticographic ("ECoG") responses in posterior, lateral, ventral, and dorsal visual areas to briefly viewed contrast patterns. We built a temporal pRF model employing linear summation and time-varying divisive normalization. Our model accurately predicts the fMRI amplitude and ECoG broadband timecourse, accounting for two phenomena: accumulation of stimulus information over time (summation) and response reduction with prolonged or repeated exposure (adaptation). We find systematic differences in these properties: summation periods are increasingly long, and adaptation increasingly pronounced, in higher compared to earlier visual areas. We propose that several features of temporal responses, including adaptation, summation, and the timescale of temporal dynamics, can be understood as resulting from a small number of canonical neuronal computations.
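To make the model class named above concrete, the following is a minimal sketch of linear temporal summation followed by time-varying divisive normalization, written in Python/NumPy. It is illustrative only: the filter shapes (gamma_irf, lowpass_irf), the function names, and all parameter values are assumptions chosen for the example, not the fitted model or parameters from this study.

```python
import numpy as np

def gamma_irf(t, tau):
    # Gamma-shaped impulse response used for the linear summation stage (assumed form).
    h = (t / tau) * np.exp(-t / tau)
    return h / h.sum()

def lowpass_irf(t, tau):
    # Exponential low-pass filter whose output forms a delayed normalization pool (assumed form).
    h = np.exp(-t / tau)
    return h / h.sum()

def dn_response(stimulus, dt=0.001, tau1=0.05, tau2=0.1, n=2.0, sigma=0.1):
    """Linear summation followed by time-varying divisive normalization.

    stimulus : 1-D array of stimulus contrast over time, sampled every dt seconds.
    Parameter values here are illustrative, not fitted to data.
    """
    t = np.arange(0, 1.0, dt)                                   # filter support: 1 s
    linear = np.convolve(stimulus, gamma_irf(t, tau1))[:len(stimulus)]
    pool = np.convolve(linear, lowpass_irf(t, tau2))[:len(stimulus)]
    return linear**n / (sigma**n + pool**n)

# Usage: response to a 200-ms pulse of full contrast.
stim = np.zeros(1000)
stim[100:300] = 1.0
resp = dn_response(stim)
# The response rises quickly as stimulus information accumulates (summation),
# then is suppressed as the normalization pool builds up (adaptation-like reduction).
```

In this sketch, the normalization pool lags the linear response because it is computed from a low-pass-filtered copy of that response; varying the filter time constants changes the summation window and the strength of the adaptation-like transient.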