Segmenting visual scenes into distinct objects and surfaces is a fundamental visual function. To better understand the underlying neural mechanism, we investigated how neurons in the middle temporal cortex (MT) of macaque monkeys represent overlapping random-dot stimuli moving transparently in slightly different directions. It has been shown that the neuronal response elicited by two stimuli approximately follows the average of the responses elicited by the constituent stimulus components presented alone. In this scheme of response pooling, the ability to segment two simultaneously presented motion directions is limited by the width of the tuning curve to motion in a single direction. We found that, although the population-averaged neuronal tuning showed response averaging, subgroups of neurons showed distinct patterns of response tuning and were capable of representing component directions that were separated by a small angle, less than the tuning width to unidirectional stimuli. One group of neurons preferentially represented the component direction at a specific side of the bidirectional stimuli, weighting one stimulus component more strongly than the other. Another group of neurons pooled the component responses nonlinearly and showed two separate peaks in their tuning curves even when the average of the component responses was unimodal. We also show for the first time that the direction tuning of MT neurons evolved from initially representing the vector-averaged direction of slightly different stimuli to gradually representing the component directions. Our results reveal important neural processes underlying image segmentation and suggest that information about slightly different stimulus components is computed dynamically and distributed across neurons.
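The limit that response averaging places on segmentation can be illustrated with a minimal sketch. It assumes a Gaussian direction tuning curve with a hypothetical width of 40° and equal weighting of the two components; neither the tuning shape nor the width comes from the study, and the function names are illustrative only. Under these assumptions the averaged tuning curve has a single peak when the angular separation is small relative to the tuning width and two peaks when it is large.

```python
import math

def tuning(direction_deg, preferred_deg=0.0, sigma_deg=40.0):
    """Gaussian direction tuning (assumed shape; sigma_deg is hypothetical)."""
    # Wrap the angular difference into [-180, 180).
    diff = (direction_deg - preferred_deg + 180.0) % 360.0 - 180.0
    return math.exp(-0.5 * (diff / sigma_deg) ** 2)

def averaged_response(va_direction_deg, separation_deg, sigma_deg=40.0):
    """Average of the responses to two components placed symmetrically
    around the vector-averaged (VA) direction of the stimulus."""
    half = separation_deg / 2.0
    r1 = tuning(va_direction_deg - half, sigma_deg=sigma_deg)
    r2 = tuning(va_direction_deg + half, sigma_deg=sigma_deg)
    return 0.5 * (r1 + r2)

def count_peaks(separation_deg, sigma_deg=40.0, step_deg=1.0):
    """Count strict local maxima of the averaged tuning curve over 360 degrees."""
    n = int(360 / step_deg)
    curve = [
        averaged_response(-180.0 + i * step_deg, separation_deg, sigma_deg)
        for i in range(n)
    ]
    peaks = 0
    for i in range(n):
        prev_val = curve[i - 1]          # index -1 wraps to the last sample
        next_val = curve[(i + 1) % n]    # wrap around the circle
        if curve[i] > prev_val and curve[i] > next_val:
            peaks += 1
    return peaks

if __name__ == "__main__":
    for sep in (45.0, 135.0):
        print(f"separation {sep:.0f} deg: {count_peaks(sep)} peak(s)")
```

In this toy model, a neuron that only averages cannot show separate peaks for closely spaced components, which is why the two-peaked tuning reported for one subgroup points to nonlinear pooling.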
Multiple visual stimuli are common in natural scenes, yet it remains unclear how multiple stimuli interact to influence neuronal responses. We investigated this question by manipulating relative signal strengths of two stimuli moving simultaneously within the receptive fields (RFs) of neurons in the extrastriate middle temporal (MT) cortex. Visual stimuli were overlapping random-dot patterns moving in two directions separated by 90°. We first varied the motion coherence of each random-dot pattern and characterized, across the direction tuning curve, the relationship between neuronal responses elicited by bidirectional stimuli and by the constituent motion components. The tuning curve for bidirectional stimuli showed response normalization and can be accounted for by a weighted sum of the responses to the motion components. Allowing nonlinear, multiplicative interaction between the two component responses significantly improved the data fit for some neurons, and the interaction mainly had a suppressive effect on the neuronal response. The weighting of the component responses was not fixed but dependent on relative signal strengths. When two stimulus components moved at different coherence levels, the response weight for the higher-coherence component was significantly greater than that for the lower-coherence component. We also varied relative luminance levels of two coherently moving stimuli and found that MT response weight for the higher-luminance component was also greater. These results suggest that competition between multiple stimuli within a neuron's RF depends on relative signal strengths of the stimuli and that multiplicative nonlinearity may play an important role in shaping the response tuning for multiple stimuli.
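The weighted-sum description with a multiplicative interaction can be written as a small sketch. The functional form below, `w1*r1 + w2*r2 + b*r1*r2`, is an assumption made for illustration, as are the function name, the weights, the interaction coefficient, and the firing rates; the abstract does not give the fitted equation or any parameter values.

```python
def bidirectional_response(r1, r2, w1, w2, b=0.0):
    """Predicted response to two overlapping motion components.

    Assumed form (not taken from the paper): w1*r1 + w2*r2 + b*r1*r2,
    where r1 and r2 are the responses to each component presented alone,
    w1 and w2 are response weights, and b scales a multiplicative
    interaction term. A negative b makes the interaction suppressive.
    """
    return w1 * r1 + w2 * r2 + b * r1 * r2

# Hypothetical component responses in spikes/s.
r_high_coherence = 40.0
r_low_coherence = 20.0

# Larger weight on the higher-coherence component, no interaction term.
linear = bidirectional_response(r_high_coherence, r_low_coherence, w1=0.7, w2=0.3)

# Same weights with a suppressive multiplicative interaction.
suppressed = bidirectional_response(
    r_high_coherence, r_low_coherence, w1=0.7, w2=0.3, b=-0.005
)

if __name__ == "__main__":
    print(f"weighted sum only: {linear:.2f}")
    print(f"with suppressive interaction: {suppressed:.2f}")
```

Setting `b` to zero recovers the purely linear weighted sum, so the two variants can be compared on the same data to see whether the interaction term improves the fit.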
Natural scenes often contain multiple objects and surfaces in 3-dimensional space. A fundamental process of vision is to segment visual scenes into distinct objects and surfaces. The stereoscopic depth and motion cues are particularly important for segmentation. However, how the primate visual system represents multiple moving stimuli located at different depths is poorly understood. Here we investigated how neurons in the middle temporal (MT) cortex represented two overlapping surfaces located at different horizontal disparities and moved simultaneously in different directions. We recorded the neuronal activities in MT of three male macaque monkeys while two of them performed a discrimination task to report the motion direction of an attended surface of two overlapping stimuli, and the third animal performed a behavioral task with the attention directed away from the receptive fields of MT neurons. We found that neuronal responses to overlapping surfaces showed a robust bias toward the horizontal disparity of one of the two surfaces. For all animals, the disparity bias in response to two surfaces was positively correlated with the disparity preference of the neurons to single surfaces. For two animals, neurons that preferred the near disparities of single surfaces (near neurons) showed a near bias to overlapping stimuli, and neurons that preferred the far disparities (far neurons) showed a far bias. For another animal, both near and far neurons showed a near bias to overlapping stimuli, although near neurons showed a stronger near bias. The disparity bias to overlapping stimuli was delayed relative to the response onset and was more delayed when the angular separation between two motion directions was smaller. Interestingly, for all three animals, both near and far neurons showed an initial near bias in comparison to the average of the responses to individual surfaces. 
We also found that the effect of attention directed to the disparity of one of two surfaces was object-based rather than feature-based. Although attention can modulate neuronal response to better represent the attended surface, the disparity bias cannot be explained by attention modulation. Our results can be explained by a unified model with a variable pooling size to weigh the response to individual stimulus components and divisive normalization. Our results revealed the encoding rule for multiple horizontal disparities and motion directions of overlapping stimuli. The disparity bias would allow subgroups of neurons to better represent different surfaces of multiple stimuli and therefore provide a population code that aids segmentation. The tendency for MT neurons to better represent the near surface of overlapping stimuli in one animal and during the early response period in all three animals suggests that the neural representation of multiple stimuli at different depths may be beneficial to figure-ground segregation since figural objects are more likely to be in front of the ground in natural scenes.