Visual categorization is thought to occur in the human ventral temporal cortex (VTC), but how this categorization is achieved is still largely unknown. In this Review, we consider the computations and representations that are necessary for categorization and examine how the microanatomical and macroanatomical layout of the VTC might optimize them to achieve rapid and flexible visual categorization. We propose that efficient categorization is achieved by organizing representations in a nested spatial hierarchy in the VTC. This spatial hierarchy serves as a neural infrastructure for the representational hierarchy of visual information in the VTC and thereby enables flexible access to category information at several levels of abstraction.
Human ventral temporal cortex (VTC) plays a pivotal role in high-level vision. An under-studied macroanatomical feature of VTC is the mid-fusiform sulcus (MFS), a shallow longitudinal sulcus separating the lateral and medial fusiform gyrus (FG). Here, we quantified the morphological features of the MFS in 69 subjects (ages 7–40), and investigated its relationship to both cytoarchitectonic and functional divisions of VTC with four main findings. First, despite being a minor sulcus, we found that the MFS is a stable macroanatomical structure present in all 138 hemispheres with morphological characteristics developed by age 7. Second, the MFS is the locus of a lateral-medial cytoarchitectonic transition within the posterior FG serving as the boundary between cytoarchitectonic regions FG1 and FG2. Third, the MFS predicts a lateral-medial functional transition in eccentricity bias representations in children, adolescents, and adults. Fourth, the anterior tip of the MFS predicts the location of a face-selective region, mFus-faces/FFA-2. These findings are the first to illustrate that a macroanatomical landmark identifies both cytoarchitectonic and functional divisions of high-level sensory cortex in humans and have important implications for understanding functional and structural organization in the human brain.
Functional magnetic resonance imaging (fMRI) has identified face- and body part-selective regions, as well as distributed activation patterns for object categories across human ventral temporal cortex (VTC), eliciting a debate regarding functional organization in VTC and neural coding of object categories. Using high-resolution fMRI, we illustrate that face- and limb-selective activations alternate in a series of largely nonoverlapping clusters in lateral VTC along the inferior occipital gyrus (IOG), fusiform gyrus (FG), and occipitotemporal sulcus (OTS). Both general linear model (GLM) and multivoxel pattern (MVP) analyses show that face- and limb-selective activations minimally overlap and that this organization is consistent across experiments and days. We provide a reliable method to separate two face-selective clusters on the middle and posterior FG (mFus and pFus), and another on the IOG using their spatial relation to limb-selective activations and retinotopic areas hV4, VO-1/2, and hMT+. Furthermore, these activations show a gradient of increasing face selectivity and decreasing limb selectivity from the IOG to the mFus. Finally, MVP analyses indicate that there is differential information for faces in lateral VTC (containing weakly- and highly-selective voxels) relative to non-selective voxels in medial VTC. These findings suggest a sparsely-distributed organization where sparseness refers to the presence of several face- and limb-selective clusters in VTC, and distributed refers to the presence of different amounts of information in highly-, weakly-, and non-selective voxels. Consequently, theories of object recognition should consider the functional and spatial constraints of neural coding across a series of nonoverlapping category-selective clusters that are themselves distributed.
Ventral temporal cortex (VTC) is the latest stage of the ventral ‘what’ visual pathway, which is thought to code the identity of a stimulus regardless of its position or size [1, 2]. Surprisingly, recent studies show that position information can be decoded from VTC [3–5]. However, the computational mechanisms by which spatial information is encoded in VTC are unknown. Furthermore, how attention influences spatial representations in human VTC is also unknown because the effect of attention on spatial representations has only been examined in the dorsal ‘where’ visual pathway [6–10]. Here we fill these significant gaps in knowledge using an approach that combines functional magnetic resonance imaging and sophisticated computational methods. We first develop a population receptive field (pRF) model [11, 12] of spatial responses in human VTC. Consisting of spatial summation followed by a compressive nonlinearity, this model accurately predicts responses of individual voxels to stimuli at any position and size, explains how spatial information is encoded, and reveals a functional hierarchy in VTC. We then manipulate attention and use our model to decipher the effects of attention. We find that attention to the stimulus systematically and selectively modulates responses in VTC, but not early visual areas. Locally, attention increases eccentricity, size, and gain of individual pRFs, thereby increasing position tolerance. However, globally, these effects reduce uncertainty regarding stimulus location and actually increase position sensitivity of distributed responses across VTC. These results demonstrate that attention actively shapes and enhances spatial representations in the ventral visual pathway.
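The pRF model described in the abstract above (spatial summation followed by a compressive nonlinearity) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an isotropic 2D Gaussian as the summation field, a power-law exponent `n < 1` as the compressive nonlinearity, and a non-negative stimulus image spanning a square patch of visual field. The function name `css_prf_response` and all parameter names are hypothetical.

```python
import numpy as np

def css_prf_response(stimulus, x0, y0, sigma, n=0.5, gain=1.0, extent=10.0):
    """Predicted response of one voxel to one stimulus frame.

    Assumptions (not taken from the abstract): isotropic Gaussian pRF,
    power-law compression, non-negative stimulus covering
    [-extent, extent] degrees horizontally and vertically.
    """
    rows, cols = stimulus.shape
    xs = np.linspace(-extent, extent, cols)
    ys = np.linspace(-extent, extent, rows)
    xx, yy = np.meshgrid(xs, ys)  # both arrays have shape (rows, cols)

    # Gaussian weighting centered at (x0, y0), normalized to sum to 1
    gauss = np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2.0 * sigma ** 2))
    gauss /= gauss.sum()

    # Step 1: spatial summation of the stimulus under the Gaussian
    summed = np.sum(stimulus * gauss)

    # Step 2: compressive nonlinearity (n < 1 gives subadditive summation)
    return gain * summed ** n

# Example stimuli: full field, left half of the field, and blank
full_field = np.ones((51, 51))
half_field = np.zeros((51, 51))
half_field[:, :26] = 1.0
blank = np.zeros((51, 51))

r_full = css_prf_response(full_field, x0=0.0, y0=0.0, sigma=2.0)
r_half = css_prf_response(half_field, x0=0.0, y0=0.0, sigma=2.0)
r_blank = css_prf_response(blank, x0=0.0, y0=0.0, sigma=2.0)
print(r_full, r_half, r_blank)
```

With `n < 1`, a stimulus covering about half of the pRF evokes more than half of the full-field response. Under this sketch, that subadditivity is one way compression could yield tolerance to changes in stimulus position and size.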
Face-selective neural responses in the human fusiform gyrus have been widely examined. However, their causal role in human face perception is largely unknown. Here, we used a multimodal approach of electrocorticography (ECoG), high-resolution functional magnetic resonance imaging (fMRI), and electrical brain stimulation (EBS) to directly investigate the causal role of face-selective neural responses of the fusiform gyrus (FG) in face perception in a patient implanted with subdural electrodes in the right inferior temporal lobe. High-resolution fMRI identified two distinct FG face-selective regions (mFus-faces and pFus-faces). ECoG revealed a striking anatomical and functional correspondence with fMRI data where a pair of face-selective electrodes, positioned one centimeter apart, overlapped mFus-faces and pFus-faces, respectively. Moreover, electrical charge delivered to this pair of electrodes induced a profound face-specific perceptual distortion during viewing of real faces. Specifically, the subject reported a “metamorphosed” appearance of faces of people in the room. Several controls illustrate the specificity of the effect to the perception of faces. EBS of mFus-faces and pFus-faces neither produced a significant deficit in naming pictures of famous faces on the computer, nor did it affect the appearance of nonface objects. Further, the appearance of faces remained unaffected during both sham stimulation and stimulation of a pair of nearby electrodes that were not face-selective. Overall, our findings reveal a striking convergence of fMRI, ECoG, and EBS, which together offer a rare causal link between functional subsets of the human FG network and face perception.