Human object recognition is remarkably efficient. In recent years, significant advances have been made in our understanding of how the brain represents visual objects and organizes them into categories. Recent studies using pattern analysis methods have characterized a representational space of objects in human and primate inferior temporal cortex in which object exemplars are discriminable and cluster according to category (e.g., faces and bodies). In the present study, we examined how category structure in object representations emerges within the first 1000 ms of visual processing. Participants viewed 24 object exemplars with a planned categorical structure comprising four levels, ranging from highly specific (individual exemplars) to highly abstract (animate vs. inanimate), while their brain activity was recorded with magnetoencephalography (MEG). We used a sliding time window decoding approach to decode, on a moment-to-moment basis, which exemplar and which of the exemplar's categories participants were viewing. We found that exemplar and category membership could be decoded from the neuromagnetic recordings shortly after stimulus onset (<100 ms), with peak decodability following thereafter. Latencies for peak decodability varied systematically with the level of category abstraction, with more abstract categories emerging later, indicating that the brain hierarchically constructs category representations. In addition, we examined the stationarity of the patterns of brain activity that encode object category information and show that these patterns vary over time, suggesting the brain might use flexible, time-varying codes to represent visual object categories.
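As an illustration of the sliding time window decoding described in this abstract, below is a minimal sketch assuming MEG epochs stored as a NumPy array of shape (trials, sensors, time points) and integer category labels. The variable names, classifier choice, and window parameters are illustrative assumptions, not the authors' published pipeline.

```python
# Sliding time window decoding sketch (assumed data layout, not the paper's exact code)
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def sliding_window_decoding(epochs, labels, window=5, step=1):
    """Return cross-validated decoding accuracy for each time window."""
    n_trials, n_sensors, n_times = epochs.shape
    accuracies = []
    for start in range(0, n_times - window + 1, step):
        # Flatten sensors x time points within the window into one feature vector per trial
        X = epochs[:, :, start:start + window].reshape(n_trials, -1)
        clf = make_pipeline(StandardScaler(), LinearSVC())
        scores = cross_val_score(clf, X, labels, cv=5)
        accuracies.append(scores.mean())
    return np.array(accuracies)

# Example usage with simulated data (binary animacy labels as a stand-in)
rng = np.random.default_rng(0)
epochs = rng.standard_normal((240, 160, 100))   # 240 trials, 160 sensors, 100 samples
labels = rng.integers(0, 2, size=240)           # e.g., animate vs. inanimate
accuracy_over_time = sliding_window_decoding(epochs, labels)
```

Plotting the resulting accuracy curve against time would reveal the onset and peak of decodability referred to in the abstract.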
Recognizing an object takes just a fraction of a second, less than the blink of an eye. Applying multivariate pattern analysis, or “brain decoding”, methods to magnetoencephalography (MEG) data has allowed researchers to characterize, with high temporal resolution, the emerging representations of object categories that underlie our capacity for rapid recognition. Shortly after stimulus onset, object exemplars cluster by category in a high-dimensional activation space in the brain. In this emerging activation space, the decodability of exemplar category varies over time, reflecting the brain’s transformation of visual inputs into coherent category representations. How do these emerging representations relate to categorization behavior? Recently it has been proposed that the distance of an exemplar representation from a categorical boundary in an activation space is critical for perceptual decision-making, and that reaction times should therefore correlate with distance from the boundary. The predictions of this distance hypothesis have been borne out in human inferior temporal cortex (IT), an area of the brain crucial for the representation of object categories. When viewed in the context of a time-varying neural signal, the optimal time to “read out” category information is when category representations in the brain are most decodable. Here, we show that the distance from a decision boundary through activation space, as measured using MEG decoding methods, correlates with reaction times for visual categorization during the period of peak decodability. Our results suggest that the brain begins to read out information about exemplar category at the optimal time for use in choice behavior, and support the hypothesis that the structure of the representation for objects in the visual system is partially constitutive of the decision process in recognition.
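The distance-to-boundary logic in this abstract can be sketched as follows, assuming single-trial MEG patterns taken at the time of peak decodability, binary category labels, and per-trial reaction times. All names and the use of a linear SVM boundary are assumptions made for illustration.

```python
# Distance-to-bound sketch: signed distance from a linear category boundary vs. reaction time
import numpy as np
from scipy.stats import spearmanr
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X_peak = rng.standard_normal((240, 160))   # trials x sensor features at peak decodability
labels = rng.integers(0, 2, size=240)      # category of each trial
rts = rng.uniform(0.3, 0.8, size=240)      # reaction times in seconds

# Fit a linear category boundary and take each trial's distance to it
clf = make_pipeline(StandardScaler(), LinearSVC())
clf.fit(X_peak, labels)
distance = np.abs(clf.decision_function(X_peak))

# The distance hypothesis predicts a negative correlation: trials farther from the
# boundary should yield faster categorization responses.
rho, p = spearmanr(distance, rts)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```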
Purpose: A novel phantom for image quality testing of functional magnetic resonance imaging (fMRI) scans is described. Methods: The cylindrical, rotatable, ~4.5 L phantom, with eight wedge-shaped compartments, is used to simulate rest and activated states. The compartments contain NiCl2-doped agar gel with alternating agar concentrations (1.4%, 1.6%) to produce T1 and T2 values approximating brain grey matter. The Jaccard index was used to compare image distortions for echo planar imaging (EPI) and gradient recalled echo (GRE) scans. Contrast-to-noise ratio (CNR) was compared across the imaging volume for GRE and EPI. Results: The mean T2 values for the two agar concentrations were 106.5±4.8 ms and 94.5±4.7 ms, and the mean T1 values were 1500±40 ms and 1485±30 ms, respectively. The Jaccard index for GRE was generally higher than for EPI (0.95 versus 0.8). The CNR varied from 20 to 50 across the slices and echo times used for the EPI scans, and from 20 to 40 across the slices for the GRE scans. The phantom provided a reproducible CNR over 25 days. Conclusions: The phantom provides a quantifiable signal change over a head-size imaging volume with EPI and GRE sequences, which was used for image quality assessment.
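The two image-quality metrics named in this abstract, the Jaccard index and the contrast-to-noise ratio, can be computed as in the minimal sketch below. The array names and the simple thresholding step used to obtain phantom masks are illustrative assumptions, not the published analysis pipeline.

```python
# Jaccard index and CNR sketch for phantom image quality assessment (assumed inputs)
import numpy as np

def jaccard_index(mask_a, mask_b):
    """Intersection over union of two boolean masks (e.g., phantom outlines in GRE vs. EPI)."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return intersection / union

def cnr(region_a, region_b, noise_std):
    """Contrast-to-noise ratio between two compartments given the background noise SD."""
    return abs(region_a.mean() - region_b.mean()) / noise_std

# Example with synthetic slices: threshold each slice to obtain the phantom mask
rng = np.random.default_rng(2)
slice_gre = rng.normal(100, 5, size=(64, 64))
slice_epi = rng.normal(100, 5, size=(64, 64))
j = jaccard_index(slice_gre > 90, slice_epi > 90)     # distortion agreement between sequences
c = cnr(slice_gre[:32], slice_gre[32:], noise_std=5.0)  # contrast between two compartments
```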
In a naturalistic environment, auditory cues are often accompanied by information from other senses, which can be redundant with or complementary to the auditory information. Although multisensory interactions arising from this combination of information shape auditory function across all sensory modalities, our greatest body of knowledge to date centers on how vision influences audition. In this review, we attempt to capture the current state of our understanding of this topic. Following a general introduction, the review is divided into five sections. In the first section, we review the psychophysical evidence in humans regarding vision’s influence on audition, making the distinction between vision’s ability to enhance versus alter auditory performance and perception. Three examples are then described that highlight vision’s ability to modulate auditory processes: spatial ventriloquism, cross-modal dynamic capture, and the McGurk effect. The final part of this section discusses models that have been built on the available psychophysical data and that seek to provide greater mechanistic insight into how vision can impact audition. The second section reviews the extant neuroimaging and far-field imaging work on this topic, with a strong emphasis on the roles of feedforward and feedback processes, on imaging insights into the causal nature of audiovisual interactions, and on the limitations of current imaging-based approaches. These limitations point to a greater need for machine-learning-based decoding approaches to understand how auditory representations are shaped by vision. The third section reviews the wealth of neuroanatomical and neurophysiological data from animal models that highlights audiovisual interactions at the neuronal and circuit level in both subcortical and cortical structures. It also speaks to the functional significance of audiovisual interactions for two critically important facets of auditory perception: scene analysis and communication. The fourth section presents current evidence for alterations in audiovisual processes in three clinical conditions: autism, schizophrenia, and sensorineural hearing loss. These changes in audiovisual interactions are postulated to have cascading effects on higher-order domains of dysfunction in these conditions. The final section highlights ongoing work seeking to leverage our knowledge of audiovisual interactions to develop better remediation approaches to these sensory-based disorders, founded in concepts of perceptual plasticity in which vision has been shown to have the capacity to facilitate auditory learning.
Objects are the fundamental building blocks of how we create a representation of the external world. One major distinction among objects is between those that are animate and those that are inanimate. In addition, many objects are specified by more than a single sense, yet the manner in which multisensory objects are represented by the brain remains poorly understood. Using representational similarity analysis of male and female human EEG signals, we show enhanced encoding of audiovisual objects when compared with their corresponding visual and auditory objects. Surprisingly, we discovered that the often-found processing advantages for animate objects were not evident under multisensory conditions. This was due to a greater neural enhancement of inanimate objects, which are more weakly encoded under unisensory conditions. Further analysis showed that the selective enhancement of inanimate audiovisual objects corresponded with an increase in shared representations across brain areas, suggesting that the enhancement was mediated by multisensory integration. Moreover, a distance-to-bound analysis provided critical links between neural findings and behavior. Improvements in neural decoding at the individual exemplar level for audiovisual inanimate objects predicted reaction time differences between multisensory and unisensory presentations during a Go/No-Go animate categorization task. Links between neural activity and behavioral measures were most evident at intervals of 100-200 ms and 350-500 ms after stimulus presentation, corresponding to time periods associated with sensory evidence accumulation and decision-making, respectively. Collectively, these findings provide key insights into a fundamental process the brain uses to maximize the information it captures across sensory systems to perform object recognition.
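The representational similarity analysis mentioned in this abstract can be sketched for a single time window as follows, assuming trial-averaged EEG patterns per exemplar for each condition. The exemplar count, channel count, and the correlation-distance metric are illustrative assumptions.

```python
# Representational similarity analysis sketch: compare representational geometry across conditions
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

def rdm(exemplar_patterns):
    """Representational dissimilarity matrix: pairwise correlation distance between exemplars."""
    return squareform(pdist(exemplar_patterns, metric="correlation"))

rng = np.random.default_rng(3)
rdm_audiovisual = rdm(rng.standard_normal((24, 64)))  # 24 exemplars x 64 channels, AV condition
rdm_visual = rdm(rng.standard_normal((24, 64)))       # same exemplars, visual-only condition

# Correlate the upper triangles of the two RDMs to quantify how similar the
# representational geometries are across conditions.
iu = np.triu_indices(24, k=1)
rho, p = spearmanr(rdm_audiovisual[iu], rdm_visual[iu])
```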