Humans are remarkably efficient at categorizing natural scenes. In fact, scene categories can be decoded from functional MRI (fMRI) data throughout the ventral visual cortex, including the primary visual cortex, the parahippocampal place area (PPA), and the retrosplenial cortex (RSC). Here we ask whether, and where, we can still decode scene category if we reduce the scenes to mere lines. We collected fMRI data while participants viewed photographs and line drawings of beaches, city streets, forests, highways, mountains, and offices. Despite the marked difference in scene statistics, we were able to decode scene category from fMRI data for line drawings just as well as from activity for color photographs, in primary visual cortex through PPA and RSC. Even more remarkably, in PPA and RSC, error patterns for decoding from line drawings were very similar to those from color photographs. These data suggest that, in these regions, the information used to distinguish scene category is similar for line drawings and photographs. To determine the relative contributions of local and global structure to the human ability to categorize scenes, we selectively removed long or short contours from the line drawings. In a category-matching task, participants performed significantly worse when long contours were removed than when short contours were removed. We conclude that global scene structure, which is preserved in line drawings, plays an integral part in representing scene categories.
Within the range of images that we might categorize as a “beach”, for example, some will be more representative of that category than others. Here we first confirmed that humans could categorize “good” exemplars better than “bad” exemplars of six scene categories and then explored whether brain regions previously implicated in natural scene categorization showed a similar sensitivity to how well an image exemplifies a category. In a behavioral experiment participants were more accurate and faster at categorizing good than bad exemplars of natural scenes. In an fMRI experiment participants passively viewed blocks of good or bad exemplars from the same six categories. A multi-voxel pattern classifier trained to discriminate among category blocks showed higher decoding accuracy for good than bad exemplars in the PPA, RSC and V1. This difference in decoding accuracy cannot be explained by differences in overall BOLD signal, as average BOLD activity was either equivalent or higher for bad than good scenes in these areas. These results provide further evidence that V1, RSC and the PPA not only contain information relevant for natural scene categorization, but their activity patterns mirror the fundamentally graded nature of human categories. Analysis of the image statistics of our good and bad exemplars shows that variability in low-level features and image structure is higher among bad than good exemplars. A simulation of our neuroimaging experiment suggests that such a difference in variance could account for the observed differences in decoding accuracy. These results are consistent with both low-level models of scene categorization and models that build categories around a prototype.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.