How do we find objects in scenes? For decades, visual search models have been built on experiments in which observers search for targets presented among distractor items, isolated and randomly arranged on blank backgrounds. Are these models relevant to search in continuous scenes? This paper argues that the mechanisms that govern artificial laboratory search tasks do play a role in visual search in scenes. However, scene-based information is used to guide search in ways that had no place in earlier models. Search in scenes may be best explained by a dual-path model: a "selective" path in which candidate objects must be individually selected for recognition, and a "non-selective" path in which information is extracted from global, statistical properties of the scene.

Searching and experiencing a scene

It is an interesting aspect of visual experience that we can look for an object that is, literally, right in front of our eyes, yet not find it for an appreciable period of time. Clearly, we are seeing something at the object's location before we find it. What is that something, and how do we go about finding the desired object? These questions have occupied visual search researchers for decades. While visual search papers have conventionally described search as an important real-world task, the bulk of research has had observers looking for targets among some number of distractor items, all presented in random configurations on otherwise blank backgrounds. In the last decade, there has been a surge of work using more naturalistic scenes as stimuli, raising the question of how search relates to the structure of the scene. This paper briefly summarizes some of the models and solutions developed with artificial stimuli and then describes what happens when these ideas confront search in real-world scenes.
We will argue that the process of object recognition, required for most search tasks, involves the selection of individual candidate objects, because all objects cannot be recognized at once. At the same time, the experience of a continuous visual field tells us that some aspects of a scene reach awareness without being limited by the selection bottleneck in object recognition. Work in the past decade has revealed how this non-selective processing is put to use when we search in real scenes. As will be briefly reviewed below, this account holds that search is necessary because object recognition processes are limited to one or, perhaps, a very few objects at one time. The selection of candidate objects for subseq...
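The dual-path idea can be caricatured in code. The sketch below is purely illustrative and not a specification from the paper: the function names, the per-item timing parameter, and the use of category counts as a stand-in for scene "gist" are all our own assumptions. It contrasts a serial selective pathway, whose expected cost grows with the number of items to be selected, with a non-selective pathway that returns global statistics of the scene in a single pass.

```python
import random

def selective_search(scene, target, time_per_item=0.05):
    """Toy serial, self-terminating search: items are selected one at a
    time for recognition, so expected search time grows with set size.
    Returns (elapsed_time, target_found). Parameters are illustrative."""
    order = random.sample(range(len(scene)), len(scene))  # random scan order
    elapsed = 0.0
    for idx in order:
        elapsed += time_per_item          # cost of selecting/recognizing one item
        if scene[idx] == target:
            return elapsed, True
    return elapsed, False

def nonselective_gist(scene):
    """Toy non-selective pathway: a global/statistical summary available
    without item-by-item selection. Here, just category counts."""
    summary = {}
    for item in scene:
        summary[item] = summary.get(item, 0) + 1
    return summary

# A small "scene": eight trees and one mug, in random spatial order.
scene = ["tree"] * 8 + ["mug"]
random.shuffle(scene)

rt, found = selective_search(scene, "mug")  # serial: time depends on scan order
gist = nonselective_gist(scene)             # global: one pass, no selection
```

On this caricature, `gist` is available after a single sweep regardless of set size, while `rt` for the selective pathway depends on how late the target happens to be selected; that asymmetry is the intuition behind letting scene statistics guide where the selective pathway looks first.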