Human observers are capable of tracking multiple objects among identical distractors based only on their spatiotemporal information. Since the first report of this ability in the seminal work of Pylyshyn and Storm (1988, Spatial Vision, 3, 179-197), multiple object tracking has attracted many researchers. One reason for this is that the attentional processes studied with the multiple object tracking paradigm are commonly argued to match the attentional processing required by real-world tasks such as driving or team sports. We argue that multiple object tracking provides a good means to study the broader topic of continuous and dynamic visual attention. Indeed, several (partially contradicting) theories of attentive tracking have been proposed in the almost 30 years since its first report, and a large body of research has been conducted to test these theories. Given the richness and diversity of this literature, the aim of this tutorial review is to provide researchers who are new to the field with an overview of the multiple object tracking paradigm, its basic manipulations, and its links to other paradigms investigating visual attention and working memory. Further, we review current theories of tracking as well as the empirical evidence for them. Finally, we review the state of the art in the most prominent research fields of multiple object tracking and how this research has helped to understand visual attention in dynamic settings.
Humans understand text and film by mentally representing their contents in situation models. These models describe situations along dimensions such as time, location, protagonist, and action. Changes in 1 or more dimensions (e.g., a new character enters the scene) cause discontinuities in the story line and are often perceived as boundaries between 2 meaningful units. Recent theoretical advances in event perception have led to the assumption that situation models are represented in the form of event models in working memory. These event models are updated at event boundaries. The points in time at which event models are updated are important: Compared with situations during an ongoing event, situations at event boundaries are remembered more precisely, and predictions about what happens next become less reliable. We hypothesized that these effects depend on the number of changes in the situation model. In 2 experiments, we had participants watch sitcom episodes and measured recognition memory and prediction performance for event boundaries that contained a change in 1, 2, 3, or 4 dimensions. Results showed a linear relationship: the more dimensions changed, the higher recognition performance was. At the same time, participants' predictions became less reliable with an increasing number of dimension changes. These results suggest that the updating of event models at event boundaries occurs incrementally.
Observers can visually track multiple objects that move independently even if the scene containing the moving objects is rotated in a smooth way. Abrupt scene rotations make tracking more difficult but not impossible. For nonrotated, stable dynamic displays, the strategy of looking at the targets' centroid has been shown to be important for visual tracking. But which factors determine successful visual tracking in a nonstable dynamic display? We report two eye tracking experiments that present evidence for centroid looking. Across abrupt viewpoint changes, gaze on the centroid is more stable than gaze on targets, indicating a process of realigning the targets as a group. Further, we show that the relative importance of centroid looking increases with object speed.
We examined whether surface feature information is utilized to track the locations of multiple objects. In particular, we tested whether surface features and spatiotemporal information are weighted according to their availability and reliability. Accordingly, we hypothesized that surface features should affect location tracking across spatiotemporal discontinuities. Three kinds of spatiotemporal discontinuities were implemented across five experiments: abrupt scene rotations, abrupt zooms, and a reduced presentation frame rate. Objects were briefly colored across the spatiotemporal discontinuity. Distinct coloring that matched spatiotemporal information across the discontinuity improved tracking performance compared with homogeneous coloring. Swapping distinct colors across the discontinuity impaired performance. Correspondence by color was further demonstrated in the swap condition: mis-selected distractors more often appeared in a former target color than in a former distractor color. This was true even when color never supported tracking and even when participants were instructed to ignore color. Furthermore, effects of object color on tracking occurred with unreliable spatiotemporal information but not with reliable spatiotemporal information. Our results demonstrate that surface feature information can be utilized to track the locations of multiple objects, in contrast to theories stating that objects are tracked based on spatiotemporal information only. We introduce a flexible-weighting tracking account, according to which spatiotemporal information and surface features are both utilized by the location tracking mechanism and weighted according to their availability and reliability. Surface feature effects on tracking are particularly likely when distinct surface feature information is available and spatiotemporal information is unreliable.
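The flexible-weighting idea can be illustrated with a small computational sketch. The following Python snippet is only an illustration under assumed parameters (the Gaussian spatial-match function, the weighting rule, and all numeric values are our assumptions, not the authors' model): it combines a spatiotemporal match and a color match into a single correspondence score, with each cue weighted by its reliability or availability.

    # Illustrative sketch only: a toy reliability-weighted cue combination in the spirit
    # of the flexible-weighting account. Weighting rule and parameters are assumptions.
    import math

    def cue_weights(spatiotemporal_reliability, color_available):
        # Weight each cue by its reliability/availability; weights sum to 1.
        w_st = spatiotemporal_reliability            # 0..1, e.g., low after an abrupt rotation
        w_color = 1.0 if color_available else 0.0    # distinct colors present or not
        total = w_st + w_color
        return (w_st / total, w_color / total) if total > 0 else (0.5, 0.5)

    def correspondence_score(predicted_pos, candidate_pos, target_color, candidate_color,
                             w_st, w_color, sigma=50.0):
        # Combine spatial proximity (Gaussian falloff) and color identity into one score.
        dist = math.dist(predicted_pos, candidate_pos)
        spatial_match = math.exp(-(dist ** 2) / (2 * sigma ** 2))
        color_match = 1.0 if target_color == candidate_color else 0.0
        return w_st * spatial_match + w_color * color_match

    # Example: after an abrupt scene rotation, spatiotemporal information is unreliable,
    # so the color cue dominates and a far red candidate outscores a near gray one.
    w_st, w_color = cue_weights(spatiotemporal_reliability=0.2, color_available=True)
    near_gray = correspondence_score((100, 100), (110, 105), "red", "gray", w_st, w_color)
    far_red = correspondence_score((100, 100), (160, 140), "red", "red", w_st, w_color)
    print(near_gray, far_red)  # far_red scores higher when the color cue is weighted heavily

With reliable spatiotemporal information (spatiotemporal_reliability close to 1), the same rule lets spatial proximity dominate, consistent with the reported pattern that color effects emerged mainly when spatiotemporal information was unreliable.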