According to the traditional inferential theory of perception, percepts of object motion or stationarity stem from an evaluation of afferent retinal signals (which encode image motion) with the help of extraretinal signals (which encode eye movements). According to direct perception theory, on the other hand, the percepts derive from retinally conveyed information only. Neither view is compatible with a perceptual phenomenon that occurs during visually induced sensations of ego motion (vection). A modified version of inferential theory yields a model in which the concept of extraretinal signals is replaced by that of reference signals, which do not encode how the eyes move in their orbits but how they move in space. Hence reference signals are produced not only during eye movements but also during ego motion (i.e., in response to vestibular stimulation and to retinal image flow, which may induce vection). The present theory describes the interface between self-motion and object-motion percepts. An experimental paradigm that allows quantitative measurement of the magnitude and gain of reference signals and the size of the just noticeable difference (JND) between retinal and reference signals reveals that the distinction between direct and inferential theories largely depends on: (1) a mistaken belief that perceptual veridicality is evidence that extraretinal information is not involved, and (2) a failure to distinguish between (the perception of) absolute object motion in space and relative motion of objects with respect to each other. The model corrects these errors, and provides a new, unified framework for interpreting many phenomena in the field of motion perception.

Keywords: direct perception; efference copy; inference; motion perception; self-motion; velocity perception; visual-vestibular interactions
Inferential versus direct perception

How do we maintain the visual percept of a stable world while images of our environment move across the retinae during eye movements? Answers to this question can be classified into two main theoretical approaches. According to the traditional view, here called inferential theory, we perceive the motion or stationarity of an object, or of the visual world itself, on the basis of the outcome of a comparison between two neural signals (see, e.g., Helmholtz 1910; Jeannerod et al. 1979; MacKay 1972; Mittelstaedt 1990; Sperry 1950; Von Holst & Mittelstaedt 1950). One signal, here to be called the retinal signal, consists of retinal afferents encoding the characteristics of the movement of the object's image across the retina. The other signal, encoding concurrent eye movement characteristics, is usually termed the extraretinal signal, because it does not derive from visual afferents (Matin et al. 1969; Mack 1986; see also Matin 1982). The comparison mechanism treats the two signals as vectors (see, e.g., Mateeff et al. 1991; Wallach et al. 1985) and applies a simple rule: when they differ, object motion is perceived; when they are equal, object stationarity is perceived. Wert...