Motion, Perception of

The visual environment of most animals consists of objects that move with respect to one another and to the observer. Detection and interpretation of these motions are not only crucial for predicting the future state of one's dynamic world -- as would be necessary to escape an approaching predator, for example -- but also provide a wealth of information about the 3-D structure of the environment. Not surprisingly, motion perception is one of the most phylogenetically well conserved of visual functions. In primates, who rely heavily on vision, motion processing has reached a peak of computational sophistication and neuronal complexity.

The neuronal processes underlying perceived motion first gained widespread attention in the nineteenth century. Our present understanding of this topic is a triumph of cognitive science, fueled by coordinated application of a variety of techniques drawn from the fields of COMPUTATIONAL NEUROSCIENCE, ELECTROPHYSIOLOGY, ELECTRIC AND MAGNETIC EVOKED FIELDS, PSYCHOPHYSICS, and neuroanatomy. Most commonly, the visual stimulus selectivities of individual neurons are assessed via the technique of SINGLE-NEURON RECORDING, and attempts are made to link these selectivities to well-defined computational steps, to behavioral measures of perceptual state, or to specific patterns of neuronal circuitry. The product of this integrative approach has been a broad perspective on the neural structures and events responsible for visual motion perception.

Motion processing serves a number of behavioral goals, from which it is possible to infer a hierarchy of computational steps. An initial step common to all aspects of motion processing is detection of the displacement of retinal image features, a process termed "motion detection." In the primate visual system, neurons involved in motion detection are first seen at the level of primary VISUAL CORTEX (area V1; Hubel and Wiesel 1968). Many V1 neurons exhibit selectivity for the direction in which an image feature moves across the retina and hence are termed "directionally selective." These V1 neurons give rise to a larger subsystem for motion processing that involves several interconnected regions of the dorsal (or "parietal") VISUAL PROCESSING STREAMS (Felleman and Van Essen 1991). Most notable among these cortical regions is the middle temporal visual area, commonly known as area MT (or V5) -- a small visuotopically organized area with a striking abundance of directionally selective neurons (Albright 1993).

Several detailed models have been proposed to account for neuronal motion detection (Borst and Egelhaaf 1993). The earliest was developed over forty years ago to explain motion sensitivity in flying insects. According to this model and its many derivatives, motion is computed through spatiotemporal correlation. This COMPUTATION is thought to be achieved neuronally via convergence of temporally staggered outputs from receptors with luminance sensitivity profiles that are spatially displaced. The results of electrophysiological experiments indicate that a mechanism of this type can account for directional selectivity seen in area V1 (Ganz and Felder 1984).
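The correlation scheme described above can be illustrated with a minimal sketch. It assumes two receptors at neighboring positions, a pure sample delay standing in for the temporal staggering, and an opponent subtraction of two mirror-image half-detectors; the function names, the delay value, and the sinusoidal test stimulus are illustrative choices, not details taken from the cited models.

```python
import math


def correlator_response(left, right, delay):
    """Mean output of an opponent correlation-type motion detector.

    left, right: luminance time series from two spatially displaced receptors.
    delay: temporal offset (in samples) applied to one input before it is
           multiplied with the undelayed signal from the other receptor.
    A positive result signals motion from the left receptor toward the right.
    """
    n = len(left)
    total = 0.0
    for t in range(delay, n):
        # Half-detector preferring left-to-right motion:
        # delayed left signal times current right signal.
        rightward = left[t - delay] * right[t]
        # Mirror-image half-detector preferring right-to-left motion.
        leftward = right[t - delay] * left[t]
        total += rightward - leftward
    return total / (n - delay)


def receptor_signal(n_samples, temporal_freq, phase):
    """Luminance seen by one receptor as a sinusoidal pattern drifts past."""
    return [math.sin(2 * math.pi * temporal_freq * t + phase)
            for t in range(n_samples)]


# Assumed stimulus: the spatial displacement of the receptors appears as a
# phase difference. A phase lag at the right receptor means the pattern
# reaches it later, i.e. the pattern moves from left to right.
spacing_phase = math.pi / 4
left = receptor_signal(400, 0.05, 0.0)
right_when_rightward = receptor_signal(400, 0.05, -spacing_phase)
right_when_leftward = receptor_signal(400, 0.05, +spacing_phase)

rightward_out = correlator_response(left, right_when_rightward, delay=3)
leftward_out = correlator_response(left, right_when_leftward, delay=3)
```

Under these assumptions the output is positive for one direction of drift and negative for the other, which is the sense in which such a circuit is directionally selective. Published versions of the model typically use a temporal filter rather than a pure delay.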

While motion detection is thus implemented at the earliest stage of cortical visual processing in primates, a number of studies (Shadlen and Newsome 1996) have demonstrated a close link between the discriminative capacity of motion-sensitive neurons at subsequent stages -- particularly area MT -- and perceptual sensitivity to direction of motion. Using a stimulus in which the "strength" of a motion signal can be varied continuously, Newsome and colleagues have shown that the ability of individual MT neurons to discriminate different directions of motion is, on average, comparable to that of the nonhuman primate observer in whose CEREBRAL CORTEX the neurons reside. In a related experiment, these investigators found that they could predictably bias the observer's perceptual report of motion direction by electrically stimulating a cortical column of MT neurons that represent a known direction. Finally, direction discrimination performance was severely impaired by ablation of area MT. In concert, the results of these experiments indicate that MT neurons provide representations of image motion upon which perceptual decisions can be made.
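One way to make the notion of a neuron's "discriminative capacity" concrete is sketched below. It assumes that discriminability is summarized as the probability that a spike count drawn from trials with motion in the neuron's preferred direction exceeds a count drawn from trials with the opposite direction (the area under an ROC curve). The function name and the spike counts are hypothetical and serve only as an illustration; they are not data from the studies cited.

```python
def roc_area(preferred_counts, null_counts):
    """Probability that a randomly chosen preferred-direction spike count
    exceeds a randomly chosen null-direction count (ties count as half).

    0.5 means the two response distributions are indistinguishable;
    1.0 means they can be discriminated perfectly.
    """
    wins = 0.0
    for p in preferred_counts:
        for q in null_counts:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(preferred_counts) * len(null_counts))


# Hypothetical spike counts per trial, invented for illustration only.
preferred = [12, 15, 9, 14, 11]
null = [8, 10, 7, 11, 6]
area = roc_area(preferred, null)
```

Repeating such a calculation at each motion strength would give a neuronal sensitivity curve that can be set beside the observer's behavioral performance on the same trials.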

Once retinal image motion is detected and discriminated, the resultant signals are used for a variety of purposes. These include (1) establishing the 3-D structure of a visual scene, (2) guiding balance and postural control, (3) estimating the observer's own path of locomotion and time to collision with environmental objects, (4) parsing retinal image features into objects, and -- perhaps most obviously -- (5) identifying the trajectories of moving objects and predicting their future positions in order to elicit an appropriate behavioral response (e.g., ducking). Computational steps and corresponding neural substrates have been identified for many of these perceptual and motor functions.

Establishing 3-D scene structure from motion and estimating the path of locomotion, for example, both involve detection of complex velocity gradients in the image (e.g., rotation, expansion, tilt; see STRUCTURE FROM VISUAL INFORMATION SOURCES). Psychophysical studies demonstrate that primates possess fine sensitivity to such gradients (Van Doorn and Koenderink 1983) and electrophysiological evidence indicates that neurons selective for specific velocity gradients exist in the medial superior temporal (MST) area, and other higher areas of the parietal stream (Duffy and Wurtz 1991).
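Two of the velocity gradients mentioned above, expansion and rotation, can be read off a 2-D image velocity field as its divergence and curl. The sketch below estimates both by central differences. The flow fields and function names are hypothetical, and the code illustrates only the quantities involved; it is not a model of how MST neurons compute them.

```python
def flow_gradients(flow, x, y, h=1e-3):
    """Estimate divergence and curl of a 2-D velocity field at (x, y).

    flow: function (x, y) -> (vx, vy) giving image velocity at a point.
    h: step size for the central-difference approximation.
    """
    vx_xp, vy_xp = flow(x + h, y)
    vx_xm, vy_xm = flow(x - h, y)
    vx_yp, vy_yp = flow(x, y + h)
    vx_ym, vy_ym = flow(x, y - h)

    dvx_dx = (vx_xp - vx_xm) / (2 * h)
    dvy_dx = (vy_xp - vy_xm) / (2 * h)
    dvx_dy = (vx_yp - vx_ym) / (2 * h)
    dvy_dy = (vy_yp - vy_ym) / (2 * h)

    divergence = dvx_dx + dvy_dy  # positive for an expanding pattern
    curl = dvy_dx - dvx_dy        # positive for counterclockwise rotation
    return divergence, curl


def expansion(x, y, rate=0.5):
    """Radially expanding flow, as produced by forward self-motion."""
    return (rate * x, rate * y)


def rotation(x, y, omega=0.5):
    """Counterclockwise rotation about the origin."""
    return (-omega * y, omega * x)


exp_div, exp_curl = flow_gradients(expansion, 0.3, -0.2)
rot_div, rot_curl = flow_gradients(rotation, 0.3, -0.2)
```

For these idealized fields, the expanding pattern has nonzero divergence and zero curl, and the rotating pattern the reverse, at every image location.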

Establishing the trajectory of a moving object -- another essential motion-processing function -- is also an area of considerable interest. This task is fundamentally one of transforming signals representing retinal image motions, such as those carried by V1 neurons, into signals representing visual scene motions. Computationally, this transformation is complex (indeed, the solution is formally underconstrained), owing, in part, to spurious retinal image motions that are generated by the incidental overlap of moving objects. Contextual cues for visual scene segmentation play an essential role in achieving this transformation. This process has been explored extensively using visual stimuli that simulate retinal images rendered by one object moving past another (Stoner and Albright 1993). A variety of real-world contextual cues, including brightness differences (indicative of shading, transparency, or differential surface reflectance) and binocular positional disparity ("stereoscopic" cues), have been used in psychophysical studies to manipulate perceptual interpretation of the spatial relationships between the objects in the scene. This interpretation has, in turn, a profound influence upon the motion that is perceived. Electrophysiological experiments have been conducted using stimuli containing similar contextual cues for scene segmentation. Neuronal activity in area MT is altered by context, such that the direction of motion represented neuronally matches the direction of object motion perceived (Stoner and Albright 1992). These results suggest that the transformation from a representation of retinal image motion to one of scene motion occurs in, or prior to, area MT and is modulated by signals encoding the spatial relationships between moving objects.

The final utility of visual motion processing is, of course, MOTOR CONTROL -- for example, reaching a hand to catch a ball, adjusting posture to maintain balance during figure skating, or using smooth eye movements to follow a moving target. The OCULOMOTOR CONTROL system is particularly well understood and has served as a model for investigation of the link between vision and action. The motion-processing areas of the parietal cortical stream (e.g., areas MT and MST) have anatomical projections to brain regions known to be involved in control of smooth pursuit EYE MOVEMENTS (e.g., dorsolateral pons). Electrophysiological data linking the activity of MT and MST neurons to smooth pursuit are plentiful. For one, the temporal characteristics of neuronal responses in area MT are correlated with the dynamics of pursuit initiation, suggesting a causal role. MST neurons respond well during visual pursuit; many even do so through the momentary absence of a pursuit target. The latter finding suggests that such neurons receive a "copy" of the efferent motor command, which may be used to interpret retinal motion signals during eye movements, as well as to perpetuate pursuit when the target briefly passes behind an occluding surface. Finally, neuropsychological studies have shown that smooth pursuit is severely impaired following damage to areas MT and MST. In concert, these studies demonstrate that cortical motion-processing areas -- particularly MT and MST -- forward precise measurements of object direction and speed to the oculomotor system to be used for pursuit generation. Similar visual-motor links are likely to be responsible for head, limb, and body movements.

As is evident from the foregoing discussion, basic knowledge of the neural substrates of motion perception has come largely from investigation of nonhuman primates and other mammalian species. The general mechanistic and organizational principles gleaned from this work are believed to hold for the human visual system as well. Neuropsychological studies, in conjunction with recent advances in functional brain imaging tools such as MAGNETIC RESONANCE IMAGING (MRI) and POSITRON EMISSION TOMOGRAPHY (PET), have lent initial support to this hypothesis. In particular, clinical cases of selective impairment of visual motion perception following discrete cortical lesions have been hailed as evidence for a human homologue of areas MT and MST (Zihl, von Cramon, and Mai 1983). Neuronal activity-related signals (PET and functional MRI) recorded from human subjects viewing moving stimuli have identified a motion-sensitive cortical zone in approximately the same location as that implicated by the effects of lesions (Tootell et al. 1995).

These observations from the human visual system, in combination with fine-scale electrophysiological, anatomical, and behavioral studies in nonhuman species, paint an increasingly rich portrait of cortical motion-processing substrates. Indeed, motion processing is now arguably the best-understood sensory subsystem in the primate brain. As briefly reviewed here, one can readily identify the computational goals of the system, link them to specific loci in a distributed and hierarchically organized neural system, and document their functional significance in a real-world sensory-behavioral context. The technical and conceptual roots of this success provide a valuable model for the investigation of other sensory, perceptual, and cognitive systems.

-- Thomas D. Albright


Albright, T. D. (1993). Cortical processing of visual motion. In J. Wallman and F. A. Miles, Eds., Visual Motion and Its Use in the Stabilization of Gaze. Amsterdam: Elsevier, pp. 177-201.

Borst, A., and M. Egelhaaf. (1993). Detecting visual motion: Theory and models. In J. Wallman and F. A. Miles, Eds., Visual Motion and Its Use in the Stabilization of Gaze. Amsterdam: Elsevier, pp. 3-26.

Duffy, C. J., and R. H. Wurtz. (1991). Sensitivity of MST neurons to optic flow stimuli, I. A continuum of response selectivity to large-field stimuli. J. Neurophysiol. 65(6):1329-1345.

Felleman, D. J., and D. C. Van Essen. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex 1:1-47.

Ganz, L., and R. Felder. (1984). Mechanism of directional selectivity in simple neurons of the cat's visual cortex analyzed with stationary flash sequences. J. Neurophysiol. 51(2):294-324.

Hubel, D. H., and T. N. Wiesel. (1968). Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195:215-243.

Shadlen, M. N., and W. T. Newsome. (1996). Motion perception: Seeing and deciding. Proc. Natl. Acad. Sci. U.S.A. 93(2):628-633.

Stoner, G. R., and T. D. Albright. (1992). Neural correlates of perceptual motion coherence. Nature 358:412-414.

Stoner, G. R., and T. D. Albright. (1993). Image segmentation cues in motion processing: Implications for modularity in vision. J. Cogn. Neurosci. 5(2):129-149.

Tootell, R. B., J. B. Reppas, K. K. Kwong, R. Malach, R. T. Born, T. J. Brady, B. R. Rosen, and J. W. Belliveau. (1995). Functional analysis of human MT and related visual cortical areas using magnetic resonance imaging. J. Neurosci. 15(4):3215-3230.

Van Doorn, A. J., and J. J. Koenderink. (1983). Detectability of velocity gradients in moving random-dot patterns. Vision Res. 23:799-804.

Zihl, J., D. von Cramon, and N. Mai. (1983). Selective disturbance of movement vision after bilateral brain damage. Brain 106(2):313-340.