Theories of image segmentation suggest that the human visual system may use two distinct processes to segregate figure from background: a local process that uses local feature contrasts to mark borders of coherent regions and a global process that groups similar features over a larger spatial scale. We performed psychophysical experiments to determine whether and to what extent the global similarity process contributes to image segmentation by motion and color. Our results show that for color, as well as for motion, segmentation occurs first by an integrative process on a coarse spatial scale, demonstrating that for both modalities the global process is faster than one based on local feature contrasts. Segmentation by motion builds up over time, whereas segmentation by color does not, indicating a fundamental difference between the modalities. Our data suggest that segmentation by motion proceeds first via a cooperative linking over space of local motion signals, generating almost immediate perceptual coherence even of physically incoherent signals. This global segmentation process occurs faster than the detection of absolute motion, providing further evidence for the existence of two motion processes with distinct dynamic properties.

A fundamental goal of vision is to locate, characterize, and recognize objects. But to determine "what" is "where," the visual system must first determine which parts of the image belong together. This is the problem of image segmentation, central to both human and machine vision. How the brain implements image segmentation is not known, although various physiological mechanisms have been proposed (for a review, see ref. 1).

Objects are distinguished not only by feature contrasts at their boundaries with the background but also by the similarity of feature properties within their boundaries. Two types of segmentation process may exist to exploit these two fundamental distinctions: a local edge-based process that marks differences in visual attributes and a global region-based process that finds homogeneous areas by integrating information about attributes over space. Edge detection is fundamental to many machine vision algorithms (2) but is rarely flawless on its own at segmenting an image into relevant regions. In natural images, edges are disrupted by noise, occlusions, and interference from other edges. Edges are also ambiguous: luminance edges, for example, may arise from many distinct physical causes. Therefore, edge-detection segmentation algorithms typically require delicate, adaptive adjustment of thresholds, iteration over multiple spatial scales, and special line-completion methods (3, 4); the first sketch below illustrates the thresholding step. Region-based segmentation algorithms typically "grow" regions from seed patches, accreting all surrounding areas that share the same properties as the seed, the end result being a set of internally homogeneous regions (5); the second sketch below illustrates this growing step. But region-based algorithms also face problems in defining and setting thresholds for homogeneity, particularly in noisy or complex images characterized by m...
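To make the thresholding problem concrete, here is a minimal sketch of edge-based segmentation, assuming a grayscale image stored as a 2D NumPy array. The function name `edge_map` and the threshold value are illustrative choices for exposition, not taken from the algorithms cited above.

```python
# Edge-based segmentation sketch: mark pixels where local feature
# contrast (here, the luminance gradient) exceeds a threshold.
# The threshold is illustrative; as the text notes, real images
# require delicate, adaptive, multi-scale tuning of this value.
import numpy as np

def edge_map(image: np.ndarray, threshold: float) -> np.ndarray:
    """Return a boolean map of pixels with high local contrast."""
    gy, gx = np.gradient(image.astype(float))  # local luminance differences
    magnitude = np.hypot(gx, gy)               # gradient magnitude per pixel
    return magnitude > threshold               # candidate region borders

# A bright square on a dark field: a slightly different threshold
# can fragment this border or miss it entirely.
image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0
edges = edge_map(image, threshold=0.25)
```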
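Likewise, a minimal sketch of seeded region growing under the same assumptions; the name `grow_region`, the 4-connectivity, and the homogeneity tolerance `tol` are illustrative and do not reproduce the specific method of ref. 5.

```python
# Region-based segmentation sketch: starting from a seed pixel,
# accrete 4-connected neighbors whose value stays within a
# homogeneity tolerance of the seed's value.
from collections import deque
import numpy as np

def grow_region(image: np.ndarray, seed: tuple[int, int],
                tol: float) -> np.ndarray:
    """Return a boolean mask of the homogeneous region around `seed`."""
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = float(image[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w and not mask[nr, nc]
                    and abs(float(image[nr, nc]) - seed_val) <= tol):
                mask[nr, nc] = True   # accrete the similar neighbor
                queue.append((nr, nc))
    return mask

# Choosing `tol` poses the homogeneity-threshold problem the text
# describes: too small and noise fragments the region; too large
# and physically distinct regions merge.
```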