Psychophysical, physiological, and modeling studies converge to indicate that object motion is recovered by integrating motion signals analyzed across space and time by direction-selective sensors with spatially limited receptive fields (Adelson & Movshon,
1982; Fennema & Thompson,
1979; Grossberg, Mingolla, & Viswanathan,
2001; Movshon, Adelson, Gizzi, & Newsome,
1986; Simoncelli & Heeger,
1998). Although different stimuli, such as plaids, aperture stimuli, or random dot kinematograms, have been used to assess the characteristics of the motion integration process (Lorenceau & Shiffrar,
1992; Mingolla, Todd, & Norman,
1992; Rubin & Hochstein,
1993; Stoner & Albright,
1992; Watamaniuk & Sekuler,
1992; Williams & Sekuler,
1984; Wilson & Kim,
1994), these studies have mostly focused on the conditions that yield the perception of either global coherent motion or, on the contrary, local (or transparent) component motion, that is, the conditions under which motion integration or segmentation occurs (Mingolla et al.,
1992; Rubin & Hochstein,
1993; for reviews, see Lorenceau & Shiffrar,
1999; Stoner & Albright,
1994). The consensus view is that integration and segmentation are two sides of the same coin: component motions are either merged into an integrated percept or segregated and treated as independent motions. In these studies, the motion of 2D features proved to be of primary importance for motion integration, as their salience, reliability, and status gate the very possibility of combining component motions distributed across space into a whole. Consequently, it was proposed that motion integration proceeds in two stages. In the first stage, 1D local motion would be extracted and irrelevant features, such as T-junctions due to occlusion, would be discarded; in the second stage, mutually consistent component motions would be integrated into a single moving object. Computational models based on this two-stage scheme have successfully accounted for much of the existing data (Grossberg et al.,
2001; Koechlin, Anton, & Burnod,
1999; Lidén & Pack,
1999; Nowlan & Sejnowski,
1994; Simoncelli & Heeger,
1998), although accounting for the influence of form and spatial context (Lorenceau & Alais,
2001; McDermott, Weiss, & Adelson,
2001) remains a difficult challenge for these models (but see Weiss & Adelson,
1996).