Most of our visual experience in daily life consists of surfaces. “The surface is where most of the action is” (Gibson,
1979) because light reflected from the surfaces in front of our eyes contains all the visual information we get from the environment. The representation of surfaces is a stage of visual information processing interposed between the early detection of visual features and later stages of object recognition (Nakayama, He, & Shimojo,
1995). Surface organization affects several aspects of visual processing, including search (Nakayama & Silverman,
1986), attention (Blaser, Pylyshyn, & Holcombe,
2000; Sohn, Papathomas, Blaser, & Vidnyánszky,
2004), and motion (Stoner & Albright,
1996). A surface can be described by its location and appearance. A single surface usually lies on a smooth, continuous depth plane rather than across discontinuous depths, and consists of relatively homogeneous colors and textures. If our visual system processes visual features at the level of surfaces, then features such as color, texture, and motion that comprise a surface may be bound together with the depth at which the surface is located.
The conjunction of more than one visual property into a single representation has been investigated in studies of selective adaptation. Human vision shows selective adaptation not only to particular features of visual stimuli, such as color, depth, orientation, or motion, but also to conjunctions of such features. In the classic McCollough effect, selective adaptation to a red vertical stimulus interleaved with a green horizontal stimulus leads to a percept of green upon subsequent exposure to a vertical black-and-white stimulus and of red with a horizontal one (McCollough,
1965). Such contingent aftereffects also have been reported for color and motion (Favreau, Emerson, & Corballis,
1972) and for motion and depth (Anstis,
1974; Nawrot & Blake,
1989; Verstraten, Verlinde, Fredericksen, & van de Grind,
1994). These phenomena lead to questions about the locations in the visual processing stream in which multiple features are combined. Humphrey and Goodale (
1998) argued that orientation-contingent color aftereffects (McCollough effects) are based in large part on neural adaptation in early visual areas such as V1. On the other hand, Domini, Blaser, and Cicerone (
2000) showed that the locus of combined processing of color and depth is likely to lie at a later stage, beyond the site of binocular matching. In a more recent study, these authors proposed that the representation of surfaces is the stage at which features such as color and orientation are linked with particular depths (Blaser & Domini,
2002).
Among these visual properties, the conjunction of motion and stereopsis into a single representation has been investigated particularly extensively. Anstis and Harris (
1974) reported that the motion aftereffect (MAE) is contingent on binocular disparity. They had observers adapt to two alternating textured discs rotating in opposite directions at different binocular disparities, one located in front of fixation and the other behind it. In the test phase, a static textured pattern was presented either in front of or behind fixation. The direction of the MAE was always opposite to the motion direction of the adapting disc at the given depth plane. Verstraten et al. (
1994) showed the same disparity-contingent aftereffects with transparently moving dots. After adapting to a bidirectional transparent motion display at different depth planes, observers reported transparent MAEs that were contingent on the depth planes present during adaptation. They also reported that the magnitude of the MAE decreased as the difference in disparity between adapting and test stimuli increased, implying adaptation of neurons sensitive both to a specific motion direction and to a specific disparity. Nawrot and Blake (
1989,
1991a) reported the inverse phenomenon, a depth aftereffect contingent on motion directions. In their experiment, adaptation to transparent dot fields that were moving in opposite directions at different disparities generated a rotational aftereffect in a structure from motion (SfM) display with ambiguous directions. The SfM display consists of two groups of dots moving in opposite directions (left and right) that generate a three-dimensional form (Todd,
1984). The speed profile of the dots across space is such that the display appears to be a dotted transparent globe or cylinder rotating clockwise or counterclockwise around a vertical axis. Because the two groups of dots are both at zero disparity, the depth order of the leftward and rightward motion is ambiguous. Observers can see the leftward motion either on the front surface or on the back surface of the globe. Nawrot and Blake (
1989,
1991a) capitalized on this ambiguity to test whether adaptation to unambiguous motion could influence the perceptual interpretation. After observers adapted to leftward motion at crossed (near) disparity and rightward motion at uncrossed (far) disparity, they perceived the front surface of an ambiguous SfM globe as moving rightward. These psychophysical studies demonstrate that motion and depth are co-registered at some level of visual processing.
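The kinematics of such an ambiguous SfM display can be illustrated with a short sketch (a hypothetical illustration for expository purposes, not the stimulus code used in the studies cited above). Under orthographic projection, dots on a transparent rotating cylinder produce two oppositely moving dot groups with a sinusoidal speed profile, all at zero disparity, so the rotation direction is ambiguous:

```python
import numpy as np

def sfm_cylinder_frame(theta, radius=1.0, omega=0.5):
    """Orthographic projection of dots on a cylinder rotating about a
    vertical axis.

    theta : angular positions of the dots around the axis (radians)
    Returns projected horizontal positions and velocities. Dots with
    cos(theta) > 0 lie on the front surface, those with cos(theta) < 0
    on the back; their projected velocities have opposite signs, but
    both groups project to the same zero-disparity image plane.
    """
    x = radius * np.sin(theta)           # projected horizontal position
    vx = radius * omega * np.cos(theta)  # projected horizontal velocity
    return x, vx

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
x, vx = sfm_cylinder_frame(theta)

# Speed profile is sinusoidal across the display: fastest near the
# cylinder's center (|x| small), zero at the edges (|x| = radius).
front = np.cos(theta) > 0
assert np.all(vx[front] > 0) and np.all(vx[~front] <= 0)
```

Because the projected position and velocity satisfy the same relation for both surfaces, the image alone does not determine which dot group is in front, which is the ambiguity the adaptation experiments exploit.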
Association between motion and depth may be supported by concurrent coding of these properties in the motion-sensitive visual area MT. The importance of area MT in motion processing is clear: it contains neurons whose activity is both correlated with and a determinant of motion perception (for a review, see Britten,
2004). About two thirds of MT neurons are also selective for disparity, but they are found in discrete patches, which are separated by regions that have poor disparity tuning (DeAngelis & Newsome,
1999). However, this spatial map of disparity selectivity bears no systematic relationship to the map of motion direction, as one would expect if particular directions were coded jointly with particular disparities, for example to represent motion through three-dimensional depth (Maunsell & van Essen,
1983).
What, then, is the function of disparity in the motion-sensitive area MT? Disparity sensitivity may provide cues for motion segmentation and for the representation of surfaces. Suppression between opposite motion signals in MT becomes weaker when the two directions are presented at different depth planes (Bradley, Qian, & Andersen,
1995). Transparently moving surfaces at different disparities are thus represented independently in MT, whereas opposite motion signals arising from the same surface tend to cancel each other. This depth constraint is thought to provide a way of confining motion integration to a particular surface, aiding motion segmentation (Born & Bradley,
2005). Indeed, the disparity sensitivity of MT cells contributes to depth perception of frontoparallel surfaces, as evidenced by the biases in disparity discrimination induced by stimulation of disparity-selective MT neurons (DeAngelis, Cumming, & Newsome,
1998). The segmentation process may further play an important role in the perception of three-dimensional structure, for example, arranging surfaces in depth relative to one another. Bradley, Chang, and Andersen (
1998) showed that MT responses can reflect the perceived depth order of surfaces. They had animals view an ambiguous SfM rotating cylinder at zero disparity. When the animals reported a certain depth order, say leftward motion in front and rightward motion behind, the neural responses were similar to those generated when the animals viewed leftward motion at a depth plane in front of rightward motion, as defined by binocular disparity. When they reported the reverse surface order with the very same stimulus, the responses changed as if the animals were viewing the same directions in the reversed, disparity-defined depth order. This finding implies that disparity-defined depth order and ordinal depth representation in the absence of disparity involve the same neural mechanism. The disparity coding in MT may thus be part of a more general process of perceiving depth order, that is, which surface is in front of which, rather than coding just absolute disparity (Grunewald, Bradley, & Andersen,
2002).
In the present study, we report evidence for a process of perceiving depth order that operates across a broad range of binocular disparities. Using an adaptation paradigm, we measured the amount of selective adaptation to motion information associated with absolute disparity and that associated with surface depth order. The present study thus provides new information about the types of depth information used in human motion perception.