Research Article  |   February 2009
Opponent motion interactions in the perception of structure from motion
Journal of Vision February 2009, Vol.9, 2. doi:10.1167/9.2.2
      Padma B. Iyer, Alan W. Freeman; Opponent motion interactions in the perception of structure from motion. Journal of Vision 2009;9(2):2. doi: 10.1167/9.2.2.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Motion provides important cues for the perception of depth and object structure. The kinetic depth effect illustrates this phenomenon: dots moving in a two-dimensional plane can produce a vivid perception of a rotating three-dimensional object. We studied the origin of this depth percept in a psychophysical study employing inducing and test stimuli. The inducing stimulus, containing dots moving with simple harmonic motion in the fixation plane, was perceived as a rotating cylinder. The test stimulus had binocular disparity that placed it close to either the near or far surface of the cylinder. We found that sensitivity to the test was lower when it moved in the opposite direction to the adjacent surface of the inducing stimulus than when the two stimuli moved in the same direction. We also simplified the inducing stimulus by using two uniform arrays of dots translating in opposite directions. Subjects saw one array as being closer than the other, and test sensitivity was again reduced when the test was close to a surface moving in the opposite direction. These results support the idea that there are suppressive interactions between opposing motions at the same depth, leading to a single perceived direction of motion at each depth.

Introduction
The objects that surround us often move relative to each other, either because we move our viewing point or because the objects are in motion themselves. This relative motion provides valuable information when we attempt to interpret and interact with our environment, as seen in the following examples. First, one can avoid bumping into walls when walking down a passage by aiming to roughly equalize the apparent speed of the walls on either side. Second, objects that appear to move faster than others during head translation are likely to be closer. Third, a surface with higher image speeds at its center than at its edges suggests a rotating three-dimensional structure. All three examples belong to the family of capabilities known as structure from motion. 
Given the importance of correctly interpreting relative motion, it is of real interest to know how the nervous system processes the relative velocities of adjacent objects. A useful way to study this issue is the kinetic depth effect (Nawrot & Blake, 1989; Wallach & O'Connell, 1953). In one form of this effect, a transparent cylinder is randomly coated with dots and an image of the cylinder is projected onto a planar surface. As the cylinder rotates, the dots oscillate back and forth in a two-dimensional plane. There is no third dimension present in the projected image and yet an observer has a vivid perception of a rotating cylinder. The reader can experience this percept by looking at Supplementary Figure S1. While the rotation is readily appreciated, the direction of rotation is not specified by the stimulus and appears to change with time. Indeed, the typical observer sees the rotation direction reverse every few seconds with the front surface appearing to move leftward and then rightward in a never-ending cycle. 
Evidently, there is a brain mechanism that prevents the perception of opposing motions at a single location in visual space. The opposing motions are instead moved apart by assigning them differing depths. How does this segregation of opposing motions occur? Nawrot and Blake (1991) modeled this process with a neural network containing cells selectively responsive to narrow ranges of motion direction and depth. Cells tuned to opposite motion directions but the same depth suppressed each other, while some cells tuned to different depths facilitated each other. Running the model showed that it oscillated between two states. In one state, the active cells were those tuned to near leftward-moving objects and those other cells that were tuned to far rightward movement. The active cells in the second state were tuned to near rightward and far leftward movements. The model therefore produced states corresponding to the two perceptual states in the kinetic depth effect experienced by human observers. 
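The interaction scheme described above can be illustrated with a toy rate model. This is only a minimal sketch and not the published Nawrot and Blake network: the four units, the weights `w_inh` and `w_fac`, the choice of which cross-depth pairs facilitate each other, and the update rule are all assumptions made for illustration. It shows how same-depth suppression combined with cross-depth facilitation can produce two mutually exclusive states; it omits whatever mechanism makes the full model alternate between them.

```python
def settle(bias_unit, steps=400, dt=0.1, w_inh=2.0, w_fac=0.5, drive=1.0):
    """Relax four rate units and return their final rates by name."""
    names = ["near_left", "near_right", "far_left", "far_right"]
    # Opposite direction at the same depth: suppressive partner.
    rival = {"near_left": "near_right", "near_right": "near_left",
             "far_left": "far_right", "far_right": "far_left"}
    # Opposite direction at the other depth: facilitating partner (assumed).
    ally = {"near_left": "far_right", "far_right": "near_left",
            "near_right": "far_left", "far_left": "near_right"}
    rate = {n: 0.5 for n in names}
    rate[bias_unit] += 0.1  # small initial advantage breaks the symmetry
    for _ in range(steps):
        # Rectified input to each unit from the common drive and its partners.
        target = {n: max(0.0, drive - w_inh * rate[rival[n]]
                         + w_fac * rate[ally[n]])
                  for n in names}
        # Euler step toward the rectified input.
        rate = {n: rate[n] + dt * (target[n] - rate[n]) for n in names}
    return rate

state_a = settle("near_left")   # near-left and far-right units stay active
state_b = settle("near_right")  # near-right and far-left units stay active
```

With these assumed weights, a small advantage for one unit is enough to silence the units tuned to the opposing direction at each depth, which mirrors the two perceptual states of the rotating cylinder.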
Bradley, Qian, and Andersen (1995) provided physiological support for this model. They recorded action potentials from single cells in area MT of the macaque cortex. While the monkey fixated a stationary spot, a stimulus was presented within the receptive field of the recorded cell. The stimulus consisted of an array of dots moving in the cell's preferred direction and another array moving in the opposite direction. The preferred array had the binocular disparity yielding the best response from the cell, and the disparity of the other array was varied. The cell's response was suppressed when the two arrays had the same disparity, and the suppression progressively declined as the disparity difference between the two arrays was increased. That is, suppression was maximal for opposing motions in the same depth plane and was reduced by presenting the two motions in differing depth planes. 
In seeking to understand the mechanisms underlying the kinetic depth effect, the published literature therefore provides both a model that assumes a number of mutually suppressive neural populations, and physiological evidence in support of the model. What is lacking is psychophysical evidence for interaction between responses to stimuli of differing motion direction and depth. The aim of this paper is to provide such evidence. Our approach is to use an inducing stimulus and test stimulus. The inducing stimulus consists of an array of dots moving leftward and rightward so that the subject perceives the dots that are moving in one direction as being nearer than the other dots. The test is a vertical strip of dots briefly added to the inducing stimulus. The test stimulus has binocular disparity placing it close to either the near or far surface of the inducing stimulus, and it moves in either the same direction as that surface or in the opposite direction. Subjects were required to make a two-alternative forced-choice judgment about the test. We assumed that to show an interaction between the test and inducing stimuli, the judgment should involve either the perceived motion or depth of the test. We therefore randomly varied the test's slant and asked subjects to indicate whether the top or bottom of the test was nearer. Data were collected when the test's direction matched that of the inducing surface to which it was close and when there was a mismatch. 
We describe three experiments. In the first, the inducing stimulus provides three cues to structure: binocular disparity, increased density of dots at the sides of the stimulus, and motion. All three of these cues contribute to the percept of a cylinder rotating in a fixed direction. The remaining experiments progressively strip away these cues to yield the minimal conditions for which the stimulus remains three-dimensional. The second experiment removes binocular disparity. In this case subjects again perceive a rotating cylinder but with ambiguous rotation direction. In the third experiment the inducing stimulus has a uniform dot density, thereby removing shape cues. Subjects perceived the inducing stimulus in this experiment as two transparent planes sliding over each other. Consistent with previous work (Watanabe, 1999), one plane appeared to be nearer than the other. We show that responses to the test stimulus are affected by the inducing stimulus in all three experiments and that cue removal only slightly diminishes the strength of interaction between inducing and test stimuli. Some of the results have previously been published in abstract form (Iyer & Freeman, 2006). 
Methods
Apparatus
Stimuli were generated with a Macintosh PowerPC G5 computer and displayed on a cathode ray tube monitor. The software used to generate the stimuli and perform the experiments ran within Matlab (The MathWorks) and included functions from the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997). The display subtended 13° horizontally by 10° vertically at the eye and had a spatial resolution of 0.33 mm/pixel, a frame rate of 72 Hz, a luminance of 50 cd/m², and was white (x = 0.294, y = 0.323). Left-eye stimuli were located on the left half of the display and right-eye stimuli on the right. The optical distance from eye to monitor was 1140 mm. The two monocular views were separated through the use of a septum and a mirror stereoscope (Clement Clarke synoptophore, model 2052-224). The two arms of the stereoscope could be independently moved, and subjects adjusted them to binocularly fuse a black border present in each of the monocular views. The inner dimensions of the border were 2.5° × 2.5° and the border width was 0.25°. A black fixation dot, 0.1° in diameter, was placed at the center of each bordered area. The luminance of the border, fixation dot, inducing dots, and test dots was 1.7 cd/m². The binocular disparities used for dots in the inducing and test stimuli meant that sub-pixel movement of the dots was required. These movements were accomplished using the anti-aliasing capabilities of the Psychophysics Toolbox. 
Stimuli
Experiment 1
The inducing stimulus in Experiment 1 is illustrated in Figure 1A. It consisted of 30 black dots, each dot being circular and 0.1° in diameter. The dots were contained in a 2° × 2° square centered in the bordered area of each monocular view. The dots moved in horizontal simple harmonic motion, meaning that the horizontal position of each dot was a sinusoidal function of time. The oscillation was centered on the vertical midline of the bordered area and had a frequency of 0.2 Hz. Dots were randomly distributed from top to bottom of their area and had starting phases randomly assigned from a uniform density ranging over the full 360°. Subjects perceived the inducing stimulus as a rotating cylinder, as shown in the figure, with its front surface moving either leftward or rightward. Each dot was given a binocular disparity in order to make the motion direction unambiguous. This was done by incrementing the starting phase of left-eye dots and decrementing the starting phase of right-eye dots by equal amounts. The magnitude of the disparity for dots passing through the midline was 3 min; this value is equal to the binocular disparity of the nearest point of a physical cylinder with the same diameter as the perceived cylinder, assuming a subject with an interocular distance of 57 mm. A fixation dot, placed at the center of the inducing stimulus, had the same size as the moving dots, was stationary, and had zero binocular disparity. 
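The geometry in this paragraph can be checked with a short calculation. The sketch below is illustrative and is not the authors' Matlab code. It assumes that the oscillation amplitude is half the 2° stimulus width, that the virtual cylinder's axis lies in the fixation plane at the viewing distance, and that disparity is the left-eye minus right-eye horizontal position; the function names are hypothetical.

```python
import math

VIEW_DIST_MM = 1140.0   # optical distance from eye to monitor (from the text)
INTEROCULAR_MM = 57.0   # interocular distance assumed in the text
AMPLITUDE_DEG = 1.0     # assumption: half the 2 deg stimulus width
FREQ_HZ = 0.2           # oscillation frequency (from the text)

def dot_x_deg(t_s, phase_rad):
    """Horizontal position of a dot in simple harmonic motion."""
    return AMPLITUDE_DEG * math.sin(2.0 * math.pi * FREQ_HZ * t_s + phase_rad)

def peak_speed_deg_per_s():
    """Maximum dot speed, reached as the dot crosses the midline."""
    return 2.0 * math.pi * FREQ_HZ * AMPLITUDE_DEG

def front_surface_disparity_arcmin():
    """Disparity of the nearest point of a physical cylinder of this size,
    relative to its axis in the fixation plane."""
    radius_mm = VIEW_DIST_MM * math.tan(math.radians(AMPLITUDE_DEG))
    half_iod = INTEROCULAR_MM / 2.0
    vergence_axis = 2.0 * math.atan(half_iod / VIEW_DIST_MM)
    vergence_front = 2.0 * math.atan(half_iod / (VIEW_DIST_MM - radius_mm))
    return math.degrees(vergence_front - vergence_axis) * 60.0

def phase_offset_rad(disparity_arcmin):
    """Phase shift d giving the requested midline disparity when the left eye
    uses phase +d and the right eye -d: the separation is 2*A*sin(d)."""
    return math.asin((disparity_arcmin / 60.0) / (2.0 * AMPLITUDE_DEG))

offset = phase_offset_rad(3.0)
midline_disparity_deg = dot_x_deg(0.0, offset) - dot_x_deg(0.0, -offset)
```

Under these assumptions the front-surface disparity comes out close to the 3 min quoted above, and the peak speed is about 1.26 deg/s, in line with the 1.3 deg/s given for the translating dots in Experiment 3.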
Figure 1
 
(A) Stimuli in Experiments 1 and 2. The inducing stimulus consisted of an array of randomly positioned black dots moving horizontally in simple harmonic motion. The test stimulus, added briefly to the existing dots, consisted of a transparent vertical strip of dots moving leftward or rightward across the midline of the inducing stimulus. Subjects perceived the inducing stimulus as a cylinder whose front surface moved either left or right. The test stimulus was given a base binocular disparity that placed it close to either the front or back surface of the virtual cylinder, and a slant disparity that made the top of the test appear nearer or further than the bottom. On each trial, subjects were required to indicate whether the top or bottom appeared nearer. (B) Stimuli in Experiment 3. In this case the inducing stimulus consisted of one array of dots translating leftward and another array translating rightward. Subjects perceived one array to be nearer than the other. The test stimulus was very similar to that used in Experiments 1 and 2.
The test stimulus, briefly added to the inducing stimulus on each trial, was transparent and therefore occluded none of the inducing stimulus. Its 30 dots were randomly distributed from top to bottom of the inducing stimulus but had starting phases restricted to one eighth of the sinusoidal cycle, as illustrated in Figure 1. The test appeared with its leading edge at the vertical midline of the inducing stimulus, moved in simple harmonic motion with the same frequency as the inducing stimulus, and disappeared when its trailing edge reached the midline. The two monocular views of the test stimulus differed, giving it binocular disparity. Three base disparities were used, 3 min crossed and uncrossed, and 0 min. The crossed and uncrossed disparities placed the test stimulus at about the same depth as the perceived front and back, respectively, of the inducing stimulus. The third disparity, 0 min, placed the test stimulus at the axis of the inducing stimulus; this value was used as a control measurement. A disparity gradient was also added along the vertical extent of the test, meaning that the top of the test appeared either nearer or further than the bottom. This slant disparity was proportional to the vertical distance from the horizontal midline of the inducing stimulus. 
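The disparity assigned to each test dot can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: crossed disparity is taken as positive, the slant value is the signed top-minus-bottom disparity difference, the test spans 1° above and below the horizontal midline, and the function names are hypothetical. The duration estimate assumes that the test is visible for the time its one-eighth-cycle phase range takes to pass the midline.

```python
def strip_dot_disparity_arcmin(y_deg, base_arcmin, slant_arcmin,
                               half_height_deg=1.0):
    """Disparity of a test dot at height y_deg above the horizontal midline.

    base_arcmin  : base disparity (+3, -3 or 0 in the text)
    slant_arcmin : signed top-minus-bottom disparity difference
    """
    return base_arcmin + 0.5 * slant_arcmin * (y_deg / half_height_deg)

def strip_duration_s(phase_fraction=1.0 / 8.0, freq_hz=0.2):
    """Time for the trailing edge to reach the midline after the leading edge."""
    return phase_fraction / freq_hz

top = strip_dot_disparity_arcmin(1.0, base_arcmin=3.0, slant_arcmin=2.0)
bottom = strip_dot_disparity_arcmin(-1.0, base_arcmin=3.0, slant_arcmin=2.0)
```

On this reading the test lasts roughly 0.6 s, which is of the same order as the "close to half a second" mentioned in the Procedure.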
Experiment 2
The inducing stimulus in Experiment 2 differed from that in Experiment 1 in only one respect: all dots had zero binocular disparity. The inducing stimulus was therefore ambiguous in that its front surface could be perceived as moving either left or right. The test stimulus in Experiment 2 was identical to that in Experiment 1. 
Experiment 3
The inducing stimulus in Experiment 3 is shown in Figure 1B. The starting positions of the dots were randomly distributed across the width and height of the stimulus. Half of the 30 dots translated leftward and the other half translated rightward. As each dot met the stimulus boundary it was relocated to the opposite boundary at the same height. The speed of the dots, 1.3 deg/s, equaled the maximum speed of the dots in Experiments 1 and 2. Dot size, the area within which dots moved, and the fixation point were the same as in Experiments 1 and 2. The test stimulus had a width equal to one quarter of the inducing stimulus width. It translated at the same speed and in the same direction as one of the two inducing arrays. Otherwise, the test stimulus was the same as that in the other experiments. 
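The translation with relocation at the boundary can be sketched per frame as below. This is a minimal illustration, not the authors' code: `step_dots` is a hypothetical helper, and it wraps positions with a modulo operation, which carries any overshoot across to the far side rather than placing the dot exactly on the opposite boundary. Only horizontal positions change, so each dot keeps its height.

```python
def step_dots(xs_deg, direction, dt_s, speed_deg_s=1.3, half_width_deg=1.0):
    """Advance dot positions by one frame.

    direction is +1 for the rightward array and -1 for the leftward array.
    """
    width = 2.0 * half_width_deg
    moved = []
    for x in xs_deg:
        x += direction * speed_deg_s * dt_s
        # A dot that crossed a boundary re-enters from the opposite side.
        x = (x + half_width_deg) % width - half_width_deg
        moved.append(x)
    return moved

FRAME_S = 1.0 / 72.0  # frame interval for the 72 Hz display
rightward = step_dots([0.99, 0.0], +1, FRAME_S)
leftward = step_dots([-0.99, 0.0], -1, FRAME_S)
```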
Procedure
All data were collected in a dark room, and the subject was seated comfortably with the chin resting on the stereoscope chin rest to reduce head movements. The subject was asked to fixate on the stationary dot at the center of the stimulus in order to reduce eye movements. On each trial, the subject was presented with the inducing stimulus followed by a test stimulus. In Experiment 1, the inducing stimulus was displayed for 3 s before the test stimulus appeared. In Experiments 2 and 3, where the inducing stimulus direction was ambiguous, the subject waited until the front surface moved in the required direction and then triggered the test stimulus. In all three experiments, all dots apart from the fixation dot disappeared at the end of the test stimulus. The subject's response then triggered the next trial. The magnitude of the test's slant disparity was fixed across all trials of a run, but the direction of the slant was randomly assigned from trial to trial so that the top of the test was nearer on about half of the trials. At the end of each trial, the subject indicated with a key press whether the top or bottom appeared nearer. Incorrect responses were signaled with auditory feedback. There were 20 trials per run and 7–8 runs for each combination of the independent variables. Those variables were the 3 base disparities, and 2 directions of motion, of the test stimulus. 
The inducing stimuli in Experiments 2 and 3 were ambiguous, in that the surface perceived as nearer could move either leftward or rightward. It is possible that the onset of the test stimulus could alter this perceived direction. We ran a pilot study to check whether this was a frequent occurrence. For the purposes of the pilot study, the inducing stimulus was continued for 3 s after the test stimulus finished, and subjects were asked whether the front surface of the inducing stimulus had changed direction. The fraction of trials in which there was a direction change, averaged across subjects, was 7%. Given that the change in direction could have occurred at any time from the onset of the test stimulus to 3 s after test stimulus offset, and that test duration was close to half a second, the fraction of test stimuli during which there was a change in inducing stimulus direction was probably considerably less than 7%. The lability of the inducing stimulus can therefore be expected to have very little impact on the results of Experiments 2 and 3. 
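A rough version of this argument can be worked through numerically. The sketch assumes, purely for illustration, that a direction change is equally likely at any moment in the observation window; the text does not state this, and the function name is hypothetical.

```python
def fraction_during_test(p_change=0.07, test_s=0.5, post_s=3.0):
    """Share of trials with a direction change falling inside the test
    interval, assuming changes are uniformly distributed over the window."""
    return p_change * test_s / (test_s + post_s)

estimate = fraction_during_test()
```

With the values given in the text, the uniform assumption puts the estimate at about 1% of trials.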
Subjects
Four subjects, AK, AR, PI, and TC, participated in the investigation. They were all female, wore their usual optical correction, if any, had a visual acuity better than 6/6, and had a stereo-threshold of less than 1 min. Except for the author, PI, subjects were paid volunteers entirely naive about the purpose of the experiments. The University of Sydney Human Research Ethics Committee approved the experiments, and subjects consented willingly to the procedures. 
Data analysis
Psychometric functions of slant disparity were collected, and thresholds were obtained from these functions by finding the disparity for which responses were 75% correct. Confidence intervals for the thresholds were estimated with a Monte Carlo procedure. Each observation is characterized by the number of trials used to estimate it (n) and the fraction of trials (p) for which the response was correct. These data were used to draw a sample from the binomial density (n, p). When all points on the function had been drawn in this way, a cumulative Gaussian function was fitted to the points and a new threshold obtained as the disparity yielding 75% correct responses. This procedure was repeated to obtain 100 estimates of the threshold, and the confidence interval was estimated from this sample of thresholds. 
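The Monte Carlo procedure can be sketched as follows. Several details here are assumptions rather than the authors' method: the psychometric function is taken to rise from 0.5 to 1.0 as a cumulative Gaussian of log disparity, the fit is a simple grid-search least-squares fit instead of whatever fitting routine was actually used, the interval is read from the sorted thresholds as approximate 2.5% and 97.5% points, and the data are hypothetical values generated from the assumed function.

```python
import math
import random

def psychometric(x, mu, sigma):
    """Assumed 2AFC form: rises from 0.5 (chance) to 1.0 as a cumulative
    Gaussian of log10(x); it equals 0.75 where log10(x) == mu."""
    z = (math.log10(x) - mu) / sigma
    return 0.5 + 0.25 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fit_threshold(xs, ps, ns):
    """Grid-search least-squares fit, weighted by trial count.
    Returns the disparity giving 75% correct."""
    lo = math.log10(min(xs)) - 1.0
    hi = math.log10(max(xs)) + 1.0
    best_err, best_mu = float("inf"), lo
    for i in range(101):
        mu = lo + (hi - lo) * i / 100
        for k in range(1, 21):
            sigma = 0.05 * k
            err = sum(n * (p - psychometric(x, mu, sigma)) ** 2
                      for x, p, n in zip(xs, ps, ns))
            if err < best_err:
                best_err, best_mu = err, mu
    return 10.0 ** best_mu

def monte_carlo_ci(xs, ps, ns, n_rep=100, seed=0):
    """Resample each point from Binomial(n, p), refit, and return an
    approximate central 95% interval of the refitted thresholds."""
    rng = random.Random(seed)
    thresholds = []
    for _rep in range(n_rep):
        resampled = [sum(rng.random() < p for _trial in range(n)) / n
                     for p, n in zip(ps, ns)]
        thresholds.append(fit_threshold(xs, resampled, ns))
    thresholds.sort()
    return thresholds[int(0.025 * n_rep)], thresholds[int(0.975 * n_rep) - 1]

# Hypothetical data: five slant disparities (arcmin), 150 trials each, with
# proportions generated from the assumed function (threshold 2 arcmin).
xs = [0.5, 1.0, 2.0, 4.0, 8.0]
ns = [150] * 5
ps = [psychometric(x, math.log10(2.0), 0.3) for x in xs]
threshold = fit_threshold(xs, ps, ns)
ci_low, ci_high = monte_carlo_ci(xs, ps, ns)
```

The trial count of 150 per point is chosen to sit within the 20 trials × 7–8 runs described in the Procedure. A finer grid or a maximum-likelihood fit could be substituted without changing the structure of the resampling loop.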
Results
Experiment 1: Unambiguous rotation
The inducing stimulus in the first experiment, illustrated in Figure 1A, was an array of dots moving horizontally in simple harmonic motion. The stimulus was perceived as a rotating cylinder, and the rotation direction was fixed by providing each dot with binocular disparity. The aim here was to use an unambiguous inducing stimulus as a reference point for later experiments, which used ambiguous inducing stimuli. Figure 1A also shows the test stimulus, which was used to find whether the inducing stimulus had any effect on the response to a test stimulus located close to one of the surfaces of the inducing stimulus. The binocular disparity of the test stimulus placed it close to either the front or back surface of the cylinder. The test moved at the same speed as the cylinder and its direction was either the same as or opposite to that of the surface to which it was closest. These two conditions are subsequently labeled Match and Mismatch, respectively. As a control condition, the test was also placed at the fixation plane, where it appeared to be aligned with the axis of the cylinder. 
Figure 2 shows the results for subject TC. The test stimulus had a binocular disparity gradient from top to bottom, making the top appear nearer or further than the bottom. The horizontal axis of Figure 2A gives the magnitude of the disparity difference between top and bottom. The sign of the disparity difference was randomly assigned from trial to trial and the subject's task was to indicate whether the top or bottom of the test was nearer. The vertical axis gives the probability that the subject made the correct choice. The graphs on the left, middle, and right of Figure 2A depict the cases for which the binocular disparity of the test placed it close to the cylinder's front surface, back surface, and middle, respectively. Open circles indicate that the test stimulus matched the motion direction of the cylinder surface to which it was closest, and filled circles indicate that the test and cylinder surface moved in opposite directions. For the control case, on the right of Figure 2A, the test appeared to be equally distant from both cylinder surfaces and therefore cannot be assigned the match or mismatch labels. Here, open triangles indicate leftward movement and filled triangles rightward. The curves are cumulative Gaussian functions fitted to the data. In common with previous studies (reviewed by Watson & Pelli, 1983) we found that the fit was improved by making the horizontal axis logarithmic. Error bars give 95% confidence intervals. 
Figure 2
 
Responses from one subject in Experiment 1. (A) The horizontal axis shows the slant disparity of the test stimulus, measured as the difference in binocular disparities of the top and bottom of the stimulus. The vertical axis shows the probability that the subject correctly indicated whether the top or bottom was nearer. The left and middle graphs give the results when the test was positioned close to the front and back surfaces of the virtual cylinder, respectively. Open circles indicate that the test stimulus moved in the direction matching the cylinder surface to which it was closest, and filled circles indicate a mismatch in direction. The subject required a higher slant disparity to respond correctly in the mismatch case. The right graph indicates the control case in which the test was positioned at the middle of the virtual cylinder. Leftward and rightward movements of the test produce very similar responses. For all three graphs, the curve represents a cumulative Gaussian function fitted to the data. (B) A threshold was calculated from each fitted psychometric function in part A of the figure by finding the slant disparity at which the probability of a correct response was 75%. Error bars indicate 95% confidence intervals. (C) The thresholds in part B of the figure are replotted here normalized by the mean of the leftward and rightward control thresholds. Thresholds greater, and less than, the control value are labeled suppression, and facilitation, respectively.
The mismatch curves in Figure 2A are clearly displaced to the right of the match curves, indicating that the subject was less sensitive to test slant when the test was in the neighborhood of an inducing stimulus moving in the opposite direction. To quantify the gap between the match and mismatch conditions we calculated thresholds, defined as the slant disparity at which the probability of a correct choice was 75% (halfway between random and completely correct responses). The thresholds are shown in Figure 2B. The mismatch thresholds are higher than the match thresholds and the fact that the mismatch thresholds lie well outside the 95% confidence intervals for the match thresholds shows that this threshold difference is highly significant. 
There are at least two possible explanations for the difference between thresholds. First, a subject's sensitivity to test stimulus slant may be suppressed by the surface moving in the opposite direction in much the same depth plane. Alternatively, sensitivity to the test stimulus may be facilitated by a surface moving in the same direction at the same depth or with opposite direction in a different depth plane. To choose between these possibilities we needed a control measurement in which the test stimulus differed from both cylinder surfaces in either depth or velocity so that, as far as possible, sensitivity to the test stimulus was uninfluenced by the inducing stimulus. The control, consisting of the test stimulus located at the cylinder's axis, yielded the thresholds shown on the right of Figure 2B. The confidence intervals show that the control measurements resulting from leftward and rightward stimulus movements were not significantly different, and the two measurements were therefore averaged to obtain the dashed line in the figure. Finally, to show thresholds relative to the control value, we divided all thresholds by it. The results, in Figure 2C, show that the measurements in the match condition do not differ significantly from the control value, while the mismatch measurements significantly exceed it. Responses to the control test stimulus are assumed to be minimally influenced by the inducing stimulus, and the area below the dashed line is therefore labeled as Facilitation of the test response by the inducing stimulus. Similarly, the area above the dashed line is labeled as Suppression. According to this argument, the sensitivities obtained in the match condition are uninfluenced by the inducing stimulus while those in the mismatch condition are suppressed. 
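The normalization and labeling described above can be sketched as below. The helper names and the threshold values are hypothetical and are not data from the paper; the labeling rule, which compares a normalized 95% confidence interval with the control level of 1, is one plausible reading of the significance criterion used in the text.

```python
def normalize(thresholds, control_left, control_right):
    """Divide each threshold by the mean of the two control thresholds."""
    control = 0.5 * (control_left + control_right)
    return {cond: value / control for cond, value in thresholds.items()}

def label(ci_low, ci_high):
    """Classify a normalized threshold from its normalized 95% interval."""
    if ci_low > 1.0:
        return "suppression"    # reliably above the control level
    if ci_high < 1.0:
        return "facilitation"   # reliably below the control level
    return "not different from control"

# Hypothetical thresholds in arcmin, for illustration only.
relative = normalize({"match": 1.0, "mismatch": 2.0},
                     control_left=0.9, control_right=1.1)
```

In this made-up case the mismatch threshold is twice the control level, so an interval lying wholly above 1 would be labeled suppression, while a match interval that spans 1 would be labeled as not different from control.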
Figure 3 shows the results for all subjects. All the mismatch thresholds are significantly higher than match thresholds. For three out of four subjects the match thresholds do not differ significantly from the control level and the mismatch thresholds are in the suppression range. The remaining subject differs in that both relative thresholds are lower. In general, then, we have shown that subjects' discrimination of the test stimulus is worsened by the presence of an opposing motion in its vicinity and that for most subjects the interaction between mismatching stimuli is suppressive. 
Figure 3
 
Thresholds in Experiment 1. Thresholds during unambiguous rotation were measured as shown in Figure 2. The vertical axis shows normalized threshold, obtained by dividing each value by the threshold obtained when the test was located in the middle of the virtual cylinder. Raw thresholds are shown in Supplementary Figure S3. Data for the four subjects are shown, as listed on the horizontal axis. The upper graph is for test stimuli close to the cylinder's front surface and the lower graph for the back surface. Open columns indicate that the test moved in the same direction as the cylinder surface to which it was closest, and shaded columns are for movement in opposite directions. Error bars indicate the 95% confidence intervals. In general, match thresholds do not differ significantly from the control level whereas mismatch thresholds are elevated.
Experiment 2: Ambiguous rotation
An array of dots moving horizontally in simple harmonic motion gives the appearance of a cylinder rotating about a vertical axis. The direction of rotation is not specified by the stimulus and is therefore ambiguous: at any given time, an observer sees the front surface as moving either left or right. In Experiment 2 we removed the binocular disparity cue from the inducing stimulus dots, producing an ambiguous rotation direction. We therefore asked our subjects to wait until the front surface of the cylinder was moving in a specific direction, and then to trigger a test stimulus. Thresholds were determined as in Experiment 1, and the results are shown in Figure 4. As before, mismatch thresholds are significantly greater than match thresholds. 
Figure 4
 
Thresholds in Experiment 2. Normalized thresholds during ambiguous rotation are shown in the same format as that in the previous figure. The raw thresholds from which they were calculated are shown in Supplementary Figure S4. The stimuli here were the same as in Experiment 1 except that the inducing stimulus contained no binocular disparity. In this case, match thresholds are better than control values while mismatch thresholds, in the main, are close to control.
Here, however, there is a different balance between facilitation and suppression. The match thresholds all lie below the control level, indicated by the dashed line, and in half the cases the match thresholds are significantly less than control. For most of the mismatch thresholds, by contrast, the difference from the control level is not significant or only marginally so. It seems therefore that while the test response is again influenced by the inducing stimulus, the ambiguity of the inducing stimulus facilitates the match response with less effect on the mismatch response. This difference with the previous results will be taken up in the Discussion section. 
Experiment 3: Translation
We have shown that sensitivity to the slant of a test surface depends on the properties of the rotating structure on which the test is superimposed. How general is this result? In particular, does the inducing surface have to be perceived as rotating in order to influence a nearby test surface? To perceive an array of dots as rotating, it is necessary to see a region in which the dots move fastest in the middle and slow to a standstill at the sides. The question can therefore be rephrased as follows. Is the interaction between a test surface and an inducing structure also present when the speeds are uniform and no rotation is perceived? If the answer were yes, it would suggest that interactions between opposing motions at the same depth could arise without rotation or shape cues.
Several studies (Gibson, Gibson, Smith, & Flock, 1959; Ono, Rivest, & Ono, 1986; Rogers & Graham, 1979) have shown that two textured surfaces moving relative to each other can induce the perception that the surfaces are at differing depths. The interpretation of these studies is complicated, however, by the presence of cues other than stimulus translation: perspective in one case (Gibson et al., 1959), vestibular input and shape in the other two. Watanabe (1999) avoided these complications by displaying two randomly distributed dot arrays on a video monitor and moving the two arrays in opposite directions. Subjects saw the two arrays at differing depths using relative translation as their only cue. We used a similar approach to Watanabe's, with the stimulus shown in Figure 1B. The inducing stimulus in this case consisted of two dot arrays, one moving uniformly to the left and the other to the right, and both arrays had zero binocular disparity. The two arrays appeared to be at different depths. Moreover, the inducing stimulus was ambiguous because there was no binocular disparity in the physical stimulus. The leftward-moving dots appeared to be nearer for a few seconds, then further away, in a never-ending cycle. The reader can observe this effect in Supplementary Figure S2.
The test stimulus in Experiment 3 was much the same as that used in Experiments 1 and 2 and the protocol was identical to that in Experiment 2. The subject waited until the nearer surface of the inducing stimulus was moving in the required direction, triggered a test stimulus, and indicated whether the top or bottom of the test was nearer. The results are shown in Figure 5. The Match label indicates, as usual, that the test stimulus moved in the same direction as the inducing surface to which it was closer, and Mismatch indicates that the test stimulus moved in the opposite direction. As before, thresholds for test slant are higher in the mismatch case than in the match case. Further, almost all of the match thresholds lie significantly below the control level, indicating facilitation by the inducing stimulus, while most of the mismatch thresholds do not differ significantly from the control level. 
Figure 5
 
Thresholds in Experiment 3. Thresholds obtained with the translating inducing stimulus are shown with the same format as that used in Figure 3. The raw thresholds from which these normalized values were calculated are shown in Supplementary Figure S5. Match thresholds are significantly better than control values while, in general, mismatch thresholds do not differ from control.
Discussion
A rotating object provides multiple cues for the perception of its structure, including binocular disparity, curvature, and motion. In three experiments we have progressively stripped away some of these cues to show that motion alone is sufficient to produce a perception of depth. Further, we have shown that this perception results, at least in part, from suppression between the responses to opposing motions at the same depth. A summary of our results, averaged across test stimulus locations and across subjects, is shown in Figure 6. In what follows we describe previous evidence for suppressive interactions, discuss the interplay between suppression and facilitation, and point out the potent role of relative motion in producing depth percepts. 
Figure 6
 
Mean thresholds. Relative thresholds in Experiments 1, 2, and 3 are shown on the left, middle, and right parts of the graph, respectively. Thresholds are averaged across test locations (front and back surfaces) and subjects. Error bars show 95% confidence intervals: each interval was calculated from the 8 observations (2 test locations × 4 subjects) from which the mean was obtained.
Previous evidence for suppression
There appears to be general agreement among previous studies that there is mutual suppression between responses to motions in opposite directions. Mather and Moulden (1983) drifted one dot array upward and a second downward. The dot luminance required to detect any motion was higher than that when only one dot array, and therefore one motion direction, was presented. Lindsey and Todd (1998), using checkerboards in which each check was randomly assigned one of two contrasts, drifted two checkerboards in a variety of relative directions. Subjects were worst at identifying a motion direction when the component motions were in opposite directions.
Qian, Andersen, and Adelson (1994) provided evidence as to how this suppression comes about. They drifted two arrays of dots in opposite directions. Each dot in one array was paired with a dot in the other array by turning it on only when it was close to its partner. The authors noted that motion transparency was not evident when dots were paired: the display looked “more like flicker”. One dot array was then given a binocular disparity differing from that of the other array, so that the two arrays appeared to be at differing depths. Such patterns translated over each other in a transparent fashion, much more so than was the case when they were viewed without disparity. Our results are in agreement with previous work, therefore, in finding that motion perception in the presence of opposing motions at the same depth is worse than that for like motions or for opposing motions at differing depths. 
Suppression versus facilitation
We have shown here an interaction between the responses to two adjacent moving objects and have labeled the interactions as either suppressive or facilitatory. The justification for these labels is given as follows. The control consisted of a test stimulus located at the fixation plane and moving left or right. There was no significant difference between the responses in the two directions in 9 of 12 (3 experiments × 4 subjects) cases, and the significance was marginal in the remaining 3 cases. The control is therefore useful in classifying interactions in that responses to it are either uninfluenced, or equally influenced, by the two surfaces of the inducing stimulus. When the test lies close to one of the inducing stimulus surfaces, its threshold response lies either above or below the control value and we have accordingly labeled these responses as suppressed or facilitated, respectively. 
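The labeling rule described above can be written down compactly. The following is a minimal sketch, not the authors' analysis code: the function names and the example thresholds are hypothetical, and the sketch ignores the confidence intervals that the study uses to judge whether a difference from control is significant.

```python
def normalized_threshold(threshold, control_left, control_right):
    """Divide a test threshold by the mean of the leftward and rightward
    control thresholds (all values in the same units, e.g. min of disparity)."""
    control = (control_left + control_right) / 2.0
    return threshold / control

def label_interaction(normalized):
    """Label a normalized threshold relative to the control level of 1."""
    if normalized > 1.0:
        return "suppression"
    if normalized < 1.0:
        return "facilitation"
    return "no change"

# Hypothetical thresholds for illustration only; not data from the study.
control_left, control_right = 1.1, 1.3
for name, threshold in (("match", 0.9), ("mismatch", 1.8)):
    n = normalized_threshold(threshold, control_left, control_right)
    print(name, round(n, 2), label_interaction(n))
```

Normalizing by the mean of the two control directions means that a value of 1 marks the level at which the inducing stimulus has no net directional influence on the test.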
Bradley et al. (1995) measured responses in MT cells stimulated with two dot arrays, one of which moved in the opposite direction from the other. Their control stimulus consisted of a single dot array moving in the preferred direction of the cell under study. Impulse rates in response to the two arrays fell below the control value when the two arrays were both at the preferred binocular disparity of the cell. This suppression could well account for the suppressive interaction we measured in Experiment 1 when the test is close to an inducing surface moving in the opposite direction. In Experiments 2 and 3, which used ambiguous stimuli, the results differed in that a test stimulus moving in the opposite direction to an inducing stimulus surface tended to have a response similar to the control level, while test stimuli moving in the same direction as the inducing stimulus resulted in thresholds better than control. This shift in the balance between suppression and facilitation is illustrated in Figure 6. Is there physiological evidence for this facilitation? Bradley et al., in their Figure 1, show one cell whose impulse rate rises above the control level when the two dot arrays are presented at differing binocular disparities, but their population data (Figure 2) show that facilitation is the exception rather than the rule. 
It is possible that neurons in areas beyond MT in the motion-processing hierarchy are responsible for the facilitation seen with ambiguous stimuli. Williams, Elfar, Eskandar, Toth, and Assad (2003) showed that neuronal activity in areas MST and LIP is better correlated with behavioral reports about ambiguous stimuli than is activity in area MT. These authors, however, did not explicitly compare neuronal responses to motions in the same and opposite directions. The contribution to facilitation of areas beyond MT therefore remains an open question. 
What, then, is the source of the facilitation we have measured? Given that the facilitation was limited to responses to ambiguous stimuli, it is useful to note that several models of ambiguous perception provide a mechanism for facilitation. Models of structure-from-motion (Bradley, Chang, & Andersen, 1998; Nawrot & Blake, 1991) and binocular rivalry (Freeman, 2005; Lumer, 1998) assume mutual inhibition between cell populations tuned to opposite polarities (of motion direction and contour orientation, respectively) of the stimulus. When one population's activity is elevated, it inhibits the opponent population. The opponent population in turn inhibits the dominant population less, resulting in a dominant response that rises above the control level. It could be, therefore, that the facilitation we have measured arises from disinhibition. 
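The disinhibition argument can be illustrated with a toy rate model. This is a minimal sketch under stated assumptions, not an implementation of the cited models: two populations with rectified linear responses inhibit each other with a hypothetical weight `w`, and the drives passed to `steady_state` are arbitrary illustrative numbers.

```python
def steady_state(drive_a, drive_b, w, iterations=200):
    """Iterate r_a = max(0, drive_a - w * r_b) and the symmetric equation
    for r_b until the rates settle (converges for 0 <= w < 1)."""
    r_a = r_b = 0.0
    for _ in range(iterations):
        # Both rates are updated together from the previous values.
        r_a, r_b = max(0.0, drive_a - w * r_b), max(0.0, drive_b - w * r_a)
    return r_a, r_b

w = 0.5  # hypothetical inhibition weight

# Balanced drive: neither population dominates (the reference level here).
balanced_a, balanced_b = steady_state(1.0, 1.0, w)

# Population A receives extra drive and becomes dominant.
dominant_a, suppressed_b = steady_state(1.2, 1.0, w)

# Response A would have if B's inhibition stayed at its balanced level.
without_disinhibition = 1.2 - w * balanced_b

print("balanced A:", round(balanced_a, 3))
print("dominant A:", round(dominant_a, 3))
print("suppressed B:", round(suppressed_b, 3))
print("A without disinhibition:", round(without_disinhibition, 3))
```

In this sketch the dominant population's response exceeds what the extra drive alone would produce, because the weakened opponent population inhibits it less.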
There is an alternative explanation for the facilitation. In order to measure the interaction between test and inducing stimuli, the test was given a binocular disparity equal to 3 min in front of or behind the fixation plane. The ambiguous inducing stimuli, however, do not necessarily lie at these depths. It is possible, therefore, that Experiments 2 and 3 yielded more facilitation than Experiment 1 because the test differed in depth from both surfaces of the inducing stimulus. To test for this possibility, we ran a depth-matching experiment, described in the Supplementary material. The result, shown in Supplementary Figure S6B, is that the inducing stimulus surfaces in Experiments 2 and 3 differed in depth from the fixation plane by an average of 2.3 and 0.8 min, respectively. This means that the test and two inducing surfaces in these experiments were all at different depths. When the test was located nearer than the fixation plane in Experiment 3, for example, the test had a crossed disparity of 3 min and the near and far inducing surfaces appeared to have depths of about 0.8 min crossed and 0.8 min uncrossed, respectively. 
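The depth arrangement described above can be checked with simple arithmetic. In this sketch crossed disparities are given a positive sign and uncrossed disparities a negative sign (a convention assumed here), and the values are those quoted in the text: a test disparity of 3 min and matched inducing-surface depths of 2.3 and 0.8 min for Experiments 2 and 3. The function name `separations` is illustrative.

```python
TEST_DISPARITY = 3.0  # min, crossed (test nearer than the fixation plane)

# Matched depth of each inducing surface from the fixation plane (min),
# as reported for the depth-matching experiment.
SURFACE_DEPTH = {"Experiment 2": 2.3, "Experiment 3": 0.8}

def separations(test, surface_depth):
    """Disparity difference between the test and the near (+) and far (-)
    inducing surfaces, in min."""
    near, far = surface_depth, -surface_depth
    return abs(test - near), abs(test - far)

for name, depth in SURFACE_DEPTH.items():
    to_near, to_far = separations(TEST_DISPARITY, depth)
    print(f"{name}: {to_near:.1f} min from near surface, "
          f"{to_far:.1f} min from far surface")
```

Neither separation is zero in either experiment, which is the sense in which the test and the two inducing surfaces all lie at different depths.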
There are at least two possible ways in which this arrangement of surfaces could lead to enhanced facilitation. First, the two inducing surfaces are seen as being closer to the control stimulus than was the case in Experiment 1. Perhaps the control threshold is elevated by the inducing stimulus that moves in the opposite direction, producing an apparent improvement in the normalized threshold obtained for test stimuli not in the fixation plane. This possibility can be ruled out by computing mean control thresholds in the three experiments: control threshold averaged 1.18, 1.13, and 1.23 min in Experiments 1, 2, and 3, respectively. There is no trend here that can explain the increasing facilitation seen across the three experiments. The second possibility depends on the finding that the test stimulus in Experiments 2 and 3 differs in depth from the nearest inducing surface: it could be that there is facilitation between surfaces moving in opposite directions at differing depths. We have already argued, however, that the physiological findings of Bradley et al. (1995) render this unlikely. We return, therefore, to the idea that increasing facilitation in Experiments 1, 2, and 3 is the result of disinhibition. 
Translation
The kinetic depth effect was originally defined as the perception of three-dimensional structure in the two-dimensional projection of a rotating object (Wallach & O'Connell, 1953). Watanabe (1999), however, showed that the concept could be extended to two transparent surfaces translating in opposite directions: subjects perceived one surface as being closer than the other. This is an unexpected observation: opposing motions induce a depth percept in the absence of binocular disparity, texture gradients, or other shape cues. We have determined the strength of this effect by adding matching and mismatching test stimuli to the inducing stimulus. Figure 6 shows that the ratio of the match to the mismatch threshold is 0.58, 0.64, and 0.73 for Experiments 1, 2, and 3, respectively. Measured in this way, the interaction between opposing motions devoid of other cues is not very different from the interaction when disparity and shape cues are present. This result shows that relative motion can provide a potent cue for depth in the absence of other cues.
Supplementary Materials
Figure S1. A movie simulating the inducing stimulus used in Experiment 2. It is typically perceived as a rotating cylinder for which the front surface is moving either leftward or rightward. 
Figure S2. A movie simulating the inducing stimulus used in Experiment 3. It is typically perceived as two transparent planes translating in opposite directions and at differing depths. The near plane is seen as moving either leftward or rightward. 
Figure S3. Raw test thresholds obtained in Experiment 1. This figure gives the data from which the normalized thresholds of Figure 3 were calculated. Thresholds differed between subjects: the vertical scales have been adjusted accordingly. 
Figure S4. Raw test thresholds obtained in Experiment 2. This figure gives the data from which the normalized thresholds of Figure 4 were calculated. 
Figure S5. Raw test thresholds obtained in Experiment 3. This figure gives the data from which the normalized thresholds of Figure 5 were calculated. 
Figure S6. Experiment 4: depth matching. (A) Stimulus sequence used to find the disparity-induced depth that matched a motion-induced depth. There were three stimuli on each trial. The first, an inducing stimulus, had motion but zero binocular disparity. The second, the test stimulus, was stationary with variable disparity. The third, the mask, was designed to prevent percepts on a given trial from interfering with responses on the following trial. On each trial, subjects were required to indicate whether the disparity-induced depth of the test stimulus was more or less than the motion-induced depth of the inducing stimulus. (B) The vertical axis gives the binocular disparity yielding a depth percept matching the motion-induced depth of the stimulus depicted on the horizontal axis. Results from individual subjects are shown on the left; means across subjects are shown on the right.
Acknowledgments
We thank Colin Clifford for his very useful comments on an earlier version of this paper. 
Commercial relationships: none. 
Corresponding author: Alan Freeman. 
Email: A.Freeman@usyd.edu.au. 
Address: University of Sydney, P.O. Box 170, Lidcombe, NSW 1825, Australia. 
References
Bradley, D. C., Chang, G. C., & Andersen, R. A. (1998). Encoding of three-dimensional structure-from-motion by primate area MT neurons. Nature, 392, 714–717.
Bradley, D. C., Qian, N., & Andersen, R. A. (1995). Integration of motion and stereopsis in middle temporal cortical area of macaques. Nature, 373, 609–611.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436.
Freeman, A. W. (2005). Multistage model for binocular rivalry. Journal of Neurophysiology, 94, 4412–4420.
Gibson, E. J., Gibson, J. J., Smith, O. W., & Flock, H. (1959). Motion parallax as a determinant of perceived depth. Journal of Experimental Psychology, 58, 40–51.
Iyer, P., & Freeman, A. W. (2006). Suppressive interactions between opposing motions in structure-from-motion.
Lindsey, D. T., & Todd, J. T. (1998). Opponent motion interactions in the perception of transparent motion. Perception & Psychophysics, 60, 558–574.
Lumer, E. D. (1998). A neural model of binocular integration and rivalry based on the coordination of action-potential timing in primary visual cortex. Cerebral Cortex, 8, 553–561.
Mather, G., & Moulden, B. (1983). Thresholds for movement direction: Two directions are less detectable than one. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 35, 513–518.
Nawrot, M., & Blake, R. (1989). Neural integration of information specifying structure from stereopsis and motion. Science, 244, 716–718.
Nawrot, M., & Blake, R. (1991). A neural network model of kinetic depth. Visual Neuroscience, 6, 219–227.
Ono, M. E., Rivest, J., & Ono, H. (1986). Depth perception as a function of motion parallax and absolute-distance information. Journal of Experimental Psychology: Human Perception and Performance, 12, 331–337.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Qian, N., Andersen, R. A., & Adelson, E. H. (1994). Transparent motion perception as detection of unbalanced motion signals. I. Psychophysics. Journal of Neuroscience, 14, 7357–7366.
Rogers, B., & Graham, M. (1979). Motion parallax as an independent cue for depth perception. Perception, 8, 125–134.
Wallach, H., & O'Connell, D. N. (1953). The kinetic depth effect. Journal of Experimental Psychology, 45, 205–217.
Watanabe, K. (1999). Optokinetic nystagmus with spontaneous reversal of transparent motion perception. Experimental Brain Research, 129, 156–160.
Watson, A. B., & Pelli, D. G. (1983). QUEST: A Bayesian adaptive psychometric method. Perception & Psychophysics, 33, 113–120.
Williams, Z. M., Elfar, J. C., Eskandar, E. N., Toth, L. J., & Assad, J. A. (2003). Parietal activity and the perceived direction of ambiguous apparent motion. Nature Neuroscience, 6, 616–623.
Figure 1
 
(A) Stimuli in Experiments 1 and 2. The inducing stimulus consisted of an array of randomly positioned black dots moving horizontally in simple harmonic motion. The test stimulus, added briefly to the existing dots, consisted of a transparent vertical strip of dots moving leftward or rightward across the midline of the inducing stimulus. Subjects perceived the inducing stimulus as a cylinder whose front surface moved either left or right. The test stimulus was given a base binocular disparity that placed it close to either the front or back surface of the virtual cylinder, and a slant disparity that made the top of the test appear nearer or further than the bottom. On each trial, subjects were required to indicate whether the top or bottom appeared nearer. (B) Stimuli in Experiment 3. In this case the inducing stimulus consisted of one array of dots translating leftward and another array translating rightward. Subjects perceived one array to be nearer than the other. The test stimulus was very similar to that used in Experiments 1 and 2.
Figure 2
 
Responses from one subject in Experiment 1. (A) The horizontal axis shows the slant disparity of the test stimulus, measured as the difference in binocular disparities of the top and bottom of the stimulus. The vertical axis shows the probability that the subject correctly indicated whether the top or bottom was nearer. The left and middle graphs give the results when the test was positioned close to the front and back surfaces of the virtual cylinder, respectively. Open circles indicate that the test stimulus moved in the direction matching the cylinder surface to which it was closest, and filled circles indicate a mismatch in direction. The subject required a higher slant disparity to respond correctly in the mismatch case. The right graph indicates the control case in which the test was positioned at the middle of the virtual cylinder. Leftward and rightward movements of the test produce very similar responses. For all three graphs, the curve represents a cumulative Gaussian function fitted to the data. (B) A threshold was calculated from each fitted psychometric function in part A of the figure by finding the slant disparity at which the probability of a correct response was 75%. Error bars indicate 95% confidence intervals. (C) The thresholds in part B of the figure are replotted here normalized by the mean of the leftward and rightward control thresholds. Thresholds greater, and less than, the control value are labeled suppression, and facilitation, respectively.
Figure 3
 
Thresholds in Experiment 1. Thresholds during unambiguous rotation were measured as shown in Figure 2. The vertical axis shows normalized threshold, obtained by dividing each value by the threshold obtained when the test was located in the middle of the virtual cylinder. Raw thresholds are shown in Supplementary Figure S3. Data for the four subjects are shown, as listed on the horizontal axis. The upper graph is for test stimuli close to the cylinder's front surface and the lower graph for the back surface. Open columns indicate that the test moved in the same direction as the cylinder surface to which it was closest, and shaded columns are for movement in opposite directions. Error bars indicate the 95% confidence intervals. In general, match thresholds do not differ significantly from the control level whereas mismatch thresholds are elevated.