To perceive object motion when the eyes themselves undergo smooth movement, we can either perceive motion directly—by extracting motion relative to a background presumed to be fixed—or through compensation, by correcting retinal motion by information about eye movement. To isolate compensation, we created stimuli in which, while the eye undergoes smooth movement due to inertia, only one object is visible—and the motion of this stimulus is decoupled from that of the eye. Using a wide variety of stimulus speeds and directions, we rule out a linear model of compensation, in which stimulus velocity is estimated as a linear combination of retinal and eye velocities multiplied by a constant gain. In fact, we find that when the stimulus moves in the same direction as the eyes, there is little compensation, but when movement is in the opposite direction, compensation grows in a nonlinear way with speed. We conclude that eye movement is estimated from a combination of extraretinal and retinal signals, the latter based on an assumption of stimulus stationarity. Two simple models, in which the direction of eye movement is computed from the extraretinal signal and the speed from the retinal signal, account well for our results.

**v**), and the eyes are engaged in smooth motion with a given velocity (vector **e**), the direction of the retinal projection of stimulus motion is the vector difference:

$$\mathbf{r} = \mathbf{v} - \mathbf{e} \qquad (1)$$

*direct* and an *indirect* theory. Gibson (1966), with the *direct* theory, proposed that the physical motion of objects can simply be derived from the optic flow: the visual system would subtract out any uniform motion in the flow field, perceiving only relative motion between stimulus and background. This direct image-based compensation has also been used to explain spatial constancy in the context of micro eye movements (Murakami & Cavanagh, 1998). The *indirect* theory, put forward by von Helmholtz (1867), has been more influential: it assumes that the visual system has some information about eye movement derived from a copy of the motor command sent to the eye muscles. Spatial stability would involve combining the retinal signal with this extraretinal eye velocity estimate (see also Sperry, 1950; von Holst & Mittelstaedt, 1950). In the domain of motion perception, in geometrical terms, it would consist of adding a vector, parallel to the eye velocity, to the retinal stimulus velocity vector. If the eye velocity were estimated correctly, as shown in Figure 1B, the perceived direction would be the real direction of the stimulus in the world. On the other hand, if the eye velocity were underestimated, as concluded by several studies (Aubert, 1887; Blohm, Missal, & Lefèvre, 2005; Filehne, 1922; Fleischl, 1882; Mack & Herman, 1973), the perceived direction would lie between the retinal direction and the direction in space, as shown in Figure 1C. If eye velocity is under- or overestimated by a constant factor, *κ*, we obtain the following model for the perceived velocity **p**:

$$\mathbf{p} = \mathbf{r} + \kappa\,\mathbf{e} \qquad (2)$$

*linear* in the literature, but we will call it the zeroth-order model because *κ* is a constant.
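As a purely illustrative sketch of the zeroth-order model (not the authors' code), the following Python computes the direction of the perceived velocity **r** + *κ***e**, with **r** = **v** − **e**. The function name `perceived_direction` and the example velocities are assumptions chosen for illustration.

```python
import math

def perceived_direction(v, e, kappa):
    """Direction (degrees) of r + kappa * e, where r = v - e is the retinal velocity.

    v, e: (x, y) velocities in deg/s of the stimulus in space and of the eye.
    kappa: constant compensation gain (0 = no compensation, 1 = full compensation).
    """
    rx, ry = v[0] - e[0], v[1] - e[1]               # retinal velocity
    px, py = rx + kappa * e[0], ry + kappa * e[1]   # perceived velocity
    return math.degrees(math.atan2(py, px))

# Hypothetical example: eye moving rightward at 20 deg/s,
# stimulus moving straight up in space at 20 deg/s.
eye = (20.0, 0.0)
stim = (0.0, 20.0)
for kappa in (0.0, 0.5, 1.0):
    print(kappa, round(perceived_direction(stim, eye, kappa), 1))
```

With *κ* = 0 the predicted direction is the retinal direction, with *κ* = 1 it is the direction in space, and intermediate gains fall between the two.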

*indirect* theory, the eye velocity estimate is derived only from extraretinal sources. However, as Brenner and van den Berg (1994) pointed out, relative motion between the pursuit target and the background could also be used to estimate the eye velocity. Such a relative-motion-based estimate would be correct only if a stationary object (stationary *in the world*) is taken as a reference. In natural conditions, the background, during pursuit, is often composed of large objects that are stationary, and, as indicated by the Duncker illusion, the visual system in fact tends to perceive large objects as stationary. In this illusion, a small stationary dot surrounded by a moving frame will be erroneously perceived as moving in the direction opposite to the motion of the larger frame (Duncker, 1929). In Brenner and van den Berg's (1994) study, ocular pursuit was performed in front of a textured background, and the perceived target velocity was influenced by the relative motion between the target and the background in the way predicted by an assumption of background stationarity. This shows that, when a background is present, the eye velocity estimate derives, at least partly, from visual information. We will discuss this point further below.

*κ* = 0.24 in terms of the zeroth-order model). However, Mack and Herman (1978) showed that this constancy loss actually resulted from the influence of relative motion on motion perception. In their study, the target was switched off during the presentation of the moving stimulus while the eyes, already engaged in pursuit, continued their smooth movement. They showed that if the target was eliminated during the presentation of the stimulus, the constancy loss was only 26% (*κ* = 0.74) for a stimulus presentation of 200 ms. They attributed the high constancy loss found by Stoper to contamination by relative position or motion changes between the target and the stimulus.

*target*, a Gaussian blob with a width of 0.17°, on the left or right side of the screen (eccentricity 10°). Subjects were instructed to fixate the stationary target and to press a mouse button when ready. This initiated the pursuit target movement rightward or leftward (depending on whether it appeared on the left or right side of the screen, respectively), with its speed increasing from rest to 20°/s at a constant acceleration of 40°/s²; this acceleration phase lasted 0.5 s, during which the target advanced 5°. Upon reaching the speed of 20°/s, the target continued moving at this constant speed. When the target's position reached a randomly chosen value between 5 and 10° on the opposite side of the screen center from where it began (at this stage, it was always moving at a constant speed of 20°/s), the dot changed direction, became brighter, moved at constant velocity for 100 ms, and then disappeared. We refer to the dot in this second phase as the *stimulus*.
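The duration and length of the acceleration phase follow from constant-acceleration kinematics. A minimal check in Python, where the helper name `ramp_phase` is hypothetical and only the 20°/s and 40°/s² values come from the text:

```python
def ramp_phase(v_final, accel):
    """Duration (s) and distance (deg) of a constant-acceleration ramp starting from rest."""
    duration = v_final / accel               # t = v / a
    distance = 0.5 * accel * duration ** 2   # d = a * t^2 / 2
    return duration, distance

duration, distance = ramp_phase(v_final=20.0, accel=40.0)
print(duration, distance)  # 0.5 5.0
```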

*direction* of stimulus motion by orienting a line whose starting point corresponded to the beginning of the stimulus trajectory and whose endpoint was controlled by a computer mouse. For the subject's reference, the target's trajectory was displayed during the response phase as a dotted line. When satisfied with his or her answer, the subject pressed a mouse button.

^{2}. Trials were also discarded if the horizontal speed during the stimulus phase was below 5°/s or over 35°/s (in practice, the upper bound was never encountered) or if the vertical eye speed was over 5°/s. Approximately 16% of trials from the main experiment were discarded due to these conditions.

*all* stimuli had a predictable upward component. There seems to be no effect of the speed of the stimulus on that of the eye (*R* = 0.006, n.s.). The corresponding distribution of the retinal velocities is shown in Figure S2B.

**v**, we had a response vector **R**, obtained by dividing the reported trajectory vector by the duration of the stimulus, 100 ms. We assumed that there was no left–right asymmetry, and so, in order to increase the statistical power of our data, we “folded” responses along the vertical axis: when $v_x$ was negative, we performed the transformation $\mathbf{v} \to (-v_x, v_y)$ and $\mathbf{R} \to (-R_x, R_y)$. These data are shown in Figure S3. The results show a distortion between the physical velocities and the velocities reported by the subjects. This could be the result of a perceptual or response bias such as a range effect (Poulton, 1973) or spatial anisotropy (Post & Chaderjian, 1987).
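The folding step can be sketched as follows. This is an illustrative Python version under the stated left–right symmetry assumption; the function name `fold` and the tuple representation of velocities are hypothetical.

```python
def fold(v, R):
    """Mirror a trial about the vertical axis when the stimulus moved leftward (v_x < 0).

    v: (v_x, v_y) stimulus velocity; R: (R_x, R_y) response vector.
    Trials with v_x >= 0 are returned unchanged.
    """
    (vx, vy), (Rx, Ry) = v, R
    if vx < 0:
        return (-vx, vy), (-Rx, Ry)
    return (vx, vy), (Rx, Ry)

# Hypothetical leftward trial: both vectors are flipped horizontally.
print(fold((-10.0, 5.0), (-8.0, 6.0)))
```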

**v** that we used in the control experiment but at other points in velocity space, we will need an interpolating function, $\mathbf{R}(\mathbf{v}) \approx (B_x(v_x, v_y), B_y(v_x, v_y))$. We fitted $B_x$ and $B_y$ as third-order polynomials in $v_x$, $v_y$, with the plausible constraints that horizontal and vertical motions are perceived as perfectly horizontal and vertical, which removed several terms in the polynomials. The results of these fits, for the data of all four subjects taken together, are shown in Figure S3 alongside the averaged data. For all data taken together, the goodness-of-fit measures of the model for the two components are $R_x^2$ = 0.92 and $R_y^2$ = 0.91. We could have obtained a better fit by going to higher order, but we did not want to overfit the data. When analyzing individual subjects (see below), we also performed these fits on the individual subjects' data.

*κ*, the extraretinal gain, would be close to or more than 1. On the other hand, this gain decreases toward 0 when the stimulus moves faster and faster in the direction of eye movement. The spatial and retinal angles of motion are plotted in Figure 4 (and given for each stimulus velocity in Figure S5; note that stimuli with the same spatial direction but different speeds have different directions on the retina, as explained in Figure S1).

*κ* = 0), the response would just be equal to the retinal angle, shown as a dashed red line. The middle graph plots the screen angle as a function of retinal angle for each trial: thus, this graph shows what responses in the left graph would look like in the case of perfect compensation (*κ* = 1). Finally, the rightmost graph shows data obtained in the control (fixation) condition. The asymmetry of compensation between forward and backward moving stimuli is quite visible here. For motion directions close to 0° (forward motion), responses seem to be on the diagonal and thus totally uncompensated. On the other hand, as the angle increases past 90° and approaches 180° (backward motion), the responses systematically sink below the diagonal; comparing with the middle graph, we see that they approach the compensated or spatial directions. Although part of this might be due to an oblique-like response bias, which is quantified in the rightmost graph of Figure 4, the magnitude of the effect seems to be beyond anything predicted by a simple response bias. In the following analyses, we will attempt to quantify these impressions.

*κ* independently for each stimulus velocity **v**, using a least mean square method and calibrating with the fitted control data. In detail, for each trial *i* corresponding to a given target velocity **v** in space, we calculated a cost function for a given value of *κ* by first calculating the unbiased prediction from the zeroth-order model, namely $\mathbf{r}_i + \kappa\,\mathbf{e}_i$, where the retinal velocity $\mathbf{r}_i = \mathbf{v} - \mathbf{e}_i$ and $\mathbf{e}_i$ is the measured eye velocity on that trial. We then applied the bias polynomial computed from the corresponding control data to obtain the biased response, $\mathbf{B}(\mathbf{r}_i + \kappa\,\mathbf{e}_i)$, and calculated the direction of this predicted response. We then calculated the sum of squared differences between the predicted and actual response ($\theta_i$) directions, and finally calculated *κ* by minimizing this sum of squares:

$$\kappa(\mathbf{v}) = \arg\min_{\kappa} \sum_i \left[ \theta_i - \angle\,\mathbf{B}(\mathbf{r}_i + \kappa\,\mathbf{e}_i) \right]^2 \qquad (3)$$

**r**(**v**), in order to express *κ* as a function of -**r**; the variability in **r** being due to variable eye velocities on different trials.
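The per-velocity estimation of *κ* can be sketched in Python as a one-dimensional least-squares search over the direction errors. This is a minimal illustration, not the authors' implementation: the grid search, its range and step, the identity default for the bias function, and all function names are assumptions, and angles are taken to be in radians.

```python
import math

def angle_diff(a, b):
    """Smallest signed difference a - b between two angles, in radians."""
    return math.atan2(math.sin(a - b), math.cos(a - b))

def predicted_angle(v, e, kappa, bias=lambda p: p):
    """Direction of B(r + kappa * e) with r = v - e; bias defaults to identity (an assumption)."""
    r = (v[0] - e[0], v[1] - e[1])
    p = bias((r[0] + kappa * e[0], r[1] + kappa * e[1]))
    return math.atan2(p[1], p[0])

def fit_kappa(v, eye_velocities, response_angles, bias=lambda p: p,
              lo=-1.0, hi=3.0, step=0.001):
    """Grid search for the kappa minimizing the sum of squared direction errors."""
    best_kappa, best_cost = None, float("inf")
    n_steps = int(round((hi - lo) / step))
    for k in range(n_steps + 1):
        kappa = lo + k * step
        cost = sum(angle_diff(theta, predicted_angle(v, e, kappa, bias)) ** 2
                   for e, theta in zip(eye_velocities, response_angles))
        if cost < best_cost:
            best_kappa, best_cost = kappa, cost
    return best_kappa

# Synthetic check: responses generated with a known gain should give that gain back.
v = (-10.0, 10.0)                                # hypothetical stimulus velocity in space
eyes = [(18.0, 0.0), (20.0, 0.0), (22.0, 0.0)]   # hypothetical eye velocities per trial
responses = [predicted_angle(v, e, 0.6) for e in eyes]
kappa_hat = fit_kappa(v, eyes, responses)
print(round(kappa_hat, 3))
```

A bias function fitted to control data could be passed through the `bias` argument; a continuous optimizer would work equally well in place of the grid.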

*κ* as a function of retinal velocity are shown in Figure 5. The above calculations were performed separately for each subject (including the fitted calibration by control data), and for all subjects pooled together ($R^2$ = 0.970). The data plotted in this figure are presented in Figure S5. This table also shows the predictions of this model, to allow the reader to compare the fit to the data.

*κ* is uniform, i.e., the fitted curves should be flat. This is clearly not the case: in all subjects, *κ* seems to have a strong dependence on $r_x$, decreasing nonlinearly as $r_x$ increases, then flattening out and then perhaps rising again. This is a more quantitative measure of the difference between forward and backward motions than was discussed above and shown in Figure 3. In some subjects, there also seems to be an effect of $r_y$. Standard errors for the *κ* estimates are shown in Figure S6.

*κ* on **r**, we fitted *κ*(**r**) as a second-order polynomial:

$$\kappa(\mathbf{r}) = \sum_{i+j \le 2} a_{i,j}\, r_x^{\,i}\, r_y^{\,j} \qquad (4)$$

$a_{i,j}$) are rather unstable and are best avoided. For purposes of robustness, we therefore segregate the nonlinear fit to a single parameter, *κ*, for each value of **v**, and then perform a linear fit (Equation 4) on multiple parameters.)

$a_{i,j}$ by bootstrap (Efron & Tibshirani, 1994), in order to determine if they are statistically different from zero. To perform this analysis, we generated random, independent bootstrap resamples of the data, generating samples with the same number of trials as the actual data by selecting trials randomly with replacement. For each bootstrap resample, we calculated *κ*(**r**) (Equation 3), and then performed the fit to calculate $a_{i,j}$ (Equation 4). The standard deviations of the bootstrap values of $a_{i,j}$ provide estimates for standard errors for the values calculated on the actual data; for further details of the bootstrap technique, see Efron and Tibshirani (1994).
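The resampling scheme described above can be sketched generically in Python. Here the estimator is an arbitrary function of a list of trials (in the paper's analysis it would be the full two-stage fit); the function name `bootstrap_se`, the fixed seed, and the toy data are assumptions made for illustration.

```python
import random
import statistics

def bootstrap_se(trials, estimator, n_resamples=500, seed=0):
    """Bootstrap standard error of estimator(trials).

    Each resample draws len(trials) trials at random, with replacement;
    the standard deviation of the resampled estimates is returned.
    """
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    n = len(trials)
    estimates = []
    for _ in range(n_resamples):
        resample = [trials[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)

# Toy usage: standard error of the mean of 100 hypothetical measurements.
data = [float(x) for x in range(100)]
se = bootstrap_se(data, statistics.mean)
print(round(se, 2))
```

A coefficient whose estimate lies several bootstrap standard errors away from zero would be judged significantly different from zero.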

$r_x$ on *κ*, either through $r_x^2$ (note the small between-subject variation in this coefficient), or through both $r_x^2$ and $r_x$. In addition, one of the authors has a significant $r_x r_y$ interaction. Therefore, in all subjects, we can exclude the zeroth-order model of compensation for eye movements in motion perception, which predicts uniform *κ* and in which, therefore, only the constant term would be different from zero. In addition, in terms of goodness of fit (presented in the second column of Table 1), the $R^2$ value indicates that the second-order fit accounts for 97% of the variance in the data. All four subjects have a significant dependence of *κ* on the component of retinal velocity, $r_x$, roughly parallel to the eye movement. The significantly positive $r_x^2$ coefficient (in all subjects) shows that compensation is greater for faster motion in the direction of eye movement. In pooled data, as well as in two individual subjects, we find a significantly negative $r_x$ coefficient, which indicates an asymmetry in compensation: backward stimulus motion (with respect to eye movement) is compensated more than forward motion.

Subj. | $R^2$ | Const. | $r_x$ | $r_y$ | $r_x^2$ | $r_x r_y$ | $r_y^2$ |
---|---|---|---|---|---|---|---|
CB | 0.92 | 0.28 | −0.018 | 0.016 | 0.00028 | 0.00022 | −0.000025 |
CM | 0.78 | 0.48 | 0.0014 | −0.0093 | 0.00031 | −0.000018 | −0.000017 |
MV | 0.76 | 0.75 | −0.0045 | −0.029 | 0.00036 | −0.00013 | 0.00046 |
MW | 0.93 | 0.33 | −0.012 | −0.010 | 0.00025 | 0.00022 | 0.00014 |
All | 0.97 | 0.40 | −0.0087 | −0.0045 | 0.00033 | 0.000067 | 0.000022 |

*κ* on $r_x$.
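To get a feel for the size of the effect, the pooled ("All") coefficients of Table 1 can be evaluated at a few retinal velocities. The Python below is only a sketch: the function name `kappa`, the dictionary layout, and the sample velocities are assumptions, and retinal velocities are taken to be in °/s.

```python
# Pooled ("All") second-order coefficients from Table 1.
POOLED = {"const": 0.40, "rx": -0.0087, "ry": -0.0045,
          "rx2": 0.00033, "rxry": 0.000067, "ry2": 0.000022}

def kappa(rx, ry, c=POOLED):
    """Second-order polynomial for the compensation gain at retinal velocity (rx, ry)."""
    return (c["const"] + c["rx"] * rx + c["ry"] * ry
            + c["rx2"] * rx ** 2 + c["rxry"] * rx * ry + c["ry2"] * ry ** 2)

# Hypothetical sample points along the pursuit axis (ry = 0).
for rx in (-20.0, 0.0, 20.0):
    print(rx, round(kappa(rx, 0.0), 3))
```

With these coefficients the fitted gain at $r_x$ = −20°/s is roughly twice the gain at $r_x$ = +20°/s, which is the asymmetry carried by the negative $r_x$ term.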

*κ* on **r** by a first-order polynomial. The fitting procedure was the same as for the second-order model, the only difference being the absence of the quadratic terms. The $R^2$ of the first-order model is well below that of the second-order model: 0.564 compared to 0.967. In the first-order model too, there is a significant effect of $r_x$ on *κ* (coefficient: −0.0157, significantly negative by bootstrap test, *p* < 0.01), as well as a significant effect of $r_y$ (coefficient: −0.0043, significantly negative by bootstrap test, *p* < 0.01) that is lost when fitting the second-order model. The first-order model accounts for 57% more variance than the zeroth-order model, and the second-order model for 97%. Together, those two models (first-order and second-order) show that *κ* significantly depends on **r**, and therefore that the zeroth-order model is not sufficient to describe compensation.

*κ* = 0.33 ($R^2$ = 0.942). The $R^2$ is very high because most of the variance is actually captured by the retinal velocity term in the model. Indeed, the perceived direction highly correlates with the retinal direction, probably because of the very brief presentation duration of the stimulus, as it has been shown that presentation duration affects the compensation (Souman et al., 2005). The better goodness of fit for the second-order model is shown both by a slightly higher $R^2$ (0.970) and by the angular error (difference between the response angle predicted by the model and the real response angle): 9° for all trials together as opposed to 5.9° for the fit per categories.

*SD*) in horizontal and −1.77 (1.6 *SD*) in vertical dimensions. During the stimulus phase, the eyes moved on average by 1.27°, which means that, by the end of the stimulus, the eyes caught up with their initial lag. Therefore, at the end of the stimulus phase, the eccentricities of stimuli with the same speeds were similar on average: the correlation between the horizontal eccentricity (*he*) and the horizontal retinal speed (*hrs*) is *r* = 0.96 ($r^2$ = 0.91), and the equation of the fitted line is *he* = 1.01 + 0.10 *hrs*. The ordinate at the origin results from the retinal error, and the slope roughly equals the duration of the stimulus because we regress positions as a function of speed. Given that the eccentricity depends linearly on the retinal velocity, if subjects were using the final retinal position of the stimulus to judge its motion, there should be no difference between backward and forward moving stimuli. Eccentricity is thus unlikely to account for our effect given the linear dependence of eccentricity on the retinal velocity.

*direction* of motion.

*κ*, defined as the ratio of the estimated eye speed to real eye speed, through Equation 2. In the range of stimulus speeds that have usually been used (around the same speed as that of the eyes), we have found values of *κ* less than 1, thus showing an underestimation of eye motion, in rough agreement with previous studies (Becklen et al., 1984; De Graaf & Wertheim, 1988; Festinger et al., 1976; Mack & Herman, 1973; Mack & Herman, 1978; Morvan & Wexler, 2005; Souman et al., 2005; Stoper, 1967, 1973; Swanston & Wade, 1988; Wallach et al., 1985). On the other hand, the compensation gain found for atypical stimulus velocities was far out of the range of the *κ* classically found. This was highly unexpected given that the compensation gain is commonly assumed to be uniform. On the contrary, our results reveal the *varying* nature of *κ*: we have found, in every subject, a significant effect of the retinal velocity along the pursuit axis ($r_x$) on *κ*, either as a first-order or second-order polynomial function of $r_x$.

*backward* stimulus motion (with respect to eye movement) is compensated more than *forward* motion. The quadratic dependence, with a positive coefficient, implies that *faster* stimulus motion (in the direction of eye movement) is compensated more than *slower* motion. In any case, we can exclude a uniform compensation gain as a function of stimulus velocity.

$r_x$ on *κ*. The other source of information about eye movement would be from stimulus motion. On the assumption of background stationarity, uniform motion on the retina is compatible with equal-and-opposite motion of the eye. Thus, if stimulus velocity on the retina is **r**, and since there is no other background, according to this retinal information the eye velocity ought to be $\mathbf{e}_{\mathrm{est}} = -\mathbf{r}$. However, this retinal information cannot be used in isolation. If it were, subjects would never be able to perceive motion during the stimulus phase, in which only one object appeared on the retina: any retinal slip would be canceled by the equal-and-opposite estimated eye velocity, and perceived stimulus motion would always be null. Our subjects, however, did perceive motion in the stimulus phase. Therefore, other information must have been involved in the estimation of eye velocity.

**e** and a retinal signal −**r**, and from these the brain computes an estimated eye velocity $\mathbf{e}_{\mathrm{est}}$ by

*μ*, which ranges between 0 and 1, might be determined in a Bayesian optimal manner as suggested by Landy, Maloney, Johnston, and Young (1995). Although this type of combination is perfectly plausible, unfortunately we can learn nothing further about it from our data, since we only have direction-of-motion responses. To understand why, consider what happens when we use Equation 5 to estimate eye velocity:

$\theta_p = \arctan(p_y / p_x)$, which is independent of *μ*, we are unable to calculate *μ* from our results.
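To make the argument concrete, here is a short worked sketch. It assumes, purely for illustration, that the combination is a convex weighted average with weight *μ* on the extraretinal signal, and that the perceived velocity **p** is the retinal velocity plus the estimated eye velocity. The exact form of the combination rule is an assumption here; swapping the roles of *μ* and 1 − *μ* leads to the same conclusion.

```latex
\begin{aligned}
\mathbf{e}_{\mathrm{est}} &= \mu\,\mathbf{e} + (1-\mu)\,(-\mathbf{r}), \qquad 0 \le \mu \le 1 \\
\mathbf{p} &= \mathbf{r} + \mathbf{e}_{\mathrm{est}}
            = \mu\,(\mathbf{r} + \mathbf{e})
            = \mu\,\mathbf{v} \\
\theta_p &= \arctan\!\left(\frac{p_y}{p_x}\right)
          = \arctan\!\left(\frac{\mu\,v_y}{\mu\,v_x}\right)
          = \arctan\!\left(\frac{v_y}{v_x}\right) \qquad (\mu > 0)
\end{aligned}
```

Under these assumptions *μ* scales only the length of **p**, not its direction, so direction-of-motion responses alone cannot constrain it.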

*amplitude* of an eye movement corresponding to a given motor command, this is probably a fairly optimal strategy (the uncertainty on the direction being probably lower than on the amplitude; Krukowski, Pirog, Beutter, Brooks, & Stone, 2003; Schwartz & Lisberger, 1994). Formally, our model computes eye velocity as follows:

$$\mathbf{e}_{\mathrm{est}} = |\mathbf{r}|\,\frac{\mathbf{e}}{|\mathbf{e}|} \qquad (7)$$

$M_1$'s rule for combining retinal and extraretinal signals, we calculated the corresponding response direction from Equations 7 and 2 for each of our actual trials, and repeated the nonlinear fit of Equation 3 to calculate *κ* as a function of stimulus retinal velocity **r**. The results, when we plug *κ* = 0.4 into Equation 2 in order to calculate perceived velocity on each trial (the value of *κ* we use is close to 0.33, the value we obtain when we fit *all* data to the zeroth-order model), are shown in Figure 7. Model $M_1$ gives a decent fit to the actual *κ* surfaces that we calculated, and that were shown in Figure 6. The one feature of model $M_1$'s prediction that does not seem to match the data is the rise in *κ* that is too fast when $r_x$ becomes large (Figure 7).

$M_1$ is that when extraretinal and retinal estimates of eye velocity (**e** and −**r**, respectively) go in opposite directions, it predicts a somewhat nonsensical result. For instance, whether **v** = 0.5**e** or **v** = 1.5**e** (where **v** is the velocity of the stimulus in space), it predicts the same result, namely $\mathbf{e}_{\mathrm{est}}$ = 0.5**e**. We can therefore try a more refined but still extremely simple model:

$M_2$ assumes the same estimated eye velocity as the previous model when the extraretinal and retinal signals are in the same direction; when they are in opposite directions, it simply assumes that the eyes are still. The results predicted by model $M_2$ applied to our conditions are shown in Figure 8. In contrast to $M_1$'s predictions, here *κ* is too flat for $r_x > 0$, which probably means that our second assumption goes in the right direction but is also a bit too drastic. These toy models are not meant to be the final word in the explanation of our effects: a more systematic treatment, probably in the Bayesian framework, is in order. Nevertheless, models $M_1$ and $M_2$ are encouraging because of their simplicity, physiological and computational plausibility, and decent fit to our data with only one parameter. Although the second-order model presented above provides a reasonable fit for the data, it may turn out that other models, for instance a piecewise first-order model, do as well or better. Our point is that the zeroth-order model is not sufficient to account for the nonuniform nature of compensation, which depends on retinal velocity.
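The two toy rules can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the function names `m1` and `m2` are hypothetical, the eye velocity is assumed to be nonzero, and "same direction" in $M_2$ is interpreted here as a positive dot product between **e** and −**r**.

```python
import math

def m1(e, r):
    """M1: direction taken from the extraretinal signal e, speed from the retinal signal |r|."""
    e_speed = math.hypot(e[0], e[1])   # assumed nonzero
    r_speed = math.hypot(r[0], r[1])
    return (r_speed * e[0] / e_speed, r_speed * e[1] / e_speed)

def m2(e, r):
    """M2: as M1 when e and -r agree in direction (assumed: positive dot product);
    otherwise the eye is taken to be still."""
    same_direction = (e[0] * -r[0] + e[1] * -r[1]) > 0
    return m1(e, r) if same_direction else (0.0, 0.0)

# The collinear example from the text: v = 0.5 e and v = 1.5 e.
e = (20.0, 0.0)
for scale in (0.5, 1.5):
    v = (scale * e[0], scale * e[1])
    r = (v[0] - e[0], v[1] - e[1])     # retinal velocity
    print(scale, m1(e, r), m2(e, r))
```

For both stimulus velocities `m1` returns 0.5**e**, whereas `m2` returns 0.5**e** only in the first case and a null eye velocity in the second, which is the refinement described above.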

*κ* for stimuli moving with different velocities. Souman et al.'s study, in which the target was moving at 10°/s and the stimuli at 3°/s or 8°/s, found higher *κ* for the slower stimuli, but nevertheless concluded that the zeroth-order model with uniform compensation gain was adequate for describing motion perception during pursuit. It is possible that the difference obtained between the two stimulus speeds is due to different retinal signals.

*backward*, in the opposite direction from that of eye movement, subjects tend to perceive motion in space, i.e., motion on the retina compensated for eye movement. In trials where target motion is forward, on the other hand, subjects tend to perceive uncompensated, retinal motion.

*κ* estimated by fitting the zeroth-order model (Equation 3) to the data by category; RESPpred: response angle predicted by the zeroth-order model.

*κ*, as a function of the x component of the retinal velocity of the stimulus. Each curve plots data for different y-components of the spatial velocity of the stimulus. Results for all subjects pooled together. The error bars represent standard error, calculated using a bootstrap. To perform the bootstrap analysis, we generated 500 random, independent bootstrap resamples of the data, by randomly selecting, with replacement, the same number of trials as in the original data set. For each bootstrap resample, we calculated *κ* (Equation 3), with the standard deviation of the bootstrap *κ* providing an estimate for the standard error of the mean *κ*.