Humans can recover the structure of a 3D object from motion cues alone. Recovery of structure from motion (SFM) from the projected 2D motion field of a rotating object has been studied almost exclusively in one particular condition, that in which the axis of rotation lies in the frontoparallel plane. Here, we assess the ability of humans to recover SFM in the general case, where the axis of rotation may be slanted out of the frontoparallel plane. Using elliptical cylinders whose cross section was constant along the axis of rotation, we find that, across a range of parameters, subjects accurately matched the simulated shape of the cylinder regardless of how much the axis of rotation is inclined away from the frontoparallel plane. Yet, we also find that subjects do not perceive the inclination of the axis of rotation veridically. This combination of results violates a relationship between perceived angle of inclination and perceived shape that must hold if SFM is to be recovered from the instantaneous velocity field. The contradiction can be resolved if the angular speed of rotation is not consistently estimated from the instantaneous velocity field. This, in turn, predicts that variation in object size along the axis of rotation can cause depth-order violations along the line of sight. This prediction was verified using rotating circular cones as stimuli. Thus, as the axis of rotation changes its inclination, shape constancy is maintained through a trade-off. Humans perceive the structure of the object relative to a changing axis of rotation as unchanging by introducing an inconsistency between the perceived speed of rotation and the first-order optic flow. The observed depth-order violations are the cost of the trade-off.

*C*) as the cylinder's depth-to-width ratio (i.e., *C* = 1 for a circular cylinder). Figure 2 shows how the subjects' perceived curvature (*C*_{obs}), measured by the adjustment procedure, compares with the simulated curvature (*C*_{sim}). The ratio *C*_{obs}/*C*_{sim} does not change significantly with the simulated angle of inclination *θ*_{sim} (Figure 2a). The average ratio for the four subjects fluctuates narrowly and unsystematically between 0.83 and 0.88. However, *C*_{obs}/*C*_{sim} does depend on the simulated curvature (Figure 2b). In all cases, cylinders were perceived as less curved than simulated, but the difference was greatest for flattened cylinders (*C*_{sim} < 1). A two-way repeated measures ANOVA revealed a statistically significant effect of the simulated curvature on *C*_{obs}/*C*_{sim}, *F*(4, 60) = 15.87, *p* < .0001. The test revealed no statistically significant effect of the simulated angle of inclination on *C*_{obs}/*C*_{sim} or of the interaction between simulated angle of inclination and simulated curvature (*p* > .05 in both cases). Thus, the perceived shape of a simulated object relative to the axis of rotation does not change with the simulated inclination of the axis of rotation. However, this perceived shape is, in general, nonveridical and the extent of the difference between the perceived and the simulated shapes is a function of the simulated shape itself.

*C*_{obs}/*C*_{sim} will be reduced. However, the effect of the occluder is small. A two-way repeated measures ANOVA revealed a statistically significant effect of the occluder on *C*_{obs}/*C*_{sim}, *F*(1, 6) = 45.02, *p* = .00053, but no significant effect of dot lifetime or of interactions between dot lifetime and occluder.

*C* > 1; if it moves in the opposite direction, then the object has *C* < 1. However, we were unable to find a variable obtained from a simple combination of optic-flow parameters that could predict the observed change in the scaling factor with curvature.

*along the line of sight* does change with the angle of inclination (see Equation 1). This is consistent with the results of Loomis and Eby (1988). Recovering the depth along the line of sight is the same as recovering the distance of the surface relative to the axis of rotation only when this axis is frontoparallel. To account for depth recovery in the general case, we need first to assess the perceived inclination of the axis of rotation. This will be done in the next experiment.

*d* along the line of sight can be computed from our data as (see Equation A3)

*θ*_{sim} and *θ*_{obs} are the simulated and perceived angles of inclination of the axis of rotation, respectively. Figure 3b shows this ratio, *d*, for the three subjects, as a function of *θ*_{sim}. Perceived depth becomes a smaller fraction of the simulated depth as the inclination of the axis of rotation deviates from the vertical. The results agree well with those from a previous study (Loomis & Eby, 1988) using elongated ellipsoids.

*Z*_{0} is the distance of the subject to the axis of rotation, and Δ*v* and Δ*Z* are the differences in retinal velocity and depth, respectively, between any two points on the object. The instantaneous velocity field yields no information about Ω and, as a consequence, object shape is recoverable only up to a scale factor in depth. The visual system is thus free to set Ω, which, in general, it does nonveridically, resulting in the nonveridical recovery of the object's shape.

*Z*_{0} is the distance to the object, Ω is the angular speed of rotation, *θ* is the perceived angle of inclination of the axis of rotation (0° < *θ* < 90°) from the frontoparallel plane, and ∂_{x}*v*_{y} is the derivative with respect to *x* (horizontal direction) of the vertical component of the retinal velocity, *v*_{y}. It is assumed here that the axis of rotation lies within the sagittal plane passing through the eye. Equation 5, which implicitly assumes rigidity, gives the difference in depth Δ*Z* between any two points on the object from their differences in horizontal retinal velocity Δ*v*_{x} and angular vertical position Δ*y*.

*θ*, nor the speed of rotation, Ω, can be recovered from the instantaneous velocity field. However, using Equation 4, we can reduce the recovered shape from Equation 3 into a one-parameter family of recoverable shapes that differ by a scaling factor in depth (Equation 5). Thus, for inclined rotations, *θ* plays the role of the scaling parameter, in the same way as Ω played this role for vertical rotations.

*θ* veridically is expected if they recover SFM only from the instantaneous velocity field. However, this failure to perceive *θ* veridically, together with the fact that perceived curvature was close to veridical for *C*_{sim} ≥ 1, violates Equations 3 and 4. This can be seen intuitively by considering that if shape relative to the axis of rotation is recovered almost veridically (for *C*_{sim} ≥ 1), then the angle of inclination should have been recovered almost veridically, too. Thus, neither the Euclidean nor the affine structure was recovered, although the instantaneous velocity field does carry information about affine structure. The object's shape along the line of sight was recovered with distortions beyond a simple scaling in depth.

*λ* into Equation 3 so as to quantify the nonveridical Ω that is estimated by whatever heuristic subjects use in place of the instantaneous velocity field. Then, from Equation 5, the recovered structure becomes

*λ* can be seen as a scaling factor for depth relative to the slanted plane that contains the axis of rotation and is normal to the sagittal plane. Thus, if *λ* = 2, all distances to this plane are doubled (see Figure 4b). From Figure 4b, it is easy to see that for *λ* ≠ 1, violations of depth order should be observed: The relative depth of any two points on the bottom half of the object will be perceived as reversed, whereas pairs of points in the top half, where the cylindrical radius is constant, will be perceived with the correct depth order. Thus, a uniform distortion relative to the axis of rotation results in a nonuniform distortion across the image plane, in this case, along the vertical direction. This stands in contrast to the uniform distortions due to depth scaling found in the case of a frontoparallel axis of rotation, where relief structure is always conserved. A depth-order violation when *λ* ≠ 1 can only occur when the shape of the object changes along the axis of rotation, as in the examples shown in Figure 4b. In fact, it can be shown that any value of *λ* ≠ 1 will result in a depth-order violation for any pair of points satisfying certain simple geometric relations.

*y* = constant (i.e., any plane parallel to the horizontal *xz* plane). Thus, parallelism is not violated on these planes. On the other hand, affinity is violated on planes parallel to the vertical *yz* plane, but here, the deformation is such that two lines that have the same inclination in the simulated object will also have the same inclination in the perceived object; the line inclinations in the simulated and in the perceived objects will, in general, differ from each other (see 1).

*θ* and *λ*. A value of *λ* ≠ 1 implies that the recovered depth structure is not affine. For a frontoparallel axis of rotation, by contrast, we have only one free parameter, namely, angular speed Ω in Equation 2; thus, depth structure can be recovered up to an affine transformation in depth.

*λ* (see Equation A13). Values obtained, shown in Figure 5a, depart greatly from 1.0 for all subjects tested. The minimum value in Figure 5a is 1.99. This result implies large departures from affinity in the recovered depth structure along the line of sight. However, the way in which *λ* was estimated for Figure 5a is susceptible to any systematic bias that subjects bring to their estimation of *θ*_{obs}. A bias, if present, would be strongly amplified when estimating *λ* because sin *θ*_{obs}, which is usually a small quantity, is inversely proportional to *λ* (see Equation A13).

*λ*, our finding that perceived curvature does not change with inclination clearly suggests that *λ* values differ from 1.0. A nonunity value of *λ* is necessary for the object (that is, the perceived structure relative to the axis of rotation) to be independent of the inclination (see Figure 5b).

*λ* = 1 would guarantee affinity but carries the expense of a change in the perceived shape of the object as the axis of rotation changes its inclination. Thus, if subjects perceive the axis of rotation as more vertical than it is (as seems to be the case), then they would perceive the object as flatter than it is.

*λ* = 1, so that the affine structure of the object is recovered. Thus, using a consistent value of Ω would result in a perceived change in object shape when the axis inclination changes. Avoiding changes in perceived shape when inclination changes means using a value of Ω that is inconsistent with the first-order optic flow; that is, *λ* ≠ 1. This comes at the cost of depth-order violations along the line of sight. Figure 5b shows the relationship between the perceived and the simulated object in such a case.

*λ* can be computed from first-order optic-flow quantities. This guarantees that the task of computing *λ* is something the visual system can, in principle, do.

*λ* that is independent of any measurement of the angle of inclination and more precise than the method used earlier. If subjects perceive surface slant without bias, then, for *λ* = 1, a surface simulated as vertical must be perceived as vertical regardless of the perceived value of the inclination of the axis of rotation. However, for *λ* ≠ 1, a surface simulated as vertical will be perceived slanted in depth, which implies a depth-order violation (i.e., there exist near-vertical surfaces that would be perceived with the reversed slant). Thus, we can estimate the nonunity value of *λ* from the slant of the nonvertical simulated surface that appears vertical. Control experiments are needed to measure any intrinsic bias subjects may have in perceiving surface slant.

*λ* ≠ 1 (Equation 6). Any misperception of the surface's slant in this case can be attributed to the subject's intrinsic bias. (Of course, there is still the possibility that the vertical-axis condition represents a special case of unbiased perception of surface inclination, with the bias confined to inclined-axis conditions. Thus, the presence of bias in the inclined-axis condition cannot be completely ruled out.)

*p* = .0049), but its direction is opposite that needed to explain the experimental result. Obtained *λ* values, shown as black bars in Figure 6b, differ significantly from 1.0 (*p* < .0001) for all four subjects, implying depth-order violations. The direction of this difference (*λ* exceeds 1.0 in all cases) implies that subjects see the top of the cone as closer than the bottom (i.e., *γ* > 0). This perceived shape conflicts with the physical shape of the simulated cone, which had the bottom closer. In the control condition, one subject shows a bias to perceive the top of the cone as far relative to the bottom (i.e., *γ* < 0) and the others show a slight tendency in that direction. Thus, the *λ* values obtained are not an artifact of the subjects' intrinsic biases. Values for *λ* obtained from this experiment are smaller than those obtained earlier from the inclination matching task (Figure 5a), although their rank across subjects is conserved (i.e., subject S2 has the largest *λ*, etc.). Besides differences in task and stimuli, another reason for the smaller *λ* values is subjects' bias to perceive inclination in the opposite direction to that of the depth-order violation. In addition, as already mentioned, any systematic bias that could be present in the estimation of *θ*_{obs} from Experiment 2 would be strongly amplified in estimates of *λ* in Figure 5a.

*λ* and of the predicted depth-order violations shows that recovery of SFM from the general case of a non-frontoparallel axis of rotation is not even affine.

*f*(def), but they showed that, to be consistent with their psychophysical data, it must be a decreasing function of def.

*a* ≃ 5.39. For *θ*_{sim} = 0, Equation 7 becomes

*f*(def) is a decreasing function of 〈def〉. If it were a decreasing function of 〈def〉, then the curve in Figure 2b would be a decreasing function of *C*_{sim}, which is the opposite of what we found.

*g*(def) is a monotonic increasing function of def. Using Equation 8 for 〈def〉, and assuming *a* is a constant that depends on the subject, we find a good match (not shown) between our results and the Caudek and Domini prediction. Notice that *θ*_{obs} can be obtained exclusively from the optic flow because the numerator in Equation 10 can be (see Equation 4).

*Z*_{2–3} = Δ*Z*_{1–4} because they are joined by similarly shaped surfaces, but Δ*Z*_{1–2} < Δ*Z*_{4–3} because they are located in planar patches of different slant.

*Z*_{1–2} < Δ*Z*_{4–3} but also that Δ*Z*_{2–3} < Δ*Z*_{1–4}. Evidence consistent with this latter prediction appears in the data of Domini and Braunstein (1998). In addition, they found that judged separations differed between axis-in-front and axis-behind conditions. To explain this, they introduced an additional dependence of perceived depth on the velocity ratio between dots. Our model predicts this difference without added assumptions because, for a constant speed of rotation, the depth difference is a function of both relative and absolute retinal speeds (see Equations B12 and B13) and these speeds vary with axis position. In the Domini et al. model, the depth difference is a function of def, which is not a function of axis position.

*I* across a closed path as *I* = Δ*Z*_{1–2} + Δ*Z*_{2–3} − Δ*Z*_{4–3} − Δ*Z*_{1–4}, where all Δ*Z* values refer to their absolute (positive) values. Our model predicts that the perceived relative positions of the dots will be, instead, as shown in Figure 7b. This predicts that the integral *I*′ = Δ*Z*_{1–2} − Δ*Z*_{2–3} − Δ*Z*_{4–3} + Δ*Z*_{1–4} will add up to zero. Our model predicts (substituting *I*′ = 0 into the definition of *I* used by Domini & Braunstein, 1998) that the measured value will be *I* = 2(Δ*Z*_{2–3} − Δ*Z*_{1–4}) ≠ 0.
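The substitution of *I*′ = 0 into the definition of *I* can be checked numerically. The sketch below uses hypothetical depth differences (arbitrary units, not data from any experiment), chosen only so that *I*′ = 0 and so that the two inequalities predicted by our model hold; the function names are ours.

```python
def closed_path_integral(dz12, dz23, dz43, dz14):
    """I as defined in the text; all arguments are absolute depth differences."""
    return dz12 + dz23 - dz43 - dz14

def model_path_integral(dz12, dz23, dz43, dz14):
    """I' for the perceived dot layout predicted by the model."""
    return dz12 - dz23 - dz43 + dz14

# Hypothetical values (assumption): dz23 < dz14, and dz43 chosen so that I' = 0.
dz12, dz23, dz14 = 1.0, 2.0, 3.0
dz43 = dz12 - dz23 + dz14          # 2.0, so dz12 < dz43 as the model predicts

i_model = model_path_integral(dz12, dz23, dz43, dz14)
i_measured = closed_path_integral(dz12, dz23, dz43, dz14)
print(i_model, i_measured, 2 * (dz23 - dz14))
```

With these values, *I*′ vanishes while *I* equals 2(Δ*Z*_{2–3} − Δ*Z*_{1–4}), which is nonzero whenever the two separations differ.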

*λ* is computed. This gives the depth structure for an arbitrary inclination of the axis.

^{2}, respectively. The motion was shown at the refresh rate of the monitor (75 Hz). Stimuli were monocularly viewed at an optical distance of 94 cm, using a chin rest to stabilize head position. No separate fixation point was required for the task, but one was present as a 10′ × 10′ square between trials to guide fixation to the center of the display. Binocular viewing was also tested in pilot experiments and produced no significant differences from monocular viewing.

*n* = 300). Cylinders rotated at a constant angular speed (135 deg/s). The axis of rotation corresponded with the cylinder's longitudinal axis. Rotation was confined to an arc of 50°; one cycle of rotation brought the cylinder to its starting position after rotating a half-cycle of 50° in one direction and another half-cycle of 50° in the opposite direction. Test stimuli were presented in 1.49-s movies. Each movie displayed two cycles of rotation, each cycle formed by playing in forward and reversed order the same 28-frame animation sequence. The axis of rotation could be slanted away from the frontoparallel plane by an angle of 0°, 20°, 40°, 60°, or 80°. Regardless of the slant, the axis of rotation was always within the subjects' midsagittal plane; that is, its projection onto the frontoparallel plane was vertical and centered on the subject's line of sight. The cylinders' cross section was elliptical. The cylinders' rotational path was such that one of the main axes of the ellipse crossed the midsagittal plane halfway through each of the cylinders' half-cycles of rotation (i.e., the starting phase was −25°). Cylinders' curvature was defined as the ratio between the two main axes of the ellipse. In computing the ratios, the axis used in the numerator was the axis that crossed the midsagittal plane during the rotation.
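As a consistency check on the stated movie duration, the frame count can be worked out from the values given above. The sketch assumes one animation frame per monitor refresh at 75 Hz; the constant and function names are ours.

```python
# Timing of one test movie, using only values stated in the text.
FRAMES_PER_SEQUENCE = 28   # frames in one 50-degree half-cycle
REFRESH_HZ = 75            # assumption: one animation frame per refresh
CYCLES_PER_MOVIE = 2       # each cycle = forward playback + reversed playback

def movie_frames():
    """Total number of frames shown in one movie."""
    return CYCLES_PER_MOVIE * 2 * FRAMES_PER_SEQUENCE

def movie_duration_s():
    """Movie duration in seconds."""
    return movie_frames() / REFRESH_HZ

print(movie_frames(), round(movie_duration_s(), 2))  # 112 1.49
```

The 112 frames at 75 Hz give the 1.49-s duration reported for the test stimuli.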

_{max}. Then, at every frame in the movie, we replaced *n*/lt_{max} dots (rounded to the nearest integer) with new randomly positioned dots, where *n* = 300 is the total number of dots. A different set of dots was replaced on each frame until all dots were used, and the process was started again in the same order until the end of the movie. To keep dot lifetimes inside the range of asymptotically perceived depth, lt_{max} covaried with curvature, being 24, 18, 12, 9, and 6 frames for curvatures of 0.5, 0.75, 1, 1.5, and 2, respectively. These values were low enough at each curvature value to also ensure that no dot-density cues were present, as assessed by judging individual frames, which did not allow the identification of the cylinder's shape.
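The replacement procedure can be sketched as a schedule that assigns each dot to one of lt_{max} batches and cycles through the batches frame by frame. This is an illustration, not the code used to generate the stimuli: the function name, the seeded shuffle, and the way unequal batch sizes are handled when *n*/lt_{max} is not an integer are our assumptions.

```python
import random

# Maximum dot lifetime (frames) for each simulated curvature, from the text.
LT_MAX_BY_CURVATURE = {0.5: 24, 0.75: 18, 1.0: 12, 1.5: 9, 2.0: 6}

def replacement_schedule(n_dots, lt_max, n_frames, seed=0):
    """Return, for each frame, the list of dot indices replaced on that frame.

    Every dot is replaced exactly once per lt_max frames, and the same
    batches recur in the same order until the end of the movie.
    """
    rng = random.Random(seed)
    order = list(range(n_dots))
    rng.shuffle(order)
    # Batch boundaries; batch sizes are within one dot of n_dots / lt_max.
    bounds = [round(k * n_dots / lt_max) for k in range(lt_max + 1)]
    batches = [order[bounds[k]:bounds[k + 1]] for k in range(lt_max)]
    return [batches[frame % lt_max] for frame in range(n_frames)]

schedule = replacement_schedule(n_dots=300, lt_max=24, n_frames=112)
print(len(schedule), sorted({len(batch) for batch in schedule}))
```

Under this scheme no dot survives longer than lt_{max} frames, which is the property the procedure is meant to guarantee.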

^{2}, 4° [H] × 2° [V]) to cover these borders. The projected vertical cylinder height between the masks was 4°. Cylinders' horizontal size was 4° when located midway through the half-cycle. In side-masking conditions, the cylinders' left and right borders were occluded by maskers whose horizontal separation matched the minimal lateral extent of the cylinders during rotation. This kept the visible portion of the cylinder constant, rather than expanding or contracting laterally during rotation. Maskers' size, thus, never exceeded 10% of the cylinders' width.

_{max}, was 18 frames, with a vanished dot replaced by a new dot placed at random in the projected view, using the procedure explained in Experiment 1. The structures were viewed through an 8° × 8° square window. The structure's cross section was circular, with its bottom half forming a cylinder and its top half forming a cone. For test stimuli, the radius of the cylinder was 6°, and that of the cone increased from 6° at a linear rate such that the cone's surface within the midsagittal plane was inclined with angle *γ* relative to the vertical (Figure 6a). For control stimuli, the only difference was in the top half. Here, both the axis of rotation and the longitudinal axis of the cone were vertical; hence, the bottom and top halves had different axes of rotation. The radius along the vertical axis was variable, increasing or decreasing linearly from 6° so that the cylinder's surface within the midsagittal plane was inclined with angle *γ* relative to the vertical (Figure 6a). For both test and control stimuli, all four boundaries were occluded from view by maskers.

*γ*_{sim}, were obtained using the method of constant stimuli. From *γ*_{sim} we can obtain *β* (see 1, Equation A15), and thus, *λ* = 1/*β*. The five *γ*_{sim} values were individually selected for each subject based on pilot data to optimize the range of the psychometric function (in the pilot experiments, *γ*_{sim} = 0°, ±10°, ±20°, and ±30° were used for all subjects; in the main experiments, we used *γ*_{sim} = −15°, −20°, −25°, −30°, and −35° for S1 and *γ*_{sim} = −5°, −15°, −25°, −35°, and −45° for the rest of the subjects). A cumulative normal was fit to the psychometric functions by probit analysis, from which the points of perceived verticality were obtained. This point was defined as the point at which the response rate for bottom-seen-as-near was 50%.
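The way a 50% point is read off a fitted cumulative normal can be sketched as follows. The sketch substitutes a simplified method for the probit analysis: an unweighted least-squares line through the probit-transformed proportions rather than a maximum-likelihood fit. The response proportions are synthetic, generated from an assumed cumulative normal (mean −22°, SD 10°) purely for illustration; only the five *γ*_{sim} levels come from the text, and all names are ours.

```python
from statistics import NormalDist

def fit_cumulative_normal(levels, proportions, eps=1e-3):
    """Return (mu, sigma) of a cumulative normal fitted on the probit scale.

    Simplification (assumption): unweighted least squares on the
    probit-transformed proportions, not a maximum-likelihood probit fit.
    """
    std = NormalDist()
    # Clip proportions away from 0 and 1 so the probit transform stays finite.
    z = [std.inv_cdf(min(max(p, eps), 1 - eps)) for p in proportions]
    n = len(levels)
    mean_x = sum(levels) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(levels, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    mu = -intercept / slope        # level at which the fitted rate is 50%
    sigma = 1.0 / abs(slope)
    return mu, sigma

gamma_sim = [-45.0, -35.0, -25.0, -15.0, -5.0]       # degrees, from the text
synthetic = NormalDist(mu=-22.0, sigma=10.0)         # hypothetical observer
p_bottom_near = [synthetic.cdf(g) for g in gamma_sim]

mu, sigma = fit_cumulative_normal(gamma_sim, p_bottom_near)
print(round(mu, 3), round(sigma, 3))
```

Here `mu` plays the role of the point of perceived verticality; with real data, a weighted or maximum-likelihood fit would be preferable to this unweighted approximation.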

^{2}) at the upper and lower extremes of the line defined by the intersection of the cone's surface and the midsagittal plane. It was this line, at the horizontal midpoint of the cone, that subjects were to judge. Each run consisted of 50 trials obtained by randomly choosing one of the five different stimuli that differed in *γ*_{sim}. Each subject completed at least two runs so as to judge each stimulus at least 20 times. The same procedure was followed for control stimuli (*γ*_{sim} = 0°, ±3.5°, and ±7° for S1 and *γ*_{sim} = 0°, ±10°, and ±20° for the rest of the subjects).

*λ* ≠ 1 for test stimuli, or of *γ* ≠ 0 for control stimuli, we obtained precise estimates of the standard deviation of the 50% thresholds using the bootstrap method described by Foster and Bischof (1997), which allows the use of a normal distribution to compute probabilities. Fifty percent threshold values (i.e., point of perceived verticality) and their standard deviations obtained from the bootstrap method were virtually identical to those obtained using probit.

*Z*_{0} is the distance between the object and the subject, *v* is the retinal speed of the given point, and Ω is the angular speed of rotation (for a derivation, see Fernandez et al., 2002). In what follows, it is assumed that the object is distant enough from the subject so that *Z*_{0} approximates the distances of all points on the object.

*θ* with respect to the frontoparallel plane. The distance between a given point on the object and an arbitrary reference frontoparallel plane will be (see Figure A1)

*X*′, *Y*′, *Z*′) be the coordinates of a given point in a coordinate system in which the axis *Y*′ coincides with the axis of rotation (Figure A2). Let us assume that the axis of rotation is inclined by *θ* relative to the frontoparallel plane, and let (*X*, *Y*, *Z*) be the coordinates in our canonical reference frame in which *Y* is vertical (Figure A2). Note that, for simplicity, we use the same labels for axes and the coordinates of a point relative to those axes. Then, a simple rotation of coordinates gives

*Z*_{0} (and noticing that *v*_{y′} = 0 for rotations about axis *Y*′), we obtain

*v*_{x′} = *v* (Equation A1) and *v*_{z′} = Ω*x*′ = Ω*x* (lowercase letters refer to angular variables):

*x*, we obtain Equation 4. Substituting from Equations A1, A3, A4, and A7, Equation A2 becomes

*y* = *Y*/*Z*_{0} is the angular distance between the given point and a horizontal plane passing through the line of sight. Taking differences between any two points using Equation A9 yields Equation 3.

*λ* from Experiments 1 and 2

*P*_{2}), and the other from the intersection of the line of sight and the axis of rotation (*P*_{1}; Figure A3).

*y* = 0; hence, we have

*λ*:

*r* = Δ*Z* cos *θ* (Figure A3) for simulated and perceived values of *θ*, and using *C*_{obs}/*C*_{sim} = *r*_{obs}/*r*_{sim}, Equation A12 becomes:

*λ* as a function of optic-flow properties, which is important to guarantee that the task of computing *λ* is something the visual system can actually carry out. Using Equations A1, A13, 4, and 10, we obtain, after a lengthy calculation,

*r*_{obs} and *v*_{x} are the perceived distance to the axis of rotation and the horizontal angular speed, respectively, of any point on the object. It is not known how *r*_{obs} is obtained from the optic-flow field; as we already mentioned, the heuristics proposed by Caudek and Domini (1998) seem not to be valid for cylinders. Two points deserve to be stressed here. First, to perceive a constant shape independent of the inclination of the axis of rotation, it is necessary and sufficient that *r*_{obs} be a function only of *v*_{x}. Second, notice that *λ* is a function not only of def but also of more basic optic-flow properties, such as the gradients (like ∂_{x}*v*_{y} and possibly also ∂_{x}*v*_{x} through *r*_{obs}) and angular speeds.

*λ* from Experiment 3

*P*_{1} and *P*_{2}, on the rotating object, located on the same sagittal plane, but having different depths and heights relative to the subject (Figure A4). The line joining the two points meets with the vertical to form a perceived angle *γ* given by

*β* = Δ*v*_{x}/(∇_{x}*v*_{y} Δ*y*) is a quantity that depends only on the physical parameters of the stimulus and, thus, can be set by the experimenter.

*β* (and thus *λ*) at the point of perceived verticality from a psychometric function in which *β* is the independent variable, as described in the Methods section. Note that the value of *λ* obtained in this way is independent of *θ* and *γ* and does not require their measurement.

*λ* = 1 for the simulated object; thus, Equation A18 is equivalent to

*λ* ≠ 1 will result in a depth-order violation for any pair of points satisfying Equation A19.

*γ*_{sim} will result in a similarly inclined perceived line *γ*_{per}. This is easily seen from Equation A16. Equation A16 is valid for either the simulated or the perceived object; in the former case, we must use *λ* = 1. First, note that Equation A16 implies that for *γ*_{sim} = constant, then *β* = constant (assuming that *θ*_{sim} = constant). Thus, two parallel lines must have the same value of *β*. Using Equation A16 again, this time for *γ*_{per}, we obtain that *γ*_{per} = constant, because *λ* is also a constant (and also assuming that *θ*_{obs} = constant).

_{x} (∂_{y}) represent the partial derivative with respect to *x* (*y*) and *v*_{x, y} are the horizontal and vertical components of the retinal speed. For the cylinders of Experiment 1, we have ∂_{y}*v*_{y} = ∂_{y}*v*_{x} = 0; thus, we obtain

*θ* = *v*_{y} = 0; hence, we can compute def exactly (i.e., without using the approximation given by Equation B9) as

*z* of a point on the object to the frontoparallel plane containing the axis of rotation is (from Equation 2)

*Z*_{0} is the distance of the subject to the axis of rotation, and *v* is the retinal velocity of the point. Notice that in Equation 2, *Z* refers to the distance to the subject, and here, *z* refers to the distance from the frontoparallel plane (positive toward the subject) containing the axis of rotation.

_{per}, is a function of def, then, for a planar surface, Ω_{per} will be constant across the surface. Thus, the perceived depth difference between any two points on a planar surface can be obtained from Equation B11 as

*v* is the same for both edges.

*P*_{2} and *P*_{3} will be larger than that between *P*_{1} and *P*_{3} due to the difference in the slants of the surfaces these pairs of points belong to. Thus, in their experiment, subjects indicated which of the two points, *P*_{1} or *P*_{2}, is closer in depth. They found that *P*_{2} is seen as closer, but the same prediction is made by our model (see Figure A5).

*z*_{1} is the depth difference inside the differential plane (from Equation B12) and *dz*_{2} is the depth difference between contiguous differential planes (from Equation B13).

*v* and Ω_{per} are a function of position (*x*, *y*) on the frontal plane; thus, we have

*z* as

_{x}*z* and ∂_{y}*z* are obtained from Equation B11 assuming *v* and Ω as functions of position (*x*, *y*).

*P*_{1} to *P*_{2}. To have a consistent object, the value of the integral in Equation B18 must be independent of the path. This is also equivalent to stating that the integral over any closed path will give a value of zero. Using Stokes' theorem (Kaplan, 1952), it is easy to show that this happens if and only if ∂_{y}*A* = ∂_{x}*B*. An easy but lengthy calculation shows that this is indeed the case. Thus, our alternative to the Domini et al. hypothesis results in an internally consistent recovered depth structure. This structure, though, is not affine.
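The consistency condition can be illustrated numerically with two toy fields. These are hypothetical examples chosen for simplicity, not the *A* and *B* of Equation B18: one field satisfies ∂_{y}*A* = ∂_{x}*B* and the other does not, and each is integrated around a closed unit square with a midpoint rule.

```python
def line_integral(A, B, path, steps=1000):
    """Midpoint-rule integral of A dx + B dy along the straight segments of path."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path[:-1], path[1:]):
        dx = (x1 - x0) / steps
        dy = (y1 - y0) / steps
        for k in range(steps):
            t = (k + 0.5) / steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            total += A(x, y) * dx + B(x, y) * dy
    return total

# Closed counterclockwise path around the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]

# Toy field 1: A = y, B = x, so dA/dy = dB/dx = 1 (consistent surface z = x*y).
consistent = line_integral(lambda x, y: y, lambda x, y: x, square)

# Toy field 2: A = y, B = -x, so dA/dy = 1 but dB/dx = -1 (inconsistent).
inconsistent = line_integral(lambda x, y: y, lambda x, y: -x, square)

print(round(consistent, 6), round(inconsistent, 6))
```

The first loop integral vanishes, so depth differences do not depend on the path taken; the second does not, which is the signature of an internally inconsistent surface.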

*σ* and *τ* are the slant and tilt of the surface, respectively. Using Equation B17 and assuming *σ* and *τ* as functions of position (*x*, *y*), we get, after a lengthy calculation, an equation similar to Equation B18. In this case, however, ∂_{y}*A* ≠ ∂_{x}*B*, which indicates that the surface is internally inconsistent, as expected.

*C*_{obs}/*C*_{sim} should decrease with *C*_{sim}, contrary to our results (Figure 2b). There are a few ways to deal with this discrepancy. In the Domini et al. model, the dependence of perceived slant on def might differ between planes and strongly curved surfaces. Rather than being a decreasing function of def, it could be an increasing one for curved surfaces. In our version of the model, perceived angular speed of rotation could be taken as a decreasing function of def, rather than an increasing one. This change could be made without fundamentally altering other model predictions.