Previous work has shown that humans continuously use visual feedback of the hand to control goal-directed movements online. In most studies, visual error signals were predominantly in the image plane and, thus, were available in an observer's retinal image. We investigate how humans use visual feedback about finger depth provided by binocular and monocular depth cues to control pointing movements. When binocularly viewing a scene in which the hand movement was made in free space, subjects were about 60 ms slower in responding to perturbations in depth than in the image plane. When monocularly viewing a scene designed to maximize the available monocular cues to finger depth (motion, changing size, and cast shadows), subjects showed no response to perturbations in depth. Thus, binocular cues from the finger are critical to effective online control of hand movements in depth. An optimal feedback controller that takes into account the low peripheral stereoacuity and inherent ambiguity in cast shadows can explain the difference in response time in the binocular conditions and lack of response in monocular conditions.

*z*-axis was aligned with the line of sight and included rotations in the *X*–*Z* plane (in-depth rotation perturbations) and rotations in the *X*–*Y* plane (image plane rotations). Rotation perturbations caused position shifts that started at 1 cm when the finger emerged from the occluder and decreased over time, vanishing at the target position (Figure 4c). The rotation perturbations keep the motion of the virtual finger *relative* to the target unchanged, so corrective responses actually decrease endpoint accuracy.

*frontal* plane, whose normal was the vector from the cyclopean eye to the center of the reflected screen and whose horizontal axis was parallel to the vector from the left eye to the right eye. The position of the target ball was randomly chosen from a patch of the sphere centered at the starting ball with a radius of 30 cm. The patch was centered on a point in the frontal plane 28.2 cm to the left of the starting position and 10.3 cm above the midline of the display. The patch spanned ±7.5 degrees in azimuth and elevation around the center point, translating into a range in the image plane and in depth of approximately ±4 cm.
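A placement rule like this can be sketched as follows. This is a minimal sketch, not the authors' code: the patch-center angles `az0_deg`/`el0_deg` and the coordinate conventions are hypothetical parameters for illustration, with only the 30-cm radius and ±7.5° span taken from the text.

```python
import numpy as np

def sample_target(rng, radius_cm=30.0, az0_deg=0.0, el0_deg=0.0, span_deg=7.5):
    """Draw a target on a sphere of radius `radius_cm` around the starting ball,
    uniformly within +/- span_deg of the patch center in azimuth and elevation.
    az0_deg/el0_deg are hypothetical patch-center angles (not given in the text)."""
    az = np.deg2rad(az0_deg + rng.uniform(-span_deg, span_deg))
    el = np.deg2rad(el0_deg + rng.uniform(-span_deg, span_deg))
    # Spherical to Cartesian: x rightward, y upward, z toward the observer.
    return radius_cm * np.array([np.cos(el) * np.sin(az),
                                 np.sin(el),
                                 np.cos(el) * np.cos(az)])
```

As a sanity check on the geometry, 30 cm × sin(7.5°) ≈ 3.9 cm, consistent with the reported range of approximately ±4 cm in the image plane and in depth.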

^{2} or higher. These were typically trials in which subjects rushed to the target and searched around the target before the trial time expired.

Let *t* = 0 be the time (in Optotrak frames) that the virtual finger emerged from the occluder and the finger position at time *t* be *x*_{t}; thus, we have

*x*_{t} = Σ_{i=1}^{n} *a*_{i} *x*_{t−i} + *u*_{t} + *ɛ*_{t},  (Equation 1)

where *a*_{i} (*i* = 1…*n*) are the coefficients of the AR model and *u*_{t} is a constant term. The residual *ɛ*_{t} has zero mean by construction. We fitted the AR model from baseline trials and computed the raw perturbation influence function *w*_{t} of each perturbation type from

*x*_{t} = Σ_{i=1}^{n} *a*_{i} *x*_{t−i} + *u*_{t} + *w*_{t} *p*_{t} + *ɛ*_{t},  (Equation 2)

where *p*_{t} is the amount of perturbation (set to +1 or −1). The raw influence function and the residual noise were smoothed by a causal exponential filter, *f*(*t*) = *e*^{ t/λ }/*λ*, *t* < 0, with time constant *λ* = 37 ms.
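The fitting pipeline can be sketched as follows. This is a minimal sketch, assuming positions arrive as per-trial arrays at the Optotrak frame rate; the least-squares formulation and the mean-signed-residual estimator for *w*_{t} are illustrative simplifications, not the authors' code.

```python
import numpy as np

def fit_ar(baseline, n=10):
    """Least-squares fit of an n-th order AR model with a constant term
    (Equation 1). baseline: (trials, frames) array of baseline-trial positions."""
    rows, targets = [], []
    for x in baseline:
        for t in range(n, len(x)):
            rows.append(np.r_[x[t - n:t][::-1], 1.0])  # [x_{t-1}, ..., x_{t-n}, 1]
            targets.append(x[t])
    coef, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    return coef[:n], coef[n]  # AR coefficients a_i and constant term u

def influence_function(perturbed, p, a, u):
    """Raw influence w_t (Equation 2), estimated here in simplified form as the
    mean perturbation-signed residual of the baseline AR prediction.
    perturbed: (trials, frames) positions; p: (trials,) array of +1/-1."""
    n = len(a)
    w = np.zeros(perturbed.shape[1])
    for t in range(n, perturbed.shape[1]):
        pred = perturbed[:, t - n:t][:, ::-1] @ a + u
        w[t] = np.mean(p * (perturbed[:, t] - pred))
    return w

def smooth_causal(w, lam_frames):
    """Causal exponential filter f(t) = e^{t/lambda}/lambda for t < 0,
    discretized over frames and normalized to unit weight.
    lam_frames: lambda expressed in frames (37 ms at the frame rate used)."""
    t = np.arange(-len(w) + 1, 1)
    f = np.exp(t / lam_frames) / lam_frames
    f /= f.sum()
    return np.array([w[:k + 1] @ f[-(k + 1):] for k in range(len(w))])
```

Because the filter is causal (nonzero only for *t* ≤ 0), each smoothed value depends only on current and past frames, so the smoothing itself cannot shift apparent corrections earlier in time.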

*w*_{t} = 0) using a bootstrap procedure in which we fit Equation 2 to resampled data from the no-perturbation trials with the perturbation (*p*_{t}) set randomly to ±1 on each trial. The standard deviation of the resulting bootstrapped estimates of the smoothed influence functions provides a measure of the standard deviation of weights expected under the null hypothesis that subjects did not correct; technically, it is the standard deviation of the values of *w*_{t} one would expect if subjects did not begin correcting at a time less than or equal to *t*. We used as our measure of reaction time the first time at which a smoothed perturbation influence function deviated from 0 by one standard error and remained more than one standard error away from 0 for more than 15 frames (125 ms). A reaction time greater than 40 frames (333 ms) indicated a late correction, and we considered the subject as not responding to the perturbation.
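The detection rule can be sketched as follows, assuming the bootstrap has already produced an array of smoothed influence functions refit with randomized ±1 perturbation labels; the 15-frame hold and 40-frame late cutoff are taken from the text (125 ms and 333 ms, implying roughly 8.3 ms per frame), while the function names are illustrative.

```python
import numpy as np

def null_sd(bootstrap_w):
    """Per-frame SD of smoothed influence functions refit with p_t randomized
    to +/-1 on no-perturbation trials: the SE expected under no correction.
    bootstrap_w: (resamples, frames) array of smoothed w_t under the null."""
    return bootstrap_w.std(axis=0, ddof=1)

def reaction_time(w_smooth, sd, hold=15, late=40):
    """First frame at which |w_t| exceeds one SE and stays beyond it for `hold`
    consecutive frames (125 ms). Returns None (no response) if that frame is
    later than `late` frames (333 ms) or never occurs."""
    above = np.abs(w_smooth) > sd
    for t in range(len(above) - hold + 1):
        if above[t:t + hold].all():
            return None if t > late else t
    return None
```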

*t*, *c*(*t*), was given by the time integral of *w*_{t}, *c*(*t*) = Σ_{i=0}^{t} *w*_{i}. The result gives a normed measure of subjects' responses, where a response of 1 corresponds to a 1-cm deviation of perturbed movements from movements on unperturbed trials. We, therefore, express subjects' responses in centimeters. The magnitude of the response at the end of the movement is computed by summing the weights up to the point at which subjects first touched the target.
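In discrete frames, the cumulative correction and the endpoint response reduce to a running sum of the influence weights; the sketch below assumes the weights and the touch time are indexed in frames from occluder exit.

```python
import numpy as np

def cumulative_correction(w, touch_frame):
    """c(t): running sum of the influence weights w_t; a value of 1 corresponds
    to a 1-cm deviation from unperturbed movements. The endpoint response is
    the sum of the weights up to the frame of first target contact."""
    c = np.cumsum(w)
    return c, c[touch_frame]
```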

*n* = 10) AR model in the analysis, though any model above 4th order gave similar results.

*SEM* 11 ms) and that to the rotation perturbation was 187 ms (*SEM* 12 ms). All subjects responded to the step perturbation in depth with an average reaction time of 230 ms (*SEM* 12 ms). Only 2 of the 8 subjects showed responses to the rotation perturbation in depth, and their average reaction time was 175 ms (*SEM* 13 ms).

*opposite* direction (for which subjects should correct in the same direction as the positional shift), responses to the perturbation were delayed by over 100 ms relative to the responses to simple rotation perturbations. Simulation results showed this pattern to be consistent with the known sensory noise parameters on position and velocity signals; that is, early in the movement, the sensory position signal contributes more strongly to internal estimates of finger position than do velocity signals, so that the motion feedback had to be doubled relative to the position feedback to effectively cancel out the responses.

*SEM* 12 ms) and that of the direction perturbation was 221 ms (*SEM* 19 ms; see Figure 8). The average correction to the in-image step perturbations was 0.73 cm, with a 95% confidence interval of [0.65, 0.81], and that to the in-image direction perturbations was 0.51 cm, with a 95% confidence interval of [0.45, 0.57]. Subjects' corrections to in-image step perturbations were larger than those found in Experiment 1, but this can be explained by the increased movement duration in Experiment 2 (777 ms, compared to 659 ms in Experiment 1). Subjects showed no significant response to the perturbations in depth (see Figures 8 and 9). Average corrections to the step and rotation perturbations in depth were 0.01 cm, with a 95% confidence interval of [−0.068, 0.088], and 0.03 cm, with a 95% confidence interval of [−0.048, 0.108], respectively. The 95% confidence bounds on these estimates put the maximal corrections at 0.088 and 0.108 cm; thus, while we cannot conclude definitively that subjects did not correct for the perturbations in depth, we can conclude that any corrections they made were very small in proportion to the size of the perturbations.