To analyze how the observed object orientation and task effects unfold over time, we looked into the binned evolution of x and y coordinates of eye fixations and hand positions. Eye fixation and index finger binned trajectories, averaged across subjects, are plotted in Figures 8 and 9.
Considering fixation locations along the horizontal axis, the eyes move towards the left side of the object when it is presented upside down, and this left-side bias grows over the time course of a trial. No obvious, systematic differences between the drink and the hand condition are observable, regardless of whether the object was presented upright or upside down. Along the vertical axis, in contrast, and particularly for the cup and the can, the eyes fixate lower in the upside down condition than in the upright condition, and this difference also tends to increase over time. For the bottle, however, this tendency is reversed, indicating that a grasp near the center of the bottle is generally preferred. More importantly, though, the contrast between the drink and the hand condition is clearly reflected: When handing over the bottle to the experimenter, the eyes move progressively lower when the bottle is presented upright and progressively higher when the bottle is presented upside down, compared to when the task is to pretend to drink out of the bottle.
To ascertain to what extent the eyes anticipate the hand, and especially the index finger, we finally correlated the binned eye and hand data along the x and the y axis. When computing a Pearson product-moment correlation coefficient to assess, on a trial-by-trial basis, the relationship between the height of the last fixation (in pixels) and the index finger height (in inches) at the moment the hand reaches the object, a strong positive correlation is found (r = 0.54, n = 716, p < 0.001; regression slope β = 0.005). For the horizontal dimension, a weaker but still significant positive correlation was found (r = 0.36, n = 716, p < 0.001; regression slope β = 0.014). These correlations are shown in Figure 11 in the Appendix. Two trials (one without video and one without fixations prior to the grasp) were discarded.
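As an illustration of the trial-by-trial analysis, the following minimal sketch computes a Pearson correlation coefficient and a least-squares regression slope from paired samples. The function name and the direction of the regression (index finger position regressed on fixation position) are assumptions made for this example; the analysis software actually used is not specified here.

```python
import math


def pearson_and_slope(x, y):
    """Return (r, slope) for paired samples.

    x: last-fixation coordinate per trial (hypothetical input, in pixels)
    y: index finger coordinate per trial (hypothetical input, in inches)
    The slope is that of y regressed on x, which is an assumption.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 trials")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    return r, slope
```

Because the two measures are expressed in different units (pixels and inches), the slope depends on the unit conversion, whereas r does not.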
To analyze how the correlation changed over time, we furthermore correlated each of the 10 eye bins with each of the 10 hand bins, for both the horizontal and the vertical dimension. The correlation matrices, along with their significance levels, are shown in Figure 10. Higher correlation values are concentrated in the lower part of the matrices, indicating that the eyes indeed anticipate the hand, since the eye fixations correlate with later hand positions from very early on.
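The bin-by-bin analysis can be sketched as follows, assuming that each trial provides one value per eye bin and one value per hand bin for a given dimension. The function names and the matrix layout (rows index eye bins, columns index hand bins) are assumptions for this illustration and need not match the layout of Figure 10.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equally long sequences (NaN if undefined)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else float("nan")


def cross_bin_correlations(eye_bins, hand_bins):
    """Correlate every eye bin with every hand bin across trials.

    eye_bins[t][i]: eye coordinate of trial t in eye bin i
    hand_bins[t][j]: hand coordinate of trial t in hand bin j
    Returns matrix[i][j] = r between eye bin i and hand bin j.
    """
    n_eye = len(eye_bins[0])
    n_hand = len(hand_bins[0])
    matrix = []
    for i in range(n_eye):
        eye_col = [trial[i] for trial in eye_bins]
        row = []
        for j in range(n_hand):
            hand_col = [trial[j] for trial in hand_bins]
            row.append(pearson_r(eye_col, hand_col))
        matrix.append(row)
    return matrix
```

In this layout, entries with j > i relate an eye bin to a later hand bin, so high values in those entries are what one would expect if the eyes anticipate the hand.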
For the horizontal dimension, the first highly significant correlation is reached in the third eye bin, which anticipates the sixth and seventh hand bins (both r = 0.28, n = 238, p < 0.001; regression slopes β = 0.21 and β = 0.27, respectively). As a bin covers a time interval of 133 ms on average, the anticipation of three to four bins corresponds to a time interval of at least about 400 ms (3 × 133 ms). All later eye bins correlated even more strongly with the sixth and later hand bins, all at p < 0.001. Note that the eyes may be expected to move on to the subsequent task, possibly fixating the experimenter to hand over the object, or elsewhere to prepare drinking. However, these fixations were not analyzed and, as stated above, eye bins were filled with eye fixation data from earlier bins when no further eye fixations were mapped onto the object. In this way, the correlations of the last eye bins with the last hand bins do not degrade.
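The two procedural points in this paragraph, carrying earlier fixation data forward into empty eye bins and converting a bin offset into a time lead, can be illustrated with a minimal sketch. The function names and the use of None to mark a bin without a fixation on the object are assumptions for this example; the default bin duration of 133 ms is the average reported above.

```python
def forward_fill(bins):
    """Fill empty eye bins (None) with the most recent earlier value.

    Leading empty bins stay None because there is no earlier fixation
    to carry forward.
    """
    filled = []
    last = None
    for value in bins:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def lead_in_ms(eye_bin, hand_bin, bin_ms=133.0):
    """Approximate time by which an eye bin precedes a hand bin."""
    return (hand_bin - eye_bin) * bin_ms
```

For instance, lead_in_ms(3, 6) and lead_in_ms(3, 7) give 399 ms and 532 ms, the three-bin and four-bin leads discussed above.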
Due to the larger range of hand positions along the vertical axis (the hand is initially positioned 30 cm lower than the object base), eye fixations significantly correlate with the later positions of the index finger from the very first fixations onward. The highest correlation was found between the seventh eye bin and the 10th hand bin (r = 0.66, n = 238, p < 0.001; regression slope β = 0.008), also indicating an anticipation of three bins in this case.
To further assess the relationship between eye and hand with respect to anticipating the upcoming task, we conducted the same repeated-measures ANOVA as for the grasp height, now on the fixation height in the seventh eye bin. The three-way ANOVA showed a main effect of object: Fixations were higher (lower values in pixels) for the bottle, M = 535, than for the can, M = 580, and the cup, M = 618, F(1.52, 28.85) = 26.14, p < 0.001. As to orientation, the upright objects were fixated higher, M = 535, than the upside down objects, M = 620, F(1, 19) = 14.44, p = 0.001, as in the grasp case. Interactions were found for task and orientation, F(1, 19) = 20.85, p < 0.001, object and orientation, F(2, 38) = 23.62, p < 0.001, and all three factors, F(2, 38) = 5.89, p = 0.006. Following up with two-way ANOVAs for every object, the bottle presented no main effect of task or orientation but an interaction of the two factors, F(1, 19) = 26.90, p < 0.001. t tests show that the bottle was fixated higher in the drink upright condition than in the hand upright condition, t(19) = −4.43, p < 0.001, and lower in the drink upside down condition than in the hand upside down condition, t(19) = 2.66, p = 0.032. The cup, perhaps due to its small size, presented only a main effect of orientation, F(1, 19) = 18.10, p < 0.001, being fixated higher in the upright than in the inverted condition. Finally, the can presented a main effect of orientation, F(1, 19) = 20.94, p < 0.001, and an interaction effect, F(1, 19) = 20.22, p < 0.001. The can was fixated higher in the upright than in the inverted condition. t tests also show that the can was fixated higher in the drink upright than in the hand upright condition, t(19) = −4.03, p = 0.002. All these effects are consistent with the effects obtained for the grasp height data.
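The pairwise follow-up comparisons have the form of paired t tests across subjects, with t(19) corresponding to 20 paired observations. The following minimal sketch computes the paired t statistic and its degrees of freedom; the function name and input layout (one mean fixation height per subject and condition) are assumptions for this illustration, and the p value, including any correction for multiple comparisons, is not computed here.

```python
import math


def paired_t(a, b):
    """Return (t, df) for a paired t test on per-subject values a and b.

    a[s] and b[s] are the values of subject s in the two conditions
    being compared (hypothetical inputs).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("a and b must be paired and contain at least 2 subjects")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased sample variance of the paired differences
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("t is undefined when all differences are identical")
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1
```

Since fixation height is measured in pixels with lower values denoting higher positions, a negative t for a minus b means that condition a was fixated higher than condition b.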