The image information guiding visual behavior is acquired and maintained in an interplay of gaze shifts and visual short-term memory (VSTM). If storage capacity of VSTM is exhausted, gaze shifts can be used to regain information not currently represented in memory. By varying the separation between relevant image regions, S. Inamdar and M. Pomplun (2003) demonstrated a trade-off between VSTM storage and gaze shifts, which were performed as pure eye movements, that is, without a head movement component. Here we extend this paradigm to larger gaze shifts involving both eye and head movements. We use a comparative visual search paradigm with two relevant image regions and region separation as independent variable. Image regions were defined by two cupboards displaying colored geometrical objects in roughly equal arrangements. Subjects were asked to find differences in the arrangement of the objects in the two cupboards. Cupboard separation was varied between 30° and 120°. Images were presented with two projectors on a 150° × 70° curved screen. Head and eye movements were simultaneously recorded with an ART head tracker and an ASL mobile eye tracker, respectively. In the large separation conditions, the number of gaze shifts between the two cupboards was reduced, while fixation duration increased. Furthermore, the head movement proportions negatively correlated with the number of gaze shifts and positively correlated with fixation duration. We conclude that the visual system uses increased VSTM involvement to avoid gaze movements and in particular movements of the head. Scan path analysis revealed two subject-specific strategies (encode left, compare right, and vice versa), which were consistently used in all separation conditions.

$\alpha$ and $\beta$, respectively) relative to the point of origin (see Figure 2). Thus, the gaze vector includes both the head-in-space and the eye-in-head vectors. Gaze fixations were defined in the following way: For each time step $t_o$, we consider a gliding window of length 120 ms centered at $t_o$. Let $v_{\min}$ and $v_{\max}$ denote minimal and maximal gaze velocities obtained within the window. The instant $t_o$ is classified as belonging to a fixation if $v_{\max} - v_{\min} < 100$ deg/s. This procedure is iterated through all time steps. Adjacent instants in time satisfying the condition are combined to fixational events.
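The fixation criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `detect_fixations`, the assumption that gaze speed is already available as one sample per timestamp, the truncation of the window at the start and end of the recording, and the 10 ms sampling step in the usage lines are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def detect_fixations(
    t_ms: Sequence[float],
    v_deg_s: Sequence[float],
    window_ms: float = 120.0,
    threshold_deg_s: float = 100.0,
) -> List[Tuple[float, float]]:
    """Return (start_ms, end_ms) pairs of fixational events.

    t_ms    : sample timestamps in ms, sorted ascending
    v_deg_s : gaze speed at each timestamp in deg/s
    """
    half = window_ms / 2.0
    is_fix = []
    for t0 in t_ms:
        # Speeds inside the window centered at t0 (assumption: the window
        # is simply truncated at the edges of the recording).
        window = [v for tj, v in zip(t_ms, v_deg_s) if abs(tj - t0) <= half]
        is_fix.append(max(window) - min(window) < threshold_deg_s)

    # Combine adjacent instants that satisfy the criterion into events.
    events: List[Tuple[float, float]] = []
    start = None
    for i, flag in enumerate(is_fix):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            events.append((t_ms[start], t_ms[i - 1]))
            start = None
    if start is not None:
        events.append((t_ms[start], t_ms[-1]))
    return events


# Hypothetical trace: slow, fast (a gaze shift), slow; 10 ms sampling step.
times = [10.0 * i for i in range(45)]
speeds = [10.0] * 20 + [300.0] * 5 + [10.0] * 20
events = detect_fixations(times, speeds)
print(events)
```

Because the criterion looks at the velocity range inside the whole window, samples within half a window of the fast segment are excluded from the neighboring fixations even though their own speed is low.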

$F(3,33) = 1.416$, $\mathit{MSE} = 9.84$, $p = .256$.

$F(3,33) = 33.186$, $\mathit{MSE} = 4.754$, $p < .001$, $\eta_p^2 = .751$ (Figure 3). The response time increased linearly by about 2 s for each 30° step of cupboard distance.

$\alpha = 0^\circ$ and $\beta = 0^\circ$). This was ensured by the calibration cross fixation phase before starting an experimental trial. After both cupboards became visible, all investigated subjects shifted their gaze to the upper left part of the left cupboard (cf. Figure 2 and Movie 1). This was followed by oscillating gaze changes between the left and right hemifield, including shelf level shifts down to the lowest part of the cupboards. This general search pattern could be observed for all participants.

$r = .879$ (see Figure 5). The clustered distribution of the data points in this figure reflects the four cupboard distance conditions.

$F(3,33) = 75.341$, $\mathit{MSE} = 3.714$, $p < .001$, $\eta_p^2 = .873$. Subjects performed approximately 10 gaze shifts in the largest cupboard distance condition, roughly half the number performed in the 30° condition.

$F(3,33) = 9.331$, $\mathit{MSE} = 137.01$, $p < .01$, $\eta_p^2 = .459$. On average, subjects fixated about 25 ms longer in the most “expensive” condition of this experiment than in the relatively “inexpensive” 30° cupboard distance condition. Figure 6B1 shows the fixation duration averaged over all trials per distance condition separately for each subject. All subjects made longer fixations as the distance between the cupboards increased. Both the fixation duration and the number of inter-hemifield saccades over all subjects tend to saturate for large hemifield separations.

$F(3,33) = 0.974$, $\mathit{MSE} = 11.606$, $p = .417$. The same holds for the individual subjects, although with large variation across subjects (Figure 6C1): the number of fixations needed for the search task ranged from about 25 to about 65. Still, each subject showed a constant number of fixations over all distance conditions.

$t$ tests performed over the correlation coefficients of all 12 subjects revealed significant differences from zero correlation; for details, see Figure 8. Correlations of maximum head amplitude with the number of inter-hemifield saccades were negative for 10 of 12 subjects in both separation conditions (Figure 8, left). Correlations of maximum head amplitude with the fixation duration were positive for 11 of 12 subjects in both separation conditions (Figure 8, right). These data indicate that, for a fixed hemifield separation, larger head proportions correlate with a smaller number of inter-hemifield saccades and with longer fixation durations.
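The two-stage analysis in this paragraph (one correlation coefficient per subject, then a $t$ test of those coefficients against zero) can be sketched as follows. This is an illustrative sketch only: the text does not specify the exact test variant, so a one-sample $t$ statistic on the raw coefficients is assumed, the helper names are hypothetical, and the numbers in the usage lines are made up for demonstration rather than taken from the study.

```python
import math
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_sample_t(values: Sequence[float], mu0: float = 0.0) -> Tuple[float, int]:
    """t statistic and degrees of freedom for H0: mean(values) == mu0."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return (mean - mu0) / math.sqrt(var / n), n - 1


# Made-up per-trial data for one hypothetical subject:
head_amplitude = [10.0, 20.0, 30.0, 40.0]   # maximum head amplitude (deg)
n_gaze_shifts = [12.0, 11.0, 9.0, 8.0]      # inter-hemifield saccades
r_subject = pearson_r(head_amplitude, n_gaze_shifts)

# Made-up correlation coefficients for three hypothetical subjects:
t_stat, dof = one_sample_t([-0.5, -0.3, -0.4])
print(r_subject, t_stat, dof)
```

A $p$ value would follow from comparing the statistic with a $t$ distribution with the returned degrees of freedom, for example via a statistics library.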

$F(11,384) = 52.52$, $\mathit{MSE} = 46.8$, $p < .001$, $\eta_p^2 = .6$. The post hoc analysis produced a significant difference between the right hemifield subject group and the left hemifield subject group ($p < .01$).

$t$ tests revealed significant differences between the fixation duration in the left and in the right hemifield for all cupboard distance conditions except for the one case marked in Figure 10. The main effect of increased fixation duration with increased cupboard distance was still visible in all four distance conditions. In addition, the left hemifield subjects' fixation durations were always longer than those of the right hemifield subjects.

$n$, needed to solve the task equals half the number of inter-hemifield gaze shifts. The amount of information processed during each cycle equals $I = 1/n$, where the total information is arbitrarily set to unity. We now assume that the number of cycles, $n$, is adjusted to minimize the total cost of processing. One possible measure of costs is the time needed to solve the task as suggested, for example, by Gray and Fu (2004) and Gray et al. (2006). This measure is consistent with the instruction given to our subjects, that is, to solve the task as quickly and reliably as possible. Because the error rate was generally low, it seems that time is indeed the most important factor. With respect to the processing cycle, we need to distinguish two time variables. First, the processing time per cycle, $T_p(I)$, equals the time needed for the encoding and comparison steps; it is assumed to depend on the size $I = 1/n$ of the information chunk, but not on hemifield separation. We choose a power law for $T_p(I)$, that is, $T_p(I) = b \cdot I^{\alpha}$ with constants $b$, $\alpha$. Second, shift duration per cycle, $T_s(\varphi)$, is assumed to depend on hemifield separation $\varphi$, but not on chunk size $I$. Note that the dependence of $T_s$ on $\varphi$ is taken simply as an empirical fact as reported in Figure 7. It may result from the varying relative contributions of eye and head movements, both in terms of mechanical properties and in terms of neural planning effort for the compound movement. With the above notations, we can compute the total time needed to carry out the comparison in $n$ steps:

$$T_{\mathrm{tot}} = n \left[ T_p(1/n) + T_s(\varphi) \right] = b\, n^{1-\alpha} + n\, T_s(\varphi),$$

where the second equality uses the power law for $T_p$. The idea of the model is that the trade-off results from minimizing $T_{\mathrm{tot}}$ with respect to $n$. Taking the derivative with respect to $n$ and setting the result to zero yields

$$n = \left( \frac{c}{T_s(\varphi)} \right)^{1/\alpha}, \tag{3}$$

where $c = b(\alpha - 1)$ is a constant depending on $b$ and $\alpha$. Figure 7 shows the relation of $n$ and $T_s$ from our empirical data together with the theoretical curve given by Equation 3. If $T_s$ is measured in seconds, the fitting parameters are $c = 29.5$ and $\alpha = 1.47$.
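The minimization argument can be checked numerically. The sketch below assumes the total time is $n$ cycles, each costing one processing time $T_p(1/n) = b\,n^{-\alpha}$ plus one shift duration $T_s$, as the definitions in the text suggest; setting the derivative to zero then gives $n = \left(b(\alpha - 1)/T_s\right)^{1/\alpha}$, which requires $\alpha > 1$. The function names, the value of `B`, and the value of `SHIFT_TIME` are hypothetical; only the exponent 1.47 is taken from the reported fit, and the grouping of constants here may differ from the one used for the fitted $c$.

```python
def processing_time(info: float, b: float, alpha: float) -> float:
    """Power-law processing time per cycle, T_p(I) = b * I**alpha."""
    return b * info ** alpha


def total_time(n: float, shift_time: float, b: float, alpha: float) -> float:
    """Total time for n cycles, each handling an information chunk of 1/n."""
    return n * (processing_time(1.0 / n, b, alpha) + shift_time)


def optimal_cycles(shift_time: float, b: float, alpha: float) -> float:
    """Closed-form minimizer of total_time with respect to n."""
    if alpha <= 1.0:
        # For alpha <= 1 the total time has no interior minimum.
        raise ValueError("an interior minimum requires alpha > 1")
    return (b * (alpha - 1.0) / shift_time) ** (1.0 / alpha)


ALPHA = 1.47       # exponent reported in the text
B = 60.0           # hypothetical scale constant (s), for illustration only
SHIFT_TIME = 1.0   # hypothetical shift duration per cycle (s)

n_closed = optimal_cycles(SHIFT_TIME, B, ALPHA)

# Brute-force check: evaluate the total time on a fine grid of n values.
grid = [1.0 + 0.01 * k for k in range(4901)]
n_grid = min(grid, key=lambda n: total_time(n, SHIFT_TIME, B, ALPHA))

print(f"closed form: {n_closed:.2f}, grid search: {n_grid:.2f}")
```

Under these assumptions, a longer shift duration lowers the optimal number of cycles, which is the qualitative trade-off described in the text: fewer gaze shifts, each serving a larger memorized chunk.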

$T_p$ depends on the size of the encoded information chunk? One possible explanation for this is that the encoding of additional information may become more error prone as the amount of information already encoded increases. In this situation, additional fixations within one hemifield might be necessary, resulting in a longer processing time $T_p$. Indeed, the total number of fixations was found to be constant across all separation conditions (see Figure 6C), indicating that the number of intra-hemifield saccades increases as the number of inter-hemifield gaze shifts decreases. We therefore suggest that working memory involvement increases with hemifield separation and that encoding time increases in a nonlinear way with the amount of encoded information.