Given that the observed central fixation bias cannot be explained in terms of motor biases in the oculomotor system, I will now consider whether the central bias in fixation behavior arises from the distribution of features in the images.
To test whether the central bias in fixation behavior arises from central biases in the feature distributions within scenes, scenes in which image features were biased toward the center of the image (
Figure 4A,
N = 63 images) were compared to those in which image features were biased toward the periphery of the images (
Figure 4B,
N = 57 images). When observers either freely viewed scenes or searched for a luminance target within scenes, there was a central bias in fixation behavior, which occurred both when the features in the image were more prevalent in the center of the scenes being viewed and when they were more prevalent at the margins of the scene. Figure 4 demonstrates that the central fixation tendency observed in many scene viewing experiments is not simply a result of central biases in the features present in the scenes. By taking horizontal and vertical cross sections through the distributions (
Figures 4C and
4D), it is clear how little difference the distribution of image features made to the distribution of fixations. There was some evidence of a decrease in the magnitude of the central fixation bias for scenes with peripheral biases in their feature distributions (
Figures 4C and
4D), but the distributions overlapped considerably.
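As an illustration of how such cross sections could be obtained, the following is a minimal sketch rather than the procedure actually used here. It assumes fixations are given as (x, y) screen coordinates, that the distribution is approximated by counts in square bins, and that the cross sections are taken through the central row and column of the grid; the function names, the bin size, and the choice of central slices are all hypothetical.

```python
def fixation_histogram(fixations, width, height, bin_size):
    """Count fixations in a grid of square bins covering the screen.

    Assumption: fixations are (x, y) pairs in the same units as width,
    height, and bin_size, with the origin at the top-left of the screen.
    """
    n_cols = int(width // bin_size)
    n_rows = int(height // bin_size)
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in fixations:
        col = int(x // bin_size)
        row = int(y // bin_size)
        # Ignore fixations that fall outside the screen area.
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] += 1
    return grid


def cross_sections(grid):
    """Return the profiles through the central row and the central column."""
    mid_row = len(grid) // 2
    mid_col = len(grid[0]) // 2
    horizontal = list(grid[mid_row])            # counts along the central row
    vertical = [row[mid_col] for row in grid]   # counts along the central column
    return horizontal, vertical
```

Computing these two profiles separately for the centrally and peripherally biased image sets would allow them to be overlaid, which is one way of seeing how far the two distributions overlap.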
In order to assess whether the task or bias in image features influenced the overall spatial extent of the observers' fixations in the scenes, the variances of fixation locations (expressed in terms of distance from screen center for each fixation) were compared (for a similar approach to assessing the spatial extent of fixation distributions, see Crundall & Underwood,
1998). A 2 (task) × 2 (central/peripheral feature bias) mixed design ANOVA showed no main effects of task or feature bias upon the variance of fixation locations. Thus, neither the bias of image features toward the center or periphery of the images nor the task of the observer influenced the overall spatial extent of the fixations made by viewers. However, there was a significant interaction between task and feature bias,
F(1, 50) = 32.26,
p < .001. Bonferroni-corrected post hoc
t-tests showed that when freely viewing the scenes, fixation location variance was higher for scenes with peripheral image feature biases than for scenes with central feature biases (
p = .012). Conversely, when searching the scenes, variance in fixation location was higher for scenes with centrally biased features than for scenes with peripherally biased features (
p < .001). Thus, while having more prevalent image features in the periphery of the scenes did slightly increase the spatial variance of the fixation distributions in the free viewing task, it had the opposite effect for the search task.
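The spatial extent measure described above can be sketched as follows. This is an illustrative reading of the measure, not the analysis code used in the study: it assumes that each fixation is first converted to its Euclidean distance from screen center and that the sample variance of those distances is then taken for each observer and condition. The function names are hypothetical.

```python
import math
import statistics


def distances_from_center(fixations, center):
    """Euclidean distance of each (x, y) fixation from the screen center."""
    cx, cy = center
    return [math.hypot(x - cx, y - cy) for x, y in fixations]


def spatial_extent(fixations, center):
    """Sample variance of fixation distances from screen center.

    Assumption: the sample (n - 1) variance is used; the source does not
    state which estimator was applied.
    """
    return statistics.variance(distances_from_center(fixations, center))
```

One such value per observer and per image category would then form the dependent variable entered into the 2 × 2 mixed design ANOVA.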
While the above comparison of images with central and peripheral biases in their image feature distributions clearly suggests that the central fixation bias prevails even in the absence of a central bias in image features, the fact that there was some suggestion of a change in the magnitude of the central fixation bias under these circumstances warrants further consideration. Any shift in the fixation distribution that was contingent upon the distribution of image features may be easier to see if alternative categorizations of the images in terms of their feature distributions are employed. As a result, fixation behavior was compared in scenes (1) with image features biased toward the left of the scene or the right of the scene and (2) with image features biased toward the top or bottom of the scene.
Figure 5 shows fixation behavior when viewing scenes with features biased to either the left (
Figure 5A,
N = 57 images) or right (
Figure 5B,
N = 63 images) of the scenes.
The central fixation bias was clear irrespective of the image feature distribution, and the fixation distributions when looking at scenes with features predominantly on the left were very similar to those when viewing scenes with features predominantly on the right of the image. In order to assess whether there was any shift in the fixation locations in the direction of the image feature biases in the scenes, the mean horizontal locations for all fixations that lay within 5° of the vertical midline of the screen were compared. A 2 (task) × 2 (left/right feature bias) mixed design ANOVA showed a significant interaction between task and feature bias,
F(1, 50) = 6.70,
p = .013. Bonferroni-corrected post hoc
t-tests showed that in the search task, the distribution of image features influenced the horizontal mean fixation location (
p < .001), and this was in the same direction as the image feature biases (
Figure 5C). However, this trend was not evident in the free viewing data: There was no significant difference in the mean horizontal location of fixations for scenes with features biased toward the left or right (
p = .964;
Figure 5C).
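The horizontal measure used in this comparison can be sketched as follows. The sketch assumes that fixation coordinates are already expressed in degrees of visual angle, that the 5° criterion is inclusive, and that the mean is reported relative to the vertical midline; the function name and these details are assumptions rather than details given in the text.

```python
def mean_horizontal_in_band(fixations, midline_x, half_width_deg=5.0):
    """Mean horizontal offset of fixations lying within a band around the midline.

    fixations: (x, y) pairs in degrees of visual angle (assumed).
    Returns the mean signed offset from the vertical midline, or None when
    no fixation falls inside the band.
    """
    offsets = [x - midline_x for x, _ in fixations
               if abs(x - midline_x) <= half_width_deg]
    if not offsets:
        return None
    return sum(offsets) / len(offsets)
```

With this sign convention, a negative value indicates a mean fixation location to the left of the midline and a positive value one to the right. The same function applied to vertical coordinates would give the corresponding measure for the top/bottom comparison.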
This trend toward a greater correlation between the image features and the fixation location distribution in the search task than in free viewing can be interpreted in at least two ways. First, it could be argued that image features played a more prominent role in selecting where to fixate in the search task than they did when freely viewing the same scenes. Such an account of fixation behavior may not be surprising given that the search task required observers to locate a target defined only in terms of its luminance: This may promote the selection of low-level image features in this task. Indeed, any purely low-level account of fixation selection would predict this result. For example, the weighted salience framework (Itti & Koch,
2000; Peters, Iyer, Itti, & Koch,
2005) would predict a greater influence of a particular feature on viewing when the target is defined by that feature.
Alternatively, it may be that the luminance target is hardest to locate when superimposed on cluttered regions of the scene. These regions would therefore be expected to require the most scrutiny by the observer and so attract a large number of fixations. Since cluttered regions of scenes are likely to have the highest image feature content, fixation distributions would be shifted in the direction of feature biases without the image features themselves playing any causal role.
However, whatever the explanation of the shift in fixation distributions in the search task, it should be noted that the magnitude of the shift in the fixation distribution was very small.
Figure 6 shows fixation behavior when viewing scenes where the image features were biased either toward the top of the image (
Figure 6A,
N = 66 images) or toward the bottom of the image (
Figure 6B,
N = 54 images).
Once again the central fixation bias was evident irrespective of the image feature bias or task. A 2 (task) × 2 (top/bottom feature bias) mixed design ANOVA was run to assess whether the vertical distribution of image features influenced the vertical mean of fixation locations. For each observer, the mean vertical fixation location was calculated for all fixations within 5° of the horizontal midline of the screen. There was a significant interaction between task and feature bias,
F(1, 50) = 19.40,
p < .001. Bonferroni-corrected post hoc
t-tests showed that in the free viewing condition, the vertical distribution of fixations did not differ according to the distribution of image features (
p = .159;
Figure 6D). However, in the search task, the vertical mean of the distribution of fixation locations did differ according to the image feature bias (
p < .001): There was a shift of the distribution of fixations toward the top of the scene when image features were more prevalent in the upper half of the image (
Figure 6D). This shift did not appear to be mirrored by a downward shift in the distribution of fixations when image features were biased toward the lower half of the scene: Here the distribution remained centered on the vertical midline of the image.
As was found when the images were divided according to the horizontal distribution of image features, the vertical distribution of features appeared to influence fixation behavior more when observers were searching for a luminance target than when they were freely viewing the scenes. Once again, however, the magnitude of the effect of image feature distributions upon fixation distributions was small.
Dividing the images according to the distribution of image features makes it clear that the bias in the image feature distributions had very little impact upon fixation behavior. A strong central tendency in fixation behavior was seen even when image features were biased toward the periphery of images. This anti-correlation between image features and fixations is compelling evidence against the suggestion that central biases in the distribution of image features in scenes underlie the central fixation bias. There are two important implications of the lack of strong correlation between the distribution of image features and the distribution of fixations made by human observers.
First, it is clear that image features play a relatively minor role in determining the overall distribution of fixation locations. Clearly, the majority of the observed oculomotor behavior resulted from factors other than the location of low-level image features in the scenes. As such, there was no evidence to support the notion that human fixation behavior was particularly closely correlated with image features in the scenes. The inadequacy of purely low-level accounts of fixation selection in explaining human eye movement behavior is becoming increasingly evident in the literature. The recent development of a framework in which feature selection is modulated by contextual priors generated from scene gist information (Torralba et al.,
2006) demonstrated that a purely feature-based model was poor at accounting for fixation selection. However, a model combining feature selection and contextual information provided a much better account of where observers fixated. Even the popular weighted salience account of selection, in which top-down modulation can be manifest in terms of selectively weighting the feature maps (Itti & Koch,
2000; Peters et al.,
2005), fails to account for certain aspects of human performance (Vincent, Troscianko, & Gilchrist,
2007): For example, it predicts that all feature and conjunction searches should be maximally efficient, yet human observers are not. Clearly, frameworks that are based solely on image features struggle to account for human fixation behavior.
Second, any small influence of the distribution of features upon oculomotor behavior was task-dependent. When images were split according to the horizontal or vertical bias in the image features, there was some evidence of a shift of the distribution of fixations in the direction of the bias in the image features, but this was only the case when participants were searching for a luminance target in the scenes. When freely viewing images, any vertical or horizontal bias in the distribution of image features had no substantial influence upon fixation behavior. The task dependence of this influence further underlines the suggestion that there is not a fundamental causal link between the image features in the scenes and the selection of locations for fixation by the observers. Such task dependence of fixation selection is consistent with studies of vision under natural settings (e.g., Hayhoe et al.,
2003; Land & Hayhoe,
2001), with previous studies of the correlation between features and fixation in free viewing and search tasks (Underwood & Foulsham,
2006; Underwood et al.,
2006), and with recent modifications to feature-based models of fixation selection in order to include higher level factors (Torralba et al.,
2006).
Given that the data argue against the role of oculomotor biases or image features in giving rise to the observed fixation behavior, it would appear that the center of the scene must offer some strategic advantage to the observer that favors fixating this location.
First, the center of the screen may be an optimal location for extracting information from scenes. Recent accounts have suggested that eye movements may target regions that are maximally informative to the viewer (Najemnik & Geisler,
2005; Raj, Geisler, Frazor, & Bovik,
2005) or that reduce the uncertainty of the object or scene being inspected (Renninger, Verghese, & Coughlan,
2007). It may therefore be that when viewing complex natural scenes, the center of the screen serves as a highly informative location or a location that reduces uncertainty more effectively than other locations in the scene.
Second, it may be that the center of the screen offers no information processing benefit but is an optimal location from which to subsequently explore the scene. In this way, the tendency to look to the center of images may simply be an orienting response and may be independent of the image features in the scene, the observer's high-level task goals, or the informativeness of the central portion of the scene. The existence of such localizing or orienting responses has been suggested by Renninger and colleagues (
2007): When viewing object silhouettes, the initial saccade was made reliably to the center of the objects, but subsequent fixations were distributed around the margins of the silhouetted objects. Renninger et al. suggested that while later fixations served to maximize global information or reduce local uncertainty, the initial saccade to the center of the object could not be explained in this way. Instead Renninger and colleagues argued that these initial saccades were a localizing response from which subsequent exploration of the object ensued.
Third, it may be that the central bias is not a bias toward the center of the scene
per se, but rather a bias toward centering the eyeball within its orbit. A bias toward centering the eyeball in its orbit has been demonstrated in previous research (Fuller,
1996; Zambarbieri, Beltrami, & Versino,
1995). This bias is reflected in a shorter latency of saccades that bring the eyeball toward the center of the orbit than saccades that take the eyeball from one eccentric location to another. Paré and Munoz (
2001) argued that this centering bias offers an optimal visual strategy for the observer: This is the optimal orbital position from which to make eye movements to explore the visual surroundings. Since observers in the present study were seated facing the center of the monitor on which scenes were displayed, and head stability was not recorded, it is not possible to dissociate any bias toward the center of the screen from any re-centering bias that favors bringing the eyeball back to the center of its orbit. However, in a previous study of the central fixation bias in reading, it was found that it was the screen center rather than the straight-ahead position (hence the orbital center) that produced the observed central fixation tendencies (Vitu, Kapoula, Lancelin, & Lavigne,
2004): When the screen was displaced from the straight-ahead position of the observer, fixations remained biased toward the center of the screen, not the center of the orbit.
While it is not easy to discriminate between the above possible interpretations of the observed central bias in fixation behavior, one possible way to distinguish these accounts is to consider whether the central fixation bias changes over the course of viewing the scene for several seconds. If the bias arises from an initial orienting to select an optimal location from which to subsequently explore the scenes, the central bias should only be seen in the first (or possibly first few) fixations on the scene; subsequent fixations should not require this centering in the scene. A similar argument can be made if the scene center offers maximal information or uncertainty reduction: The initial benefit for fixating this location ought to promote a strong central bias at the start of viewing but not later on in viewing. However, given such an information theoretic account of the optimality of the screen center, it could be argued that the central bias ought to be less heavily restricted to the first saccade than if the tendency to look to the middle of the screen is a simple orienting response: It may be that the center serves to provide added information later in viewing as well as at the start of viewing. Finally, if the central fixation bias arises from the tendency that observers display to re-center the eyeball in the socket, this central fixation bias should be seen throughout viewing: There should be frequent re-centering movements after exploring peripheral locations in the scene.
Figure 7 shows the distributions of fixations as a function of the ordinal fixation number in each sequence when freely viewing or searching the natural images, for scenes with either a central or peripheral bias in their image feature distributions.
In the free viewing data (
Figure 7 left), the central fixation bias was present throughout the first 12 fixations but diminished in magnitude as viewing progressed. Whether the images being viewed had central or peripheral biases in the distribution of image features seemed to make little difference to the distribution of locations for each ordinal fixation in the sequence for the free viewing condition. A very different pattern of results was found when observers were searching for a luminance target (
Figure 7 right). The observers showed a strong initial centering response, moving toward the center of the image on their first fixation. Thereafter, differences began to emerge in the fixation distributions that were contingent on the image feature distributions. The kernel density estimates of the fixation distributions suggest that when image features were biased toward the center of the scene, fixations remained mainly clustered around the center of the image. Conversely, when image feature distributions were biased to the periphery of the scenes, fixations tended to become more prevalent in the peripheral locations in the images.
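A simple numerical companion to these per-fixation distributions is the mean distance from screen center at each ordinal fixation. The sketch below is a stand-in summary and not the kernel density estimation used for the figures; it assumes each trial is a list of (x, y) fixations in viewing order, and the function name is hypothetical.

```python
import math
from collections import defaultdict


def mean_eccentricity_by_ordinal(trials, center):
    """Mean distance from screen center for each ordinal fixation number.

    trials: list of fixation sequences, each a list of (x, y) pairs in
    viewing order (assumed). Ordinal numbering starts at 1.
    """
    cx, cy = center
    by_ordinal = defaultdict(list)
    for sequence in trials:
        for ordinal, (x, y) in enumerate(sequence, start=1):
            by_ordinal[ordinal].append(math.hypot(x - cx, y - cy))
    # Average over all trials that contain a fixation at that ordinal position.
    return {k: sum(v) / len(v) for k, v in sorted(by_ordinal.items())}
```

Under this summary, a central bias confined to the start of viewing would appear as low values for the first one or two fixations followed by a rise, whereas a persistent bias would appear as values that remain low throughout the sequence.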
Once again, alternative divisions of the images on the basis of the distribution of features were explored.
Figure 8 shows the distributions of fixations when viewing scenes with image features biased either to the left or the right of center.
Figure 9 shows the distributions of fixations when viewing scenes with image features biased either toward the top or the bottom of the scene. In all cases, there was little correlation between the distributions of features and fixations made by observers when freely viewing the scenes, although there was some suggestion of a downward shift in the distribution for images with features biased toward the lower half of the scene for fixations 2 to 4. When searching the images for a luminance target, there was a clearer association between the distribution of locations for each fixation in the sequence and the distribution of image features. However, this association was most evident in the first few fixations after the initial orienting to the screen center (mainly in fixations 2 to 4).
Taken together, the data suggest that both task and the distribution of image features interacted to exert an influence on fixation locations as viewing progressed. In all cases, there was a strong initial centering response: When the scene appeared, irrespective of the task or the distribution of image features, the initial response of the observer was to move their eyes to the middle of the scene. This task- and image feature-independent initial response implies an initial orienting response when faced with a new visual scene, as has been suggested to occur when isolated objects are presented (Renninger et al.,
2007). Looking at the center of the screen may also be advantageous for rapidly extracting the gist of the scene at the start of viewing: The gist of a scene is extracted very rapidly from images (e.g., Biederman,
1981; Intraub,
1980,
1981). Within Torralba et al.'s (
2006) contextual guidance model of fixation selection, the contextual priors are constructed by extracting global scene features within the first few hundred milliseconds of viewing. It may be that the optimal location for extracting this global information is the screen center and as such this initial orienting response serves the construction of contextual priors to aid subsequent oculomotor exploration of the scene. Alternatively, the center of the screen may simply be a good place to begin further exploration of the scene.
After initially orienting to the center of the screen, the distribution of subsequent fixations depended upon the task of the observer. When freely viewing scenes, image feature distributions had no influence on fixation distributions and a central tendency persisted throughout viewing. As such, there was no strong evidence for a causal link between image feature distributions and fixation distributions. The persistence of a central tendency, but of a lower magnitude than the original orienting response, suggests either that the screen center maintained a privileged place in viewing, or that the eyeball was being continually re-centered in its orbit when there was no task to override this re-centering tendency (e.g., Paré & Munoz,
2001). From the present data, these two possibilities cannot be dissociated.
Image feature distributions did influence fixation distributions in the search task: From the second fixation, fixation distributions tended toward the distribution of image features. Thus, when the observers' task was to search for a target defined only in terms of luminance, the distribution of fixations showed a stronger correlation with the distribution of features. The association between distributions of features and fixations was clearest early in viewing, for fixations 2 to 4. A stronger association between features and fixations early in viewing than later in viewing has been reported previously (e.g., Carmi & Itti,
2006; Parkhurst et al.,
2002); this is in contrast to Tatler et al.'s (
2005) suggestion that the strength of association between features and fixations does not vary over the course of viewing a scene. The present data suggest that whether or not features are more strongly correlated with early fixations than later fixations may depend upon the observers' task. However, as stated earlier, the causal factors behind this correlation cannot be determined: It may be that image features are more involved in fixation selection in the search task, or it may be that other factors such as the difficulty of locating the target against cluttered regions of the scene result in the observed correlation.
In the search task, the central fixation tendency that persisted throughout viewing in the free viewing condition rapidly dissipated: From the third fixation, there was little evidence for a central fixation tendency in the observers. As such, this suggests that the center offers no benefit to the viewer in completing their search task, and also that there was no pronounced re-centering of the eye in its orbit during this task.