Abstract
When we speak of fixation selection, we typically state that eye movements are made to salient objects or locations in the visual scene. The classic saliency model of Itti & Koch (2000) is well grounded in a biologically plausible framework that captures much of what we know about the low-level visual system (retina to primary visual cortex). As an account of bottom-up fixation selection, it is a reasonable predictor of fixations in "task-free" situations. However, as demonstrated in the classic work of Yarbus (1967), task plays a critical role in fixation selection. We therefore propose instead that eye movements are made to locations where task-relevant information is most uncertain. From an algorithmic perspective, inhibition of return to a specific location is implicit under this account, because there is no need to revisit a location once we are certain about its task-relevant properties.
While we tracked their eye movements, observers had to use orientation information along a contour to learn a novel shape in a brief amount of time, before using this information in a shape matching task. Predictions from both the saliency and uncertainty theories were compared against human fixations. Areas under ROC curves indicate that a "local uncertainty" rule is the best predictor of fixated locations during shape learning. The degree to which an observer follows this strategy (i.e., their fixation selection "efficiency") predicts their overall performance in the shape matching task. Similar results have been found for single- and multiple-target search tasks, suggesting that uncertainty reduction may be a generalized strategy for collecting task-dependent visual information with eye movements. Furthermore, individuals who can learn efficient fixation placement may perform better on visual tasks.