Abstract
The factors that guide eye-movements during the spontaneous exploration of natural scenes are currently a matter of heated debate. Two main candidates have been put forward: low-level image features constituting bottom-up saliency and high-level object representations acting in a top-down manner. Saliency models are successful in predicting human eye-movements, a finding that has been argued to indicate oculomotor guidance by low-level features. Alternatively, this success might result from the fact that low-level features and object locations are confounded in natural scenes. Another difficulty in resolving this debate is that the contributions of both factors might change over time: bottom-up guidance might prevail initially, with top-down factors taking over later. The current study contributes to this debate with a novel approach. We used ambiguous, two-tone images as stimuli. These are derived from photographs of natural scenes, the templates. On first viewing, two-tone images appear to consist of meaningless patches. Once an observer has acquired prior object-knowledge relevant to image content by viewing the templates, however, the visual system binds a two-tone image into a coherent percept of a scene. In Experiment 1, we collected eye-gaze data while observers free-viewed template photographs (Template condition) and while they viewed two-tone images before (Unresolved) and after (Resolved) acquiring prior object-knowledge. In Experiment 2, we recorded first fixations after two experimentally controlled saccade-planning times in the same three conditions. Although the low-level features of two-tone images are identical in the Unresolved and Resolved conditions, observers' eye-gaze patterns in both experiments were more similar between the Template and Resolved conditions than between the Template and Unresolved conditions.
The results show that, with task and stimulus properties kept constant, object representations override the influence of low-level features on oculomotor control from very early on. Therefore, acquiring object representations significantly alters where observers look when viewing natural scenes.
Meeting abstract presented at VSS 2018