Abstract
Traditionally, models of real-world scene processing have focused on the role of visual salience in directing attention. However, recent research suggests that meaning plays a primary role, and specifically that eye gaze is guided by predictions (Henderson, 2016; Henderson & Hayes, 2017). We quantified the predictability of search targets using a norming study in which participants viewed scenes from the SCEGRAM image database (Öhlschläger & Võ, 2017). These scenes did not contain the search target, and participants indicated via mouse click where a given target would likely be located in the scene. Prediction maps were created from these data by applying a Gaussian blur (sigma = 1 degree of visual angle). A separate group of participants then searched the scenes for the target objects while their eye movements were tracked. Fixation maps were produced from the eye-tracking data, specifically from the location of the first fixation after the initial saccade from image center. Saliency maps were also created for each image using graph-based visual saliency (Harel, Koch, & Perona, 2006). Results indicate that the Prediction maps overlapped significantly with the Fixation maps when the target object was in or near the predicted location (r = 0.33). The Saliency and Fixation maps were more weakly related (r = 0.099). However, this Prediction map advantage disappeared when the target object was in an unusual location (e.g., the cereal bowl was on a chair instead of on the table; Prediction r = 0.095; Saliency r = 0.1). We also report the results of a deep neural network trained to predict eye fixation locations in an image using Prediction maps, Saliency maps, or both together. Together, these data indicate that prediction guides gaze when peripheral visual information consistent with the prediction is available.
Meeting abstract presented at VSS 2018
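The map-construction and comparison procedure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes click and fixation coordinates are given in pixels, and that the 1-degree-of-visual-angle Gaussian sigma has already been converted to pixels for the display used (the `sigma_px` parameter here is a hypothetical stand-in for that conversion). The pixelwise Pearson correlation is one plausible reading of the reported r values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def build_map(points, shape, sigma_px):
    """Build a smoothed map (prediction or fixation) from point data.

    points: iterable of (row, col) pixel coordinates, e.g. mouse clicks
    from the norming study or first-fixation locations from eye tracking.
    sigma_px: Gaussian sigma in pixels (assumed conversion from
    1 degree of visual angle for the experimental display).
    """
    m = np.zeros(shape)
    for r, c in points:
        m[int(r), int(c)] += 1.0
    m = gaussian_filter(m, sigma=sigma_px)  # blur the click/fixation counts
    return m / m.max() if m.max() > 0 else m


def map_correlation(a, b):
    """Pixelwise Pearson correlation between two same-sized maps."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]
```

A usage example: `map_correlation(build_map(clicks, img.shape, s), build_map(fixations, img.shape, s))` would yield a single r value per scene, which can then be averaged across scenes and compared between the Prediction and Saliency conditions.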