Abstract
Humans gather high-resolution visual information only in the fovea, so they must make eye movements to explore the visual world. Inspired by results in attention research (Treisman, 1980), it has been proposed that free-viewing fixations are driven by a spatial priority or “saliency” map. Whether this is the case has been debated for decades in neuroscience and psychology. One hypothesis states that priority values are assigned locally to image locations, independent of saccade history, and are only later combined with saccade history and other constraints to select the next fixation location. A second hypothesis is that there are interactions between saccade history and image content that cannot be summarised by a single value. For example, if different image content drives the next fixation after long saccades than after short saccades, then no single saliency value can be assigned to an image location. Here we discriminate between these possibilities in a data-driven manner. We extend the DeepGaze II model (Kümmerer et al., 2017) to a new model of scanpath prediction. First, we extract features from the VGG deep neural network and feed them into a small “readout network” that predicts one or multiple saliency maps. These saliency maps are then processed in a second readout network, together with information on the scanpath history, to predict upcoming saccade landing positions. We train the model on human free-viewing scanpath data and achieve state-of-the-art performance relative to previous scanpath models. We find that using multiple saliency maps gives no advantage in scanpath prediction over using a single saliency map. Since the number of saliency maps available to the network imposes strong qualitative constraints on what the model is able to predict, this suggests that for free-viewing a single saliency map may exist that does not depend on either current or previous gaze locations.
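The two-stage structure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: random arrays stand in for VGG features, the weights are untrained, and every name (`readout`, `history_maps`, `predict_next_fixation`, `random_params`), the distance-map encoding of scanpath history, and the hidden-layer size are assumptions made for illustration. The point it shows is the bottleneck: image content reaches the second stage only through `n_saliency_maps` channels, so setting that number to 1 forces a single history-independent saliency map.

```python
import numpy as np


def readout(x, weights, bias):
    """Pointwise linear readout over channels (like a 1x1 convolution).

    x: (C_in, H, W), weights: (C_out, C_in), bias: (C_out,) -> (C_out, H, W)
    """
    return np.tensordot(weights, x, axes=([1], [0])) + bias[:, None, None]


def history_maps(fixations, height, width):
    """Assumed history encoding: one Euclidean distance map per previous fixation."""
    rows, cols = np.mgrid[0:height, 0:width]
    return np.stack([np.hypot(rows - r, cols - c) for r, c in fixations])


def predict_next_fixation(features, fixations, params):
    """Return a probability map over the next fixation location."""
    height, width = features.shape[1:]
    # Stage 1: image features -> K saliency maps, computed without any history.
    saliency = readout(features, params["w_sal"], params["b_sal"])
    # Stage 2: saliency maps and history maps are combined pixel by pixel.
    combined = np.concatenate([saliency, history_maps(fixations, height, width)])
    hidden = np.maximum(readout(combined, params["w_hid"], params["b_hid"]), 0.0)
    logits = readout(hidden, params["w_out"], params["b_out"])[0]
    # Softmax over all pixels so the output is a valid fixation density.
    density = np.exp(logits - logits.max())
    return density / density.sum()


def random_params(rng, n_features, n_saliency_maps, n_history, n_hidden=8):
    """Untrained placeholder weights; a real model would learn these from data."""
    n_in = n_saliency_maps + n_history
    return {
        "w_sal": rng.normal(scale=0.1, size=(n_saliency_maps, n_features)),
        "b_sal": np.zeros(n_saliency_maps),
        "w_hid": rng.normal(scale=0.1, size=(n_hidden, n_in)),
        "b_hid": np.zeros(n_hidden),
        "w_out": rng.normal(scale=0.1, size=(1, n_hidden)),
        "b_out": np.zeros(1),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(16, 24, 32))  # stand-in for VGG feature maps
    fixations = [(12, 16), (5, 20)]           # previous fixations as (row, col)
    for k in (1, 3):                          # single vs. multiple saliency maps
        params = random_params(rng, 16, k, len(fixations))
        density = predict_next_fixation(features, fixations, params)
        print(k, density.shape, float(density.sum()))
```

Comparing trained models that differ only in the number of saliency maps is the kind of test the abstract describes; the sketch fixes the architecture but leaves out training and evaluation entirely.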
Acknowledgement: German Science Foundation (DFG Collaborative Research Centre 1233), German Excellency Initiative (EXC307)