October 2020
Volume 20, Issue 11
Open Access
Vision Sciences Society Annual Meeting Abstract | October 2020
Analyzing task-specific patterns in human scanpaths
Author Affiliations & Notes
  • Matthias Kümmerer
    University of Tübingen
  • Thomas S.A. Wallis
    University of Tübingen
    Amazon Research Tübingen (this work was done prior to joining Amazon)
  • Matthias Bethge
    University of Tübingen
  • Footnotes
    Acknowledgements  We acknowledge support from the German Federal Ministry of Education and Research (BMBF) through the Tübingen AI Center (FKZ: 01IS18039A) and from the German Science Foundation (DFG): SFB 1233, Robust Vision: Inference Principles and Neural Mechanisms, project number 276693517.
Journal of Vision October 2020, Vol.20, 1191. doi:https://doi.org/10.1167/jov.20.11.1191

      Matthias Kümmerer, Thomas S.A. Wallis, Matthias Bethge; Analyzing task-specific patterns in human scanpaths. Journal of Vision 2020;20(11):1191. https://doi.org/10.1167/jov.20.11.1191.


      © ARVO (1962-2015); The Authors (2016-present)


Humans gather high-resolution visual information only in the fovea; they must therefore make eye movements to explore the visual world. The spatio-temporal fixation patterns (scanpaths) of observers carry information about which aspects of the environment are currently relevant. Most recent progress in predicting the spatial and spatio-temporal patterns of human scanpaths has focused on free-viewing conditions. However, fixations and scanpaths are known to be strongly influenced by the task performed by observers. The purpose of this work is to analyze those influences quantitatively. The DeepGaze III model for scanpath prediction (Kümmerer et al., VSS 2017) has been shown to achieve high performance in predicting free-viewing scanpaths. DeepGaze III extracts features from the VGG deep neural network; a first readout network uses these features to predict a saliency map, which a second readout network then combines with information about the scanpath history to predict upcoming saccade landing positions. Here, we train different task-specific versions of DeepGaze III on human scanpath data from subjects performing different tasks on the same images (free viewing, object search, saliency search; Koehler et al., JoV 2014). Prediction performance shows that the models successfully adapt to the task-specific scanpaths. We find and visualize cases where the model predictions differ substantially across the tasks. The task-specific models can also be used to detect the task of a given scanpath via maximum-likelihood classification. We find that while purely spatial task-specific models (fine-tuned versions of DeepGaze II) perform above chance at task recognition (43% accuracy), switching to the scanpath-aware DeepGaze III models improves performance further, to 45%. This quantifies the spatial and temporal contributions to task-specific differences in human scanpaths.
In the future, we plan to extend our analysis toward quantifying differences in how scene content and scanpath history interact in fixation selection across tasks.
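The maximum-likelihood task classification described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each task-specific model has already scored a scanpath with a log-likelihood (the scores and task order shown are made up), and it simply picks the task whose model explains the scanpath best, implicitly assuming a uniform prior over tasks.

```python
# Hypothetical sketch of maximum-likelihood task classification.
# The task names follow Koehler et al. (JoV 2014); the log-likelihood
# values are placeholders, not DeepGaze III outputs.

TASKS = ("free viewing", "object search", "saliency search")

def classify_task(log_likelihoods):
    """Return the task whose model assigns the scanpath the highest
    log-likelihood (uniform prior, so the likelihood alone decides)."""
    best = max(range(len(TASKS)), key=lambda i: log_likelihoods[i])
    return TASKS[best]

# Example: one scanpath scored under the three task-specific models.
scores = [-152.3, -149.8, -151.1]
print(classify_task(scores))  # -> object search
```

Accuracy over a dataset would then be the fraction of scanpaths for which the predicted task matches the task the observer actually performed, which is the quantity reported as 43% (spatial models) versus 45% (scanpath-aware models) in the abstract.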

