September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2021
New enhancements to the DeepGaze models for a better understanding of human scanpaths
Author Affiliations & Notes
  • Matthias Kümmerer
    University of Tübingen
  • Akis Linardos
    University of Barcelona
  • Matthias Bethge
    University of Tübingen
  • Footnotes
    Acknowledgements  This work was supported by the German Federal Ministry of Education and Research (BMBF): Tübingen AI Center, FKZ: 01IS18039A and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation): Germany’s Excellence Strategy – EXC 2064/1 – 390727645 and SFB 1233
Journal of Vision September 2021, Vol.21, 2568. doi:
      Matthias Kümmerer, Akis Linardos, Matthias Bethge; New enhancements to the DeepGaze models for a better understanding of human scanpaths. Journal of Vision 2021;21(9):2568.

      © ARVO (1962-2015); The Authors (2016-present)

The family of DeepGaze models comprises deep-learning-based computational models of freeviewing overt attention. DeepGaze II (Kümmerer et al., ICCV 2017) predicts freeviewing fixation locations, and DeepGaze III (Kümmerer et al., CCN 2019) predicts freeviewing sequences of fixations. The models encode image information using deep features from pretrained deep neural networks to compute a spatial saliency map, which, in the case of DeepGaze III, is then combined with information about the scanpath history to predict the next fixation. Both models have set the state of the art in their respective tasks in recent years. Here, we substantially improve the performance of both models. We replace the VGG-19 backbone network with better-performing networks such as DenseNet, and we also improve the model architecture and the training procedure. Together, these changes set a new state of the art for freeviewing fixation prediction and freeviewing scanpath prediction across all commonly used metrics. We further use the improved DeepGaze III model to better understand human scanpaths. For example, we quantify the effects of scene content and scanpath history on human scanpaths. We find that, on the MIT1003 dataset, scene content has a substantially larger effect on fixation selection than scanpath history, and that the interactions between scene content and scanpath history that go beyond a scalar saliency measure are very subtle but measurable. Furthermore, we are able to disentangle the central fixation bias into contributions driven by image content, contributions driven by the initial central fixation, and a remaining effect that cannot be explained by these two sources. Taken together, the improved DeepGaze models allow us to analyze human scanpaths in ways that are not possible without high-performing deep learning models.
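
To make the idea of combining a spatial saliency map with scanpath history concrete, the following is a minimal toy sketch in Python. It is not the DeepGaze III architecture, which the abstract does not specify in detail. It assumes that the image-driven term is a map of unnormalized log saliency values (here random numbers standing in for a network output) and that the history term is a simple Gaussian-shaped preference for locations near the previous fixation. The function names and the parameters `history_weight` and `sigma` are hypothetical and chosen only for illustration.

```python
import numpy as np


def log_softmax_2d(logits):
    """Normalize a 2D array of logits into a log-probability map over pixels."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())


def next_fixation_log_density(saliency_logits, prev_fixation,
                              history_weight=1.0, sigma=5.0):
    """Toy conditional prediction: image-driven term plus history-driven term.

    saliency_logits: (H, W) unnormalized log saliency (stand-in for a network output).
    prev_fixation:   (row, col) of the previous fixation.
    history_weight, sigma: hypothetical parameters, not taken from the abstract.
    """
    h, w = saliency_logits.shape
    rows, cols = np.mgrid[0:h, 0:w]
    sq_dist = (rows - prev_fixation[0]) ** 2 + (cols - prev_fixation[1]) ** 2
    # Gaussian-shaped log-bonus favoring locations near the previous fixation.
    history_term = -sq_dist / (2.0 * sigma ** 2)
    return log_softmax_2d(saliency_logits + history_weight * history_term)


rng = np.random.default_rng(0)
saliency_logits = rng.normal(size=(32, 48))  # placeholder, not real model output
prev_fixation = (10, 20)
next_fixation = (12, 22)

spatial_only = next_fixation_log_density(saliency_logits, prev_fixation,
                                         history_weight=0.0)
with_history = next_fixation_log_density(saliency_logits, prev_fixation,
                                         history_weight=1.0)

# Log-likelihood gain (in bits) from adding the history term for one fixation.
gain_bits = (with_history[next_fixation] - spatial_only[next_fixation]) / np.log(2)
print(f"gain from scanpath history: {gain_bits:.3f} bits")
```

Comparing the log-likelihood of an observed fixation with and without the history term is one simple way to express how much a given source of information contributes to a prediction. Whether the authors quantify the effects of scene content and scanpath history in exactly this way is an assumption, not something the abstract states.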

