August 2023
Volume 23, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract  |   August 2023
Modeling internal state changes in free-viewing and visual search scanpaths with gain control in DeepGaze III
Author Affiliations & Notes
  • Matthias Kümmerer
    University of Tübingen, Tuebingen AI Center
  • Matthias Bethge
    University of Tübingen, Tuebingen AI Center
  • Footnotes
    Acknowledgements  This work was supported by the German Federal Ministry of Education and Research (BMBF): Tübingen AI Center, FKZ: 01IS18039A and the Deutsche Forschungsgemeinschaft (DFG): Germany's Excellence Strategy - EXC 2064/1 - 390727645 and SFB 1233, Robust Vision: Inference Principles and Neural Mechanisms.
Journal of Vision August 2023, Vol.23, 5334. doi:https://doi.org/10.1167/jov.23.9.5334

      Matthias Kümmerer, Matthias Bethge; Modeling internal state changes in free-viewing and visual search scanpaths with gain control in DeepGaze III. Journal of Vision 2023;23(9):5334. https://doi.org/10.1167/jov.23.9.5334.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

The DeepGaze III model currently sets the state of the art in predicting free-viewing human scanpaths on natural images by predicting future fixations from the observed image and recent fixation locations. Inspired by gain control mechanisms in neuroscience, we introduce gain control layers into the network architecture, which can modulate the activity in certain channels of the network depending on additional factors such as observer biases or search targets. By comparing the prediction performance of the baseline model with that of the extended model in terms of information gain, we can quantify how much information these additional factors contribute to fixation placement. Owing to the modular DeepGaze III architecture, we can decompose the information gain into three components: (1) one affecting only the modulation amplitude of the fixation distribution, (2) one modulating which image features are salient, and (3) one affecting the scanpath dynamics. Applying this approach, we quantify how much a fixation’s index in a scanpath, subject identity, and search targets affect scanpaths in free-viewing and visual search. For free-viewing, we find that fixation index and subject identity contribute to fixation placement to a similar degree. In the case of fixation index, this information is split equally between a part making the fixation density more uniform over time and a part changing which image features are salient. The contribution of subject identity is mostly due to different subjects preferring different image features. For visual search on the COCO-Search18 dataset, the search target increases the explained information by 18% compared to just the presented image, suggesting substantial similarities in fixation behavior across targets.
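As a rough illustration of the comparison "in terms of information gain", the sketch below assumes the common definition of information gain as the difference in average log-likelihood, in bits per fixation, between two models evaluated on the same observed fixations. The function names and the probability values are hypothetical and chosen only for illustration; they are not taken from the abstract.

```python
import math


def mean_log_likelihood_bits(probabilities):
    """Average log2-likelihood (bits per fixation) of the observed fixations."""
    return sum(math.log2(p) for p in probabilities) / len(probabilities)


def information_gain(model_probs, baseline_probs):
    """Information gain of a model over a baseline, in bits per fixation."""
    return (mean_log_likelihood_bits(model_probs)
            - mean_log_likelihood_bits(baseline_probs))


# Hypothetical probabilities that each model assigns to four observed fixations.
baseline_probs = [0.125, 0.25, 0.125, 0.25]  # image and fixation history only
extended_probs = [0.25, 0.5, 0.25, 0.5]      # additionally conditioned on a factor

gain = information_gain(extended_probs, baseline_probs)
```

A positive value means the extended model assigns higher probability to the observed fixations than the baseline does, which is the sense in which an additional factor "contributes information" here.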
Our work demonstrates how contrast gain control can be used as a very general and sample-efficient mechanism to flexibly modify neural network computation to account for additional factors of interest.
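The abstract does not specify the exact form of the gain control layers. A minimal sketch, assuming a purely multiplicative, channel-wise gain selected by a condition such as subject identity or search target, might look as follows. The function `apply_channel_gains`, the table `GAINS_BY_CONDITION`, and all numeric values are hypothetical; in a trained network the gains would be learned parameters rather than fixed constants.

```python
def apply_channel_gains(feature_maps, gains):
    """Scale each channel of a feature map by its own gain factor.

    feature_maps: list of channels, each a 2D list (rows of activations).
    gains: one multiplicative factor per channel.
    """
    if len(feature_maps) != len(gains):
        raise ValueError("need exactly one gain per channel")
    return [
        [[gain * value for value in row] for row in channel]
        for channel, gain in zip(feature_maps, gains)
    ]


# Hypothetical lookup table: one gain vector per condition (e.g. search target).
GAINS_BY_CONDITION = {
    "neutral": [1.0, 1.0],   # leaves the features unchanged
    "target_a": [2.0, 0.5],  # boosts channel 0, attenuates channel 1
}

features = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0
    [[4.0, 2.0], [0.0, 8.0]],  # channel 1
]

modulated = apply_channel_gains(features, GAINS_BY_CONDITION["target_a"])
```

Because each condition adds only one gain value per modulated channel, a mechanism of this kind needs few extra parameters, which is consistent with the sample efficiency the abstract describes.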
