Vision Sciences Society Annual Meeting Abstract | December 2022
Volume 22, Issue 14
Open Access
Scanpath prediction in dynamic real-world scenes based on object-based selection
Author Affiliations & Notes
  • Nicolas Roth
    Technische Universität Berlin
    Exzellenzcluster Science of Intelligence, Technische Universität Berlin
  • Martin Rolfs
    Humboldt-Universität zu Berlin
    Exzellenzcluster Science of Intelligence, Technische Universität Berlin
  • Klaus Obermayer
    Technische Universität Berlin
    Exzellenzcluster Science of Intelligence, Technische Universität Berlin
  • Footnotes
    Acknowledgements  Funded by the German Research Foundation under Germany’s Excellence Strategy – EXC 2002/1 “Science of Intelligence” – project number 390523135.
Journal of Vision December 2022, Vol.22, 4217. doi:https://doi.org/10.1167/jov.22.14.4217
Citation: Nicolas Roth, Martin Rolfs, Klaus Obermayer; Scanpath prediction in dynamic real-world scenes based on object-based selection. Journal of Vision 2022;22(14):4217. https://doi.org/10.1167/jov.22.14.4217.

© ARVO (1962-2015); The Authors (2016-present)
Abstract

Humans actively shift their gaze when viewing dynamic real-world scenes. While there is long-standing interest in understanding this behavior, the complexity of natural scenes makes it difficult to analyze experimentally. It has long been thought that during free viewing, the targets of eye movements are selected based on bottom-up saliency, but evidence is accumulating that objects play an important role in the selection process. Here, we use a computational scanpath prediction framework to systematically compare the predictions of models that incorporate different combinations of object and saliency information with human eye-tracking data. We model saccades as sequential decision processes between potential targets. To investigate the relevance of object-based selection, we compare an object-based model, in which saccades target semantic objects, with a location-based model, in which saccades target individual pixel values. Target selection in both models depends on the potential targets' eccentricity, the previous scanpath history, and target relevance. Target relevance is implemented either based on the distance to the screen center (center bias), on saliency computed from low-level features, or on high-level saliency as predicted by a deep neural network. We optimize each model's parameters with evolutionary algorithms, fitting them to reproduce the saccade amplitude and fixation duration distributions of free-viewing eye-tracking data on videos of the VidCom dataset. We assess model performance with respect to spatial and temporal fixation behavior, including the proportion of fixations exploring the background, as well as detecting, inspecting, and revisiting objects. Human data were best predicted by the object-based model with low-level saliency, followed by the location-based model with high-level saliency and the object-based model combined with a center bias. The location-based model with low-level saliency or center bias mainly explores the background.
These results support the view that object-level attentional units play an important role in human exploration behavior, while saliency helps to prioritize among objects.
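To make the described decision process concrete, the following is a minimal sketch of a single saccade-target decision that combines the three factors named above (eccentricity, scanpath history, and target relevance). All function names, the exact form of the cost terms, and the softmax-based choice rule are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def select_target(candidates, gaze, history, relevance,
                  ecc_weight=1.0, ior_weight=2.0, seed=0):
    """Hypothetical sketch: pick the next saccade target among candidates.

    candidates: dict name -> (x, y) position (degrees of visual angle)
    gaze:       current gaze position (x, y)
    history:    recently fixated candidate names (inhibition of return)
    relevance:  dict name -> relevance (e.g. center bias or saliency value)
    """
    def utility(name, pos):
        ecc = math.dist(pos, gaze)                     # eccentricity cost
        ior = ior_weight if name in history else 0.0   # penalize revisits
        return relevance[name] - ecc_weight * ecc - ior

    # A softmax over utilities turns the decision into a stochastic choice,
    # so the model produces variable scanpaths rather than one fixed path.
    scores = {n: utility(n, p) for n, p in candidates.items()}
    m = max(scores.values())
    weights = {n: math.exp(s - m) for n, s in scores.items()}
    rng = random.Random(seed)
    r = rng.random() * sum(weights.values())
    for name, w in weights.items():
        r -= w
        if r <= 0:
            return name
    return name

# Example usage with made-up candidate objects and relevance values:
candidates = {"person": (2.0, 1.0), "car": (8.0, -3.0), "background": (0.5, 0.5)}
relevance = {"person": 3.0, "car": 2.5, "background": 0.5}
target = select_target(candidates, gaze=(0.0, 0.0),
                       history=["background"], relevance=relevance)
```

In a full model, the free parameters (here `ecc_weight` and `ior_weight`) would be the quantities tuned by the evolutionary optimization to match human saccade amplitude and fixation duration distributions.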
