August 2016, Volume 16, Issue 12
Open Access
Vision Sciences Society Annual Meeting Abstract  |  September 2016
How temporal context predicts eye gaze for dynamic stimuli
Author Affiliations
  • Cameron Ellis
    Department of Psychology, Princeton University
  • Patrick Harding
    Goethe-Universität Frankfurt
  • Judith Fan
    Department of Psychology, Princeton University
  • Nicholas Turk-Browne
    Department of Psychology, Princeton University
Journal of Vision September 2016, Vol.16, 328. doi:https://doi.org/10.1167/16.12.328
© ARVO (1962-2015); The Authors (2016-present)

      ×
  • Supplements
Abstract

What determines where we look during natural visual experience? Computer vision algorithms successfully account for gaze behavior in static scenes, but may be insufficient for modeling visual selection in dynamic scenes. Modeling dynamic scenes will likely require consideration of how image features evolve over time. For instance, where we attend might be constrained by where we have attended previously, by how long an object has been visible, and by expectations about when and where something will appear. To start bridging this gap, we adapted an algorithm originally developed to extract regions of high visual interest from static images (Harding & Robertson, 2013, Cogn Comput) to additionally capture how temporal information guides visual selection in dynamic scenes. Eye gaze was monitored in a group of 24 observers while they watched a series of short video clips depicting real-world indoor and outdoor scenes (Li et al., 2011, IMAVS). To ensure engagement, observers were occasionally prompted to describe what happened in the previous clip. Static visual-interest maps, generated by applying the original algorithm to each video frame independently, reliably predicted the distribution of eye fixations. However, we were able to influence model performance by adjusting two new parameters that incorporated information about temporal context when creating visual-interest maps: 'history', which placed additional weight on regions with high visual interest in preceding frames; and 'prediction', which placed additional weight on regions with high visual interest in subsequent frames. Further analyses examine whether the history and prediction of actual eye fixations provide additional explanatory power, how long a temporal window into the past and future is most informative, and how different time samples should be weighted.
The ultimate goal is to develop a refined model that better accounts for natural visual behavior and to understand how temporal context biases the allocation of attention.
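The abstract does not specify how the 'history' and 'prediction' parameters enter the model. As a minimal sketch of the kind of temporal weighting described, the following assumes per-frame static interest maps stored as a NumPy array and an exponential falloff of a neighboring frame's influence; the function name, the window size, and the decay scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def temporal_interest_map(static_maps, t, history=0.5, prediction=0.5,
                          window=5, decay=0.8):
    """Blend the static visual-interest map at frame t with weighted maps
    from preceding ('history') and subsequent ('prediction') frames.

    static_maps : array of shape (n_frames, H, W), one interest map per frame
    history, prediction : weights on past / future contributions (assumed form)
    window : number of frames back and forward to include
    decay : per-frame falloff of a neighboring frame's influence
    """
    combined = static_maps[t].astype(float).copy()
    n_frames = len(static_maps)
    for lag in range(1, window + 1):
        w = decay ** lag
        if t - lag >= 0:            # past frames contribute via 'history'
            combined += history * w * static_maps[t - lag]
        if t + lag < n_frames:      # future frames contribute via 'prediction'
            combined += prediction * w * static_maps[t + lag]
    return combined / combined.sum()  # normalize to a fixation distribution
```

Setting `history` or `prediction` to zero recovers a purely static map for that direction, so the two parameters can be fit independently against observed fixation densities.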

Meeting abstract presented at VSS 2016
