September 2024
Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2024
Eye Movements during Free Viewing to Maximize Scene Understanding
Author Affiliations
  • Shravan Murlidaran
    University of California Santa Barbara
  • Miguel P Eckstein
    Center for the Neural Basis of Cognition
Journal of Vision September 2024, Vol.24, 1189. doi:https://doi.org/10.1167/jov.24.10.1189
      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Introduction: The extent to which eye movements during free viewing of scenes are influenced by low-level saliency (Parkhurst et al., 2002; Harel et al., 2007; Koehler et al., 2014), local semantic meaningfulness (Henderson et al., 2017; Peacock et al., 2019), or other processes is debated. Here, we hypothesize that during free viewing, humans direct their eyes to regions that maximize scene understanding rather than to locally salient or meaningful regions. Methods: For each image (n=36), we created a scene understanding map (SUM) that assesses the contribution of individual objects to observers’ (n=110) scene descriptions (global understanding of the scene). To do so, we digitally removed each object from the image and had eighteen raters evaluate the similarity between descriptions of the manipulated and original images. We compared the predictions of the SUM and of other models, namely saliency (Graph-Based Visual Saliency, GBVS), DeepGaze, and local meaningfulness, to human fixations (n=50 per task) during free-viewing (FV) and scene-description (SD) tasks. Images were presented for 2 seconds while eye position was measured. Results: In both the scene-description (SD) and free-viewing (FV) tasks, fixations to the regions most critical to scene understanding (the highest-valued region in the SUM) were significantly more frequent than fixations to the top predictions of DeepGaze (pSD=0.0035, pFV=0.0025; significant from the 6th fixation onward, pSD=0.013, pFV=0.044), local meaningfulness (pSD=0.00001, pFV<0.00001; significant from the 4th and 3rd fixation onward, respectively, pSD=0.003, pFV=0.019), and GBVS saliency (pSD<0.00001, pFV<0.00001; significant from the 4th fixation onward, pSD=0.037, pFV=0.007). Conclusions: Our findings suggest that during free viewing, humans do not direct eye movements to regions of high low-level saliency or local meaningfulness but to image regions that maximize the global understanding of the scene.
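
The logic of the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the rating scale, how ratings are combined, or how fixations are assigned to regions. The sketch assumes that similarity ratings lie in [0, 1] (1 meaning the description of the object-removed image matches the original), that an object's contribution to scene understanding is one minus its mean rating, and that a fixation counts toward a region when its pixel falls inside the object's pixel set. All names and data (`object_contributions`, `top_region`, `fixation_fraction`, the "dog" and "ball" objects) are hypothetical.

```python
from statistics import mean


def object_contributions(similarity_ratings):
    """Assumed scoring: contribution = 1 - mean similarity rating.

    A low similarity after removing an object means the object mattered
    a lot for the scene description, so its contribution is high.
    """
    return {name: 1.0 - mean(r) for name, r in similarity_ratings.items()}


def top_region(object_pixels, contributions):
    """Return the name and pixel set of the highest-contribution object."""
    best = max(contributions, key=contributions.get)
    return best, object_pixels[best]


def fixation_fraction(fixations, region):
    """Fraction of (x, y) fixations that land inside a pixel set."""
    if not fixations:
        return 0.0
    return sum(1 for f in fixations if f in region) / len(fixations)


# Hypothetical toy data: two objects on a tiny pixel grid.
object_pixels = {
    "dog": {(x, y) for x in range(0, 2) for y in range(0, 2)},
    "ball": {(x, y) for x in range(4, 6) for y in range(2, 4)},
}
# Hypothetical rater judgments in [0, 1] for each object-removed image.
similarity_ratings = {
    "dog": [0.2, 0.3, 0.1],   # removing the dog changes descriptions a lot
    "ball": [0.9, 0.8, 1.0],  # removing the ball barely changes them
}
# Hypothetical fixation positions as (x, y) pixel coordinates.
fixations = [(0, 0), (1, 1), (5, 3), (3, 2)]

contributions = object_contributions(similarity_ratings)
top_name, top_pixels = top_region(object_pixels, contributions)
print(top_name, fixation_fraction(fixations, top_pixels))
```

Applying the same `fixation_fraction` to the top-ranked region of each competing model (for example, a saliency map's peak region) would give the per-model quantities that the Results compare; the statistical tests themselves are outside the scope of this sketch.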
