Vision Sciences Society Annual Meeting Abstract | September 2024
Generating objects in peripheral vision using attention-guided diffusion models
Author Affiliations & Notes
  • Ritik Raina, Stony Brook University
  • Seoyoung Ahn, Stony Brook University
  • Gregory Zelinsky, Stony Brook University

Acknowledgements: This work was supported in part by NSF IIS awards 1763981 and 2123920 to G.Z.
Journal of Vision September 2024, Vol. 24, 1306. https://doi.org/10.1167/jov.24.10.1306
Citation: Ritik Raina, Seoyoung Ahn, Gregory Zelinsky; Generating objects in peripheral vision using attention-guided diffusion models. Journal of Vision 2024;24(10):1306. https://doi.org/10.1167/jov.24.10.1306.

© ARVO (1962-2015); The Authors (2016-present)
Abstract

Although most of the visual field is blurry in the periphery, with only the central ~2 degrees providing high-resolution input, we have no difficulty perceiving and interacting with the objects around us. We hypothesize that the human perception of a stable visual world is mediated by an active generation of objects from blurred peripheral vision. Furthermore, we hypothesize that this active peripheral generation is task-dependent: it is guided by information extracted from fixations, with the goal of constructing an object and scene context relevant to the current task. We test these hypotheses using latent diffusion models, evaluating how fixated image information influences the generation of objects in the blurred periphery. We do so in the context of an object referral task, in which participants hear a spoken description of a search target (e.g., “right white van”). We recorded eye movements from participants (n=220) as they viewed 1,619 images and attempted to localize the referred targets. The model received high-resolution input only from fixated regions, mimicking foveated vision, and generated high-resolution objects in the originally blurred peripheral areas. Foveated-image inputs corresponding to the observed behavioral fixations led the model to generate target objects in the periphery with greater fidelity than inputs built from randomly located fixations, as measured by squared pixel difference (Human Fixation SSE = 178.27; Random Fixation SSE = 212.42; averaged over the first 20 fixations). This fixation-driven advantage applied specifically to the reconstruction of task-relevant objects, such as objects of the referred category, and did not extend to non-targets or background elements. Our findings support the idea that human perception actively generates relevant objects in the blurry periphery as a means of building a stable object context, guided by goal-directed attention control mechanisms.
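To make the evaluation pipeline concrete, the sketch below shows one way to build a foveated model input (high resolution inside a disk around each fixation, Gaussian blur elsewhere) and to compute a squared-pixel-difference score restricted to the periphery. This is a minimal reading of the procedure described in the abstract, not the authors' code: the foveal radius, blur level, hard-edged circular foveas, the mean (rather than summed) error, and the stand-in image are all illustrative assumptions.

```python
import numpy as np
from PIL import Image, ImageFilter

def foveate(image, fixations, fovea_radius=50, blur_sigma=8):
    """Return a foveated version of `image`: full resolution inside a
    disk around each (x, y) fixation, Gaussian blur everywhere else."""
    blurred = np.asarray(image.filter(ImageFilter.GaussianBlur(blur_sigma))).copy()
    sharp = np.asarray(image)
    h, w = sharp.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    for fx, fy in fixations:
        inside = (xs - fx) ** 2 + (ys - fy) ** 2 <= fovea_radius ** 2
        blurred[inside] = sharp[inside]  # restore the fixated region
    return Image.fromarray(blurred)

def peripheral_sse(generated, original, fixations, fovea_radius=50):
    """Squared pixel difference between a model-generated image and the
    original, computed only over the periphery (outside every foveal disk)."""
    gen = np.asarray(generated, dtype=np.float64)
    ref = np.asarray(original, dtype=np.float64)
    h, w = ref.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    periphery = np.ones((h, w), dtype=bool)
    for fx, fy in fixations:
        periphery &= (xs - fx) ** 2 + (ys - fy) ** 2 > fovea_radius ** 2
    return float(((gen[periphery] - ref[periphery]) ** 2).mean())

# Hypothetical usage: foveate a scene at two observed fixations, feed the
# result to a generative model, then score its peripheral completion.
scene = Image.fromarray(np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8))  # stand-in image
fixations = [(320, 240), (480, 260)]              # example (x, y) coordinates
model_input = foveate(scene, fixations)
# completion = diffusion_model(model_input)       # model call not shown here
# score = peripheral_sse(completion, scene, fixations)
```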
