Vision Sciences Society Annual Meeting Abstract  |   September 2024
Combining Generative Adversarial Networks (GANs) with behavior and brain recordings to study scene understanding
Author Affiliations
  • Melissa Le-Hoa Võ
    Goethe University Frankfurt
  • Aylin Kallmayer
    Goethe University Frankfurt
Journal of Vision September 2024, Vol. 24, 228. doi: https://doi.org/10.1167/jov.24.10.228
Abstract

Our visual world is a complex conglomeration of objects that adhere to semantic and syntactic regularities, also known as scene grammar. According to scene grammar, scenes can be decomposed into phrases, i.e., smaller clusters of objects forming conceptual units, each of which contains a so-called anchor object. These usually large and stationary objects anchor predictions regarding the identity and location of most of the smaller objects within the same phrase and play a key role in guiding attention and boosting perception during real-world search. They therefore provide an important organizing principle for structuring real-world scenes. Generative adversarial networks (GANs) trained on images of real-world scenes learn the scenes’ latent grammar and synthesize images that mimic real-world scenes increasingly well. GANs can therefore be used to study the hidden representations underlying object-based perception, serving as testbeds for investigating the role that anchor objects play in both the generation and understanding of scenes. We will present recent work in which we showed participants real and generated images while recording both behavior and brain responses. Modelling behavioral responses with features from a range of computer vision models, we found that predominantly high-level visual features and the strength of anchor information predicted human understanding of generated scenes. Using EEG to investigate the temporal dynamics of these processes revealed initial processing of anchor information that generalized to subsequent processing of the scene’s authenticity. These findings imply that anchors pave the way to scene understanding and that models predicting real-world attention and perception should become more object-centric.
