Volume 20, Issue 11
Open Access
Vision Sciences Society Annual Meeting Abstract  |   October 2020
GANalyze: Toward visual definitions of cognitive image properties
Author Affiliations & Notes
  • Lore* Goetschalckx
    MIT (CSAIL)
    KU Leuven
  • Alex* Andonian
    MIT (CSAIL)
  • Aude Oliva
    MIT (CSAIL)
  • Phillip Isola
    MIT (CSAIL)
  • Footnotes
    Acknowledgements  This work was partly funded by NSF award 1532591 in Neural and Cognitive Systems (to A.O.), and by a fellowship (Grant 1108116N) and a travel grant (Grant V4.085.18N) awarded to Lore Goetschalckx by the Research Foundation - Flanders (FWO).
Journal of Vision October 2020, Vol.20, 297. doi:https://doi.org/10.1167/jov.20.11.297
© ARVO (1962-2015); The Authors (2016-present)
Abstract

We introduce a framework that uses Generative Adversarial Networks (GANs) to study cognitive image properties (e.g., memorability, aesthetics, valence). Often, we do not have concrete visual definitions of what these properties entail. Starting from input noise, GANs generate a manifold of natural-looking images with fine-grained differences in their visual attributes. By navigating this manifold, we can visualize what it looks like for a particular GAN-image to become more (or less) memorable. Specifically, we trained a Transformer module to learn along which direction to move a BigGAN-image’s corresponding noise vector in order to increase (or decrease) its memorability. Memorability was assessed by an off-the-shelf Assessor (MemNet). After training, we generated a test set of 1.5K “seed images”, each with four “clone images”: two modified to be more memorable (one and two “steps” forward along the learned direction) and two to be less memorable (one and two steps backward; examples in Supplemental). The assessed memorability significantly increased when stepping along the learned direction (β = 0.68, p < 0.001), suggesting training was successful. Through a behavioral repeat-detection memory experiment, we verified that our method’s manipulations indeed causally affect human memory performance (β = 1.92, p < 0.001). The seeds and their clones (i.e., “visual definitions”) surfaced candidate image properties (e.g., “object size”, “colorfulness”) that may underlie memorability and were previously overlooked. These candidates correlated with the learned memorability direction. We furthermore demonstrate that stepping along a learned “object size” direction indeed increases human memorability, though less strongly (β = 0.11, p < 0.001). This showcases how the individual, causal effects of a candidate can be studied further using the same framework. Finally, we find that by substituting the Assessor, our framework can also provide visual definitions for aesthetics (β = 0.72, p < 0.001) and emotional valence (β = 0.44, p < 0.001).
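The “stepping” procedure above can be sketched in a few lines. This is a minimal NumPy illustration, not the authors’ implementation: the real method feeds the shifted latent through BigGAN and learns the direction with a Transformer module; here the direction and the 128-dimensional latent size are stand-in assumptions for illustration only.

```python
import numpy as np

def step_latent(z, direction, alpha):
    """Move a latent code z along a (unit-normalized) learned direction.

    alpha > 0 corresponds to steps toward higher assessed memorability,
    alpha < 0 to steps toward lower, mirroring the "steps" in the abstract.
    """
    direction = direction / np.linalg.norm(direction)
    return z + alpha * direction

rng = np.random.default_rng(0)
z_seed = rng.standard_normal(128)  # seed latent (128-D chosen for illustration)
d = rng.standard_normal(128)       # stand-in for the learned memorability direction

# Four "clones" of one seed: two steps back, one back, one forward, two forward
clones = [step_latent(z_seed, d, a) for a in (-2.0, -1.0, 1.0, 2.0)]
```

Each clone latent would then be decoded by the (fixed) GAN generator, so a seed and its clones differ only along the learned attribute direction.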
