September 2017
Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   August 2017
Towards matching peripheral appearance for arbitrary natural images using deep features
Author Affiliations
  • Thomas Wallis
    Werner Reichardt Centre for Integrative Neuroscience, Eberhard Karls Universität Tübingen
    Bernstein Centre for Computational Neuroscience
  • Christina Funke
    Werner Reichardt Centre for Integrative Neuroscience, Eberhard Karls Universität Tübingen
    Bernstein Centre for Computational Neuroscience
  • Alexander Ecker
    Werner Reichardt Centre for Integrative Neuroscience, Eberhard Karls Universität Tübingen
    Bernstein Centre for Computational Neuroscience
  • Leon Gatys
    Werner Reichardt Centre for Integrative Neuroscience, Eberhard Karls Universität Tübingen
    Bernstein Centre for Computational Neuroscience
  • Felix Wichmann
    Neural Information Processing Group, Faculty of Science, Eberhard Karls Universität Tübingen
    Bernstein Centre for Computational Neuroscience
  • Matthias Bethge
    Werner Reichardt Centre for Integrative Neuroscience, Eberhard Karls Universität Tübingen
    Bernstein Centre for Computational Neuroscience
Journal of Vision August 2017, Vol.17, 786. doi:10.1167/17.10.786
      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Due to the structure of the primate visual system, large distortions of the input can go unnoticed in the periphery, and objects can be harder to identify. What encoding underlies these effects? Similar to Freeman & Simoncelli (Nature Neuroscience, 2011), we developed a model that uses summary statistics averaged over spatial regions that increase in size with retinal eccentricity (assuming central fixation on an image). We also designed the averaging areas such that changing their scaling progressively discards more information from the original image (i.e. a coarser model produces greater distortions to original image structure than a model with higher resolution). Unlike Freeman and Simoncelli, we use the features of a deep neural network trained on object recognition (the VGG-19; Simonyan & Zisserman, ICLR 2015), which is state of the art in parametric texture synthesis. We tested whether human observers can discriminate model-generated images from their original source images. Three images subtending 25 deg, two of which were physically identical, were presented for 200 ms each in a three-alternative temporal oddity paradigm. We find a model that, for most original images we tested, produces synthesised images that cannot be told apart from the originals despite producing significant distortions of image structure. However, some images were readily discriminable. Therefore, the model has successfully encoded necessary but not sufficient information to capture appearance in human scene perception. We explore what image features are correlated with discriminability at the image level (which images are harder than others?) and at the pixel level (where in an image is the hardest location?). While our model does not produce "metamers", it does capture many features important for the appearance of arbitrary natural images in the periphery.
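
To make the idea of eccentricity-dependent averaging concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes that pooling-region diameter grows linearly with eccentricity (as in Freeman & Simoncelli), that regions are simple binary discs, and that the summary statistic is a Gram matrix of feature maps (as in parametric texture synthesis with deep features); the abstract does not specify any of these details. The function names `pooling_diameter`, `region_mask` and `gram_matrix`, and the `scale` parameter, are hypothetical.

```python
import numpy as np


def pooling_diameter(eccentricity_deg, scale, min_diameter_deg=0.0):
    """Diameter (deg) of a pooling region centred at a given eccentricity.

    Assumption: diameter grows linearly with eccentricity; a larger
    `scale` gives a coarser model that discards more information.
    """
    ecc = np.asarray(eccentricity_deg, dtype=float)
    return np.maximum(scale * ecc, min_diameter_deg)


def region_mask(height, width, deg_per_px, centre_deg, diameter_deg):
    """Binary disc mask in image coordinates, with fixation at the image centre."""
    ys, xs = np.mgrid[0:height, 0:width]
    x_deg = (xs - (width - 1) / 2) * deg_per_px
    y_deg = (ys - (height - 1) / 2) * deg_per_px
    dist = np.hypot(x_deg - centre_deg[0], y_deg - centre_deg[1])
    return (dist <= diameter_deg / 2).astype(float)


def gram_matrix(features, mask):
    """Feature correlations averaged over one pooling region.

    features: array of shape (channels, height, width)
    mask:     non-negative weights of shape (height, width)
    """
    channels = features.shape[0]
    flat = features.reshape(channels, -1)
    weights = mask.reshape(-1)
    return (flat * weights) @ flat.T / weights.sum()


if __name__ == "__main__":
    # Random activations stand in for one layer of a deep network.
    rng = np.random.default_rng(0)
    features = rng.standard_normal((8, 64, 64))
    deg_per_px = 25.0 / 64  # a 25 deg image sampled at 64 x 64 positions
    for ecc in (2.0, 10.0):
        diameter = pooling_diameter(ecc, scale=0.5)
        mask = region_mask(64, 64, deg_per_px, (ecc, 0.0), diameter)
        stats = gram_matrix(features, mask)
        print(ecc, float(diameter), int(mask.sum()), stats.shape)
```

In this sketch, raising `scale` enlarges every region, so each set of statistics is averaged over more of the image and constrains local structure less, which mirrors the coarse-versus-fine model comparison described above.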

Meeting abstract presented at VSS 2017
