September 2017
Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract | August 2017
Carving up the ventral stream with Deep Synthesis
Author Affiliations
  • Anthony Norcia
    Department of Psychology, Stanford University
  • Wesley Meredith
    Department of Psychology, Stanford University
  • Guillaume Reisen
    Neuroscience Graduate Program, Stanford University
  • Daniel Yamins
    Department of Psychology, Stanford University
    Department of Computer Science, Stanford University
Journal of Vision August 2017, Vol.17, 1348. doi:
Anthony Norcia, Wesley Meredith, Guillaume Reisen, Daniel Yamins; Carving up the ventral stream with Deep Synthesis. Journal of Vision 2017;17(10):1348. doi:

© ARVO (1962-2015); The Authors (2016-present)

A prominent working hypothesis about object recognition is that an increasingly rich representation of natural images is constructed by a hierarchically organized series of processing steps. An especially efficient way to explore this hypothesis is to invert an image-computable encoding model to synthesize image metamers: pairs of images that are statistically equivalent up to a given level of statistical regularity. This idea has been usefully deployed in early visual areas, using a two-layer hierarchical model to differentiate visual area V1 from V2 (Portilla and Simoncelli, 2000; Freeman and Simoncelli, 2011). Inspired by Gatys et al. (2015), we generalized the synthetic metamer approach to probe higher levels within the ventral stream. Specifically, using a Deep Neural Network that has been shown to predict neural responses in multiple ventral stream areas (Hong et al., 2016), we created a graded series of 5 metamers with increasing fidelity to the high-level statistics of intact natural images. Using these stimuli, we measured Steady-State Visual Evoked Potentials (SSVEPs) to alternations of the original images with corresponding synthesized images drawn from each of the 5 levels of the synthesis stack. Metamerism was operationalized as the magnitude of the first harmonic of the SSVEP, a response component that quantifies the differential response between synthesized and original images. SSVEP amplitude scaled in proportion to the distance between the highest layer included in the synthesis and the original image (n = 16 adults). The phase of the SSVEP, a measure of processing delay, also showed progressively delayed responses for images drawn from increasingly higher layers of the synthesis stack, consistent with a temporal processing hierarchy that is based on the statistical complexity of the images.
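As an illustration of the two SSVEP measures named above, the following is a minimal sketch of how a first-harmonic amplitude and phase could be read out from a periodic response with a single-bin discrete Fourier transform. It is not the authors' analysis pipeline. The function names, the sampling rate, and the 3 Hz alternation rate are hypothetical (the abstract does not report them), and the sketch assumes the record spans a whole number of stimulus cycles. Converting phase to delay is also ambiguous by whole cycles, so the conversion shown holds only for lags shorter than one period.

```python
import cmath
import math


def harmonic_component(signal, sample_rate_hz, freq_hz):
    """Return (amplitude, phase_radians) of `signal` at `freq_hz`.

    Uses a single-bin DFT. Assumes the record covers an integer number
    of cycles of `freq_hz` and that `freq_hz` is below the Nyquist rate.
    For signal = A * cos(2*pi*f*t + phi) this returns (A, phi).
    """
    n = len(signal)
    coeff = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, x in enumerate(signal)
    )
    coeff *= 2.0 / n  # scale so a unit-amplitude cosine gives amplitude 1
    return abs(coeff), cmath.phase(coeff)


def phase_lag_to_delay_ms(phase_lag_rad, freq_hz):
    """Convert a phase lag (radians, positive = later) to a delay in ms.

    Valid only for lags shorter than one stimulus period.
    """
    return 1000.0 * phase_lag_rad / (2.0 * math.pi * freq_hz)


if __name__ == "__main__":
    fs, f = 500.0, 3.0  # hypothetical sampling and alternation rates
    # Simulated response: amplitude 2.0, lagging the stimulus by 0.5 rad.
    sig = [2.0 * math.cos(2.0 * math.pi * f * k / fs - 0.5) for k in range(1000)]
    amp, phase = harmonic_component(sig, fs, f)
    print(amp, phase, phase_lag_to_delay_ms(-phase, f))
```

Under this reading, a larger first-harmonic amplitude means the original and synthesized images evoke more distinct responses, and a more negative phase corresponds to a later response.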
Deep Neural Networks provide tunable generative models of images that can be used to test specific hypotheses about complex levels of processing in object recognition pathways.
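To make the notion of matching images "up to a given level" concrete, here is a minimal sketch of the Gram-matrix feature statistics used for texture synthesis by Gatys et al. (2015). That the present study matched exactly these statistics is an assumption. The abstract says only that the approach was inspired by that work. The function names and the toy feature maps are hypothetical, and a real synthesis would take the feature maps from a trained network and minimize the loss by gradient descent on the image.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of one layer's feature maps.

    `features` is a list of channels, each a flat list of activations
    over spatial positions.
    """
    return [
        [sum(a * b for a, b in zip(fi, fj)) for fj in features]
        for fi in features
    ]


def gram_loss(feats_a, feats_b):
    """Squared Gram-matrix difference for one layer, normalized as in
    Gatys et al. (2015) by 4 * N^2 * M^2 (N channels, M positions)."""
    n, m = len(feats_a), len(feats_a[0])
    ga, gb = gram_matrix(feats_a), gram_matrix(feats_b)
    sq = sum((ga[i][j] - gb[i][j]) ** 2 for i in range(n) for j in range(n))
    return sq / (4.0 * n ** 2 * m ** 2)


def stack_loss(layers_a, layers_b, top_level):
    """Sum the per-layer loss over layers 0..top_level (inclusive).

    Raising `top_level` constrains the synthesis with higher layers.
    """
    pairs = zip(layers_a[: top_level + 1], layers_b[: top_level + 1])
    return sum(gram_loss(a, b) for a, b in pairs)


if __name__ == "__main__":
    original = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
    # Same activations with spatial positions permuted across channels.
    shuffled = [[3.0, 1.0, 4.0, 2.0], [0.0, 0.0, 1.0, 1.0]]
    print(gram_loss(original, shuffled))
```

Because the Gram matrix sums over spatial positions, the spatially shuffled feature maps in the usage example have the same statistics as the original, which is the sense in which two different images can be equivalent at a given layer.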

Meeting abstract presented at VSS 2017

