Anthony Norcia, Wesley Meredith, Guillaume Reisen, Daniel Yamins; Carving up the ventral stream with Deep Synthesis. Journal of Vision 2017;17(10):1348. doi: https://doi.org/10.1167/17.10.1348.
© ARVO (1962-2015); The Authors (2016-present)
A prominent working hypothesis about object recognition is that an increasingly rich representation of natural images is constructed by a hierarchically organized series of processing steps. An especially efficient means to explore this hypothesis is to invert an image-computable encoding model to synthesize image metamers: pairs of images that are statistically equivalent up to a given level of statistical regularity. This idea has been usefully deployed in early visual areas, using a two-layer hierarchical model to differentiate visual area V1 from V2 (Portilla and Simoncelli, 2000; Freeman and Simoncelli, 2011). Inspired by Gatys et al. (2015), we generalized the synthetic metamer approach to probe higher levels within the ventral stream. Specifically, using a Deep Neural Network that has been shown to predict neural responses in multiple ventral stream areas (Hong et al., 2016), we created a graded series of 5 metamers with increasing fidelity to the high-level statistics of intact natural images. Using these stimuli, we measured Steady-State Visual Evoked Potentials (SSVEPs) to alternations of the original images with corresponding synthesized images drawn from each of the 5 levels of the synthesis stack. Metamerism was operationalized as the magnitude of the first harmonic of the SSVEP, a response component that quantifies the differential response between synthesized and original images. SSVEP amplitude scaled in proportion to the distance between the highest layer included in the synthesis and the original image (n=16 adults). The phase of the SSVEP, a measure of processing delay, also showed progressively delayed responses for images drawn from increasingly higher layers of the synthesis stack, consistent with a temporal processing hierarchy based on the statistical complexity of the images.
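As a minimal sketch (not from the abstract, and with all signal parameters hypothetical), the first-harmonic amplitude and phase of an SSVEP can be extracted by projecting the recorded signal onto a complex exponential at the stimulus alternation frequency, i.e. a single-bin discrete Fourier transform:

```python
import numpy as np

def first_harmonic(signal, fs, f1):
    """Amplitude and phase of the response at the alternation
    frequency f1 (Hz), via a single-bin Fourier projection.
    Assumes the recording spans an integer number of cycles of f1."""
    t = np.arange(len(signal)) / fs
    coef = 2.0 / len(signal) * np.sum(signal * np.exp(-2j * np.pi * f1 * t))
    return np.abs(coef), np.angle(coef)

# Simulated recording: a 3 Hz differential response plus noise
# (fs, f1, duration, and amplitudes here are illustrative only)
fs, f1, dur = 420.0, 3.0, 2.0
t = np.arange(int(fs * dur)) / fs
rng = np.random.default_rng(0)
eeg = 1.5 * np.sin(2 * np.pi * f1 * t) + 0.1 * rng.standard_normal(t.size)

amp, phase = first_harmonic(eeg, fs, f1)
# amp recovers the 1.5 uV differential component; phase encodes delay
```

The amplitude of this component indexes how discriminable the synthesized image is from the original (zero for a true metamer), and the phase provides the processing-delay measure described above.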
Deep Neural Networks provide tunable generative models of images that can be used to test specific hypotheses about complex levels of processing in object recognition pathways.
Meeting abstract presented at VSS 2017