September 2017
Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   August 2017
Convolutional neural networks best predict representational dissimilarity in scene-selective cortex: comparing computational, object and functional models
Author Affiliations
  • Iris Groen
    Laboratory of Brain and Cognition, National Institute of Mental Health
  • Michelle Greene
    Department of Computer Science, Stanford University
  • Christopher Baldassano
    Princeton Neuroscience Institute, Princeton University
  • Li Fei-Fei
    Department of Computer Science, Stanford University
  • Diane Beck
    Department of Psychology and Beckman Institute, University of Illinois at Urbana-Champaign
  • Christopher Baker
    Laboratory of Brain and Cognition, National Institute of Mental Health
Journal of Vision August 2017, Vol. 17, 1088. https://doi.org/10.1167/17.10.1088
Abstract

Complex scene perception is characterized by the activation of scene-selective regions PPA, OPA and MPA/RSC. So far, these regions have been mostly interpreted as representing visual characteristics of scenes, such as their constituent objects ("an oven"), spatial layout ("a closed space"), or surface textures ("wood and granite"). Recent behavioral evidence, however, suggests that the functions afforded by a scene ("Could I prepare food here?") play a central role in how scenes are understood (Greene et al., 2016). Here, we used a model-based approach to study how the brain represents scene functions. Healthy volunteers (n=20) viewed exemplars from 30 scene categories in an ultra-high-field 7T MRI scanner. Stimuli were carefully selected from a larger set of scenes characterized in terms of their visual properties (derived computationally using a convolutional neural network, CNN), object occurrence, and scene function (derived using separate behavioral experiments), such that each model predicted a maximally different pattern of brain responses. Variance partitioning on multi-voxel response patterns showed that the CNN model best predicted responses in scene-selective regions, with limited additional contribution from the other models. Representations in scene-selective regions correlated best with higher CNN layers; however, responses in PPA and OPA, but not MPA/RSC, also correlated with lower layers. A whole-brain analysis showed that the CNN model contribution was restricted to scene-selective cortex, while the functional model selectively predicted responses in a posterior left-lateralized region associated with action representation. These results show that (high-level) visual properties predict responses in scene-selective regions better than functional properties. However, understanding scene functions may engage regions other than those identified based on scene selectivity. Further research is needed to determine whether scene functions are better captured by regions outside the scene network, or are perhaps better thought of as semantic affordances mediated by visual representations in the higher layers of the CNN.
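The sketch below illustrates the general logic of comparing model and brain representational dissimilarity matrices (RDMs) and partitioning variance across models, as described above. It is a minimal, self-contained Python example: the array names, shapes, random placeholder data, and the use of correlation distance with an ordinary least squares fit are illustrative assumptions, not details taken from the study itself.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(features):
    # Condensed representational dissimilarity matrix:
    # 1 - Pearson correlation between every pair of condition patterns.
    return pdist(features, metric="correlation")

def r_squared(predictors, target):
    # Explained variance from an ordinary least squares fit of the target
    # RDM on one or more predictor RDMs (plus an intercept).
    X = np.column_stack([np.ones(len(target))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    residuals = target - X @ beta
    return 1 - residuals.var() / target.var()

rng = np.random.default_rng(0)
n_categories = 30  # 30 scene categories, as in the study

# Hypothetical data: multi-voxel patterns for one ROI and feature vectors
# for each candidate model (random placeholders, one row per category).
brain = rng.normal(size=(n_categories, 500))
models = {
    "cnn": rng.normal(size=(n_categories, 4096)),     # CNN layer activations
    "object": rng.normal(size=(n_categories, 100)),   # object-occurrence vectors
    "function": rng.normal(size=(n_categories, 200)), # behavioral function ratings
}

brain_rdm = rdm(brain)
model_rdms = {name: rdm(feat) for name, feat in models.items()}

# Representational similarity analysis: rank-correlate each model RDM
# with the brain RDM.
for name, m in model_rdms.items():
    rho, _ = spearmanr(m, brain_rdm)
    print(f"{name} model: Spearman rho = {rho:.3f}")

# Variance partitioning: the unique contribution of each model is the drop
# in explained variance when that model is removed from the full regression.
full_r2 = r_squared(model_rdms.values(), brain_rdm)
for name in model_rdms:
    reduced_r2 = r_squared(
        [m for other, m in model_rdms.items() if other != name], brain_rdm
    )
    print(f"unique variance explained by {name} model: {full_r2 - reduced_r2:.3f}")

The actual study compared these quantities across scene-selective ROIs, across the whole brain, and across CNN layers; the sketch only shows the core RDM and regression computations.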

Meeting abstract presented at VSS 2017
