Vision Sciences Society Annual Meeting Abstract | September 2019
Assessing the similarity of cortical object and scene representations through cross-validated voxel encoding models
Author Affiliations & Notes
  • Nicholas M. Blauch
    Center for the Neural Basis of Cognition, Carnegie Mellon University
  • Filipe De Avila Belbute Peres
    Computer Science Department, Carnegie Mellon University
  • Juhi Farooqui
    Center for the Neural Basis of Cognition, Carnegie Mellon University
  • Alireza Chaman Zar
    Department of Electrical and Computer Engineering, Carnegie Mellon University
  • David Plaut
    Center for the Neural Basis of Cognition, Carnegie Mellon University
    Department of Psychology, Carnegie Mellon University
  • Marlene Behrmann
    Center for the Neural Basis of Cognition, Carnegie Mellon University
    Department of Psychology, Carnegie Mellon University
Journal of Vision September 2019, Vol. 19, 188d. https://doi.org/10.1167/19.10.188d
Abstract

Object and scene perception are instantiated in overlapping networks of cortical regions, including three scene-selective areas in parahippocampal, occipital, and medial parietal cortex (PPA, OPA, and MPA) and a lateral occipital cortical area (LOC) selective for intact objects. The exact contributions of these regions to object and scene perception remain unknown. Here, we leverage BOLD5000 (Chang et al., 2018), a public fMRI dataset containing responses to ~5,000 images drawn from the ImageNet, COCO, and Scenes databases, to better understand the roles of these regions in visual perception. These databases vary in the degree to which their images focus on single objects, a few objects, or whole scenes, respectively. We build voxel encoding models based on features from a deep convolutional neural network (DCNN) and assess how these models generalize when trained and tested on all combinations of the ImageNet, COCO, and Scenes databases. As predicted, for most DCNN layer/ROI encoding models we find good generalization between models trained and tested on ImageNet and COCO, and poor generalization from ImageNet/COCO-trained models to Scenes. Surprisingly, generalization from ImageNet/COCO to Scenes emerges only in early visual cortex, with encoding models of intermediate DCNN layers. Additionally, LOC and PPA exhibit similarly good generalization between ImageNet and COCO and similarly poor generalization to Scenes. Excluding MPA responses to Scenes, all scene-selective areas generalize well to held-out data from the trained image database, but PPA exhibits the most robust out-of-database generalization between ImageNet and COCO, reflecting a more general perceptual role. Our work reflects a novel application of encoding models in neuroscience, in which distinct stimulus sets are used for training and testing in order to assess the similarity of the representations underlying those stimuli. We plan to further test the effect of pretraining the DCNN on Places365 rather than ImageNet, and to examine image-level predictors of generalization.
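To make the analysis concrete, below is a minimal sketch of the cross-database generalization procedure the abstract describes. The abstract does not specify the regression used to fit the voxel encoding models, so ridge regression is an assumption here, and the `features` and `responses` arrays are synthetic placeholders standing in for precomputed DCNN layer activations and ROI voxel responses, not actual BOLD5000 data.

import numpy as np
from sklearn.linear_model import Ridge

def fit_encoding_model(X_train, Y_train, alpha=1.0):
    # Ridge regression from DCNN features (n_images x n_features)
    # to voxel responses (n_images x n_voxels).
    model = Ridge(alpha=alpha)
    model.fit(X_train, Y_train)
    return model

def voxelwise_correlation(Y_true, Y_pred):
    # Pearson correlation between measured and predicted responses, per voxel.
    Yt = Y_true - Y_true.mean(axis=0)
    Yp = Y_pred - Y_pred.mean(axis=0)
    num = (Yt * Yp).sum(axis=0)
    den = np.sqrt((Yt ** 2).sum(axis=0) * (Yp ** 2).sum(axis=0))
    return num / den

databases = ["imagenet", "coco", "scenes"]

# Synthetic stand-ins: replace with real DCNN features and ROI responses.
rng = np.random.default_rng(0)
features = {db: rng.standard_normal((200, 512)) for db in databases}
responses = {db: rng.standard_normal((200, 100)) for db in databases}

generalization = {}
for train_db in databases:
    model = fit_encoding_model(features[train_db], responses[train_db])
    for test_db in databases:
        # Note: for train_db == test_db, held-out cross-validation splits
        # should be used rather than in-sample prediction.
        r = voxelwise_correlation(responses[test_db],
                                  model.predict(features[test_db]))
        generalization[(train_db, test_db)] = np.nanmean(r)

With real data, the within-database (diagonal) entries would come from held-out cross-validation folds, while the off-diagonal entries index out-of-database generalization, i.e., the degree to which the representations underlying the two stimulus sets are similar.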
