August 2016
Volume 16, Issue 12
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2016
Feature representations in networks trained with image sets of animate, inanimate or scenes differ in terms of computational filters but not in location in the brain
Author Affiliations
  • Max Losch
    Brain & Cognition, Psychology, University of Amsterdam
  • Noor Seijdel
    Brain & Cognition, Psychology, University of Amsterdam
  • Kandan Ramakrishnan
    Brain & Cognition, Psychology, University of Amsterdam
  • Cees Snoek
    Institute of Informatics, University of Amsterdam
  • H. Steven Scholte
    Brain & Cognition, Psychology, University of Amsterdam
Journal of Vision September 2016, Vol.16, 175. doi:10.1167/16.12.175
      © 2017 Association for Research in Vision and Ophthalmology.

Abstract

With the rise of convolutional neural networks (CNNs), computer vision models of object recognition have improved dramatically in recent years. Like the ventral cortex, CNNs show an increase in receptive field size and in neuronal tuning as one moves up the neural or computational hierarchy. Here we trained a CNN with an AlexNet-type architecture (Krizhevsky et al., 2012) on three different image sets (scenes, animate, inanimate). Next, we evaluated the responses in the layers of these networks, and of the original AlexNet, to 120 images selected from ImageNet (Deng et al., 2009) and Places205 (Zhou et al., 2014). We observe, starting in the third convolutional layer, a differential pattern in the features that have emerged from the networks. The original AlexNet, in contrast, has a wide range of features spanning all other feature spaces. The features from the place network form a small cluster within this space, containing features such as building facades, ground textures and some human faces. Directly next to this cluster are features from the inanimate-trained network that respond to elements such as textile textures, tools and objects. The features from the animate network are much more scattered and respond mainly to faces (of humans and other animals). We also evaluated the brain responses to these images using BOLD-MRI, focusing on the ventral cortex. Using representational similarity analysis, we observed reliable correlations with these networks in LO1, LO2, VO1 and VO2, without a spatially differential pattern. These results show that specialized trained networks result in specialized features. These features appear to also be used by the brain, but within the same general architecture.
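
For readers unfamiliar with representational similarity analysis, the comparison between network layers and brain regions can be illustrated with a minimal sketch. This is not the authors' pipeline: the data below are synthetic, the array sizes other than the 120 images are arbitrary, and the names `layer_activations`, `roi_patterns` and `rsa_score` are hypothetical. The sketch assumes correlation distance for the dissimilarity matrices and a rank (Spearman-style) correlation between them, which are common choices but are not stated in the abstract.

```python
import numpy as np


def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between image patterns.

    patterns has shape (n_images, n_features); the result is (n_images, n_images).
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries of a square matrix as a vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def rank(values):
    """Ordinal ranks of a 1-D array (ties are not averaged; fine for continuous data)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def rsa_score(patterns_a, patterns_b):
    """Rank correlation between the dissimilarity structures of two pattern sets."""
    a = rank(upper_triangle(rdm(patterns_a)))
    b = rank(upper_triangle(rdm(patterns_b)))
    return np.corrcoef(a, b)[0, 1]


# Synthetic stand-ins (assumptions): 120 images, 256 network units, 80 voxels.
rng = np.random.default_rng(0)
n_images = 120
layer_activations = rng.normal(size=(n_images, 256))

# A simulated region whose voxels are a noisy linear readout of the layer,
# so it shares part of the layer's representational geometry.
readout = rng.normal(size=(256, 80))
roi_patterns = layer_activations @ readout + rng.normal(size=(n_images, 80))

# A simulated region with no relation to the layer.
unrelated_patterns = rng.normal(size=(n_images, 80))

related_score = rsa_score(layer_activations, roi_patterns)
unrelated_score = rsa_score(layer_activations, unrelated_patterns)
print(f"related region:   {related_score:.3f}")
print(f"unrelated region: {unrelated_score:.3f}")
```

In an analysis of the kind described above, `layer_activations` would be replaced by the responses of one network layer to the 120 images and `roi_patterns` by the BOLD response patterns in one region such as LO1, with the score computed per layer, per network and per region.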

Meeting abstract presented at VSS 2016
