Max Losch, Noor Seijdel, Kandan Ramakrishnan, Cees Snoek, H. Steven Scholte; Feature representations in networks trained with image sets of animate, inanimate or scenes differ in terms of computational filters but not in location in the brain. Journal of Vision 2016;16(12):175. doi: https://doi.org/10.1167/16.12.175.
© ARVO (1962-2015); The Authors (2016-present)
With the rise of convolutional neural networks (CNNs), computer vision models of object recognition have improved dramatically in recent years. Just like the ventral cortex, CNNs show an increase in receptive field size and in neuronal tuning as one moves up the neural or computational hierarchy. Here we trained CNNs with an AlexNet-type architecture (Krizhevsky et al., 2012) on each of three different image sets (scenes, animate, inanimate). Next we evaluated the responses in the layers of these networks, and of the original AlexNet, to 120 images selected from ImageNet (Deng et al., 2009) and Places205 (Zhou et al., 2014). Starting in the third convolutional layer, we observe a differential pattern in the features that have emerged in the networks. The original AlexNet, in contrast, has a wide range of features spanning all the other feature spaces. The features from the place network form a small cluster within this space, containing features such as building facades, ground textures and some human faces. Directly next to this cluster are features from the inanimate-trained network, which respond to elements such as textile textures, tools and objects. The features from the animate network are much more scattered and respond mainly to faces (of humans and other animals). We also measured brain responses to these images using BOLD-MRI, focusing on the ventral cortex. Using representational similarity analysis, we observed reliable correlations with these networks in LO1, LO2, VO1 and VO2, without a spatially differential pattern. These results show that specialized training results in specialized features. These features also appear to be used by the brain, but within the same general architecture.
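As an illustration of the representational similarity analysis mentioned above, the following is a minimal sketch and not the authors' pipeline. The abstract does not state which dissimilarity measure or comparison statistic was used; correlation distance and a rank (Spearman-style) correlation are assumed here. The function names, array shapes and random data are all hypothetical placeholders.

```python
import numpy as np

def rdm(responses):
    """Representational dissimilarity matrix for an (n_images, n_features) array.

    Assumption: dissimilarity is 1 - Pearson correlation between image patterns.
    """
    return 1.0 - np.corrcoef(responses)

def upper_triangle(matrix):
    """Return the off-diagonal upper triangle of a square matrix as a vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def rank(values):
    """Ordinal ranks without tie correction (adequate for continuous data)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def rsa_score(model_responses, brain_responses):
    """Rank correlation between the model RDM and the brain RDM."""
    model_vec = rank(upper_triangle(rdm(model_responses)))
    brain_vec = rank(upper_triangle(rdm(brain_responses)))
    return np.corrcoef(model_vec, brain_vec)[0, 1]

# Placeholder data: 120 images, as in the abstract; feature and voxel counts are made up.
rng = np.random.default_rng(0)
n_images = 120
layer_activations = rng.normal(size=(n_images, 256))  # hypothetical CNN layer output
roi_patterns = rng.normal(size=(n_images, 80))        # hypothetical voxel patterns for one ROI

score = rsa_score(layer_activations, roi_patterns)
print(f"RSA score (random placeholder data): {score:.3f}")
```

In practice, one such score would be computed per network layer and per region of interest (for example LO1, LO2, VO1, VO2), and the resulting values compared across networks and regions.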
Meeting abstract presented at VSS 2016