Journal of Vision, September 2017, Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract | August 2017
Map-CNN: A Convolutional Neural Network with Map-like Organizations
Author Affiliations
  • Chen-Ping Yu
    Psychology, Harvard University
  • Talia Konkle
    Psychology, Harvard University
Journal of Vision August 2017, Vol. 17, 809. https://doi.org/10.1167/17.10.809

      Chen-Ping Yu, Talia Konkle; Map-CNN: A Convolutional Neural Network with Map-like Organizations. Journal of Vision 2017;17(10):809. https://doi.org/10.1167/17.10.809.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Deep convolutional neural networks (CNNs) are currently the best computational models of visual processing. A core operation of these models is convolution: each artificial neuron of a CNN sweeps through the entire input image to produce a response profile. In contrast, neurons in the visual cortex have receptive fields tuned to particular features at particular locations, though a common assumption is that a small set of features is replicated in hypercolumns uniformly across all positions in the retinotopic map. Here we examined this assumption using a computational model with map-like early layers. We constructed a map-CNN in which the artificial neurons in the map layer have a spatial organization and receptive field scaling similar to human V1. First, retinotopy was implemented with local convolutions of unshared weights, with neurons organized in a grid-like layout. Second, a retina-like transformation was applied to the input image, such that images are compressed with increasing distance from the center. The combination of these designs naturally captures both cortical magnification of the fovea and the scaling of receptive field size with eccentricity. Finally, the network was trained on 1000-way object classification using the ImageNet dataset. We found that the features learned at each position of the visual field were not uniform, violating the convolutional assumption of uniform feature representation across the visual field. Explorations of these tunings showed that foveal map units (< 5°) had more Gaussian-blob tuning than peripheral map units, and that while edge filters were learned uniformly across the visual field, the orientations of those edge features exhibited substantial positional biases. These results demonstrate that features learned from natural image statistics to support successful object recognition are naturally heterogeneous across the visual field, and they yield testable predictions for the spatial distribution of feature tuning in retinotopic areas.

Meeting abstract presented at VSS 2017
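
The abstract names two architectural ingredients, local convolutions with unshared weights and a retina-like compression of the input, but gives no implementation details. The sketch below is a minimal PyTorch illustration of both under stated assumptions: the power-law radial compression, the names retinal_warp and LocallyConnected2d, the exponent alpha, and all layer sizes are hypothetical choices for illustration, not the authors' code.

```python
import torch
import torch.nn.functional as F


def retinal_warp(img, out_size=128, alpha=2.0):
    """Resample an image so sampling density falls with eccentricity.

    Output pixels at normalized radius r (0 = center) read the input at
    radius r**alpha, so the center is oversampled (foveal magnification)
    and the periphery is compressed. The exponent alpha is an
    illustrative assumption, not a value taken from the abstract.
    """
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, out_size),
        torch.linspace(-1.0, 1.0, out_size),
        indexing="ij",
    )
    r = torch.sqrt(xs ** 2 + ys ** 2).clamp(min=1e-6)
    scale = r.clamp(max=1.0) ** alpha / r          # radial remapping factor
    grid = torch.stack((xs * scale, ys * scale), dim=-1)
    grid = grid.unsqueeze(0).expand(img.shape[0], -1, -1, -1)
    return F.grid_sample(img, grid, align_corners=True)


class LocallyConnected2d(torch.nn.Module):
    """Convolution-like layer with unshared weights.

    Unlike a standard convolution, every output position owns its own
    filter bank, so each "map unit" is free to learn position-specific
    tuning, which is the property that lets a map-CNN test the
    uniform-feature assumption.
    """

    def __init__(self, in_ch, out_ch, in_size, kernel, stride=1):
        super().__init__()
        self.kernel, self.stride = kernel, stride
        self.out_size = (in_size - kernel) // stride + 1
        n_pos = self.out_size ** 2
        # One independent filter bank per spatial position.
        self.weight = torch.nn.Parameter(
            0.01 * torch.randn(n_pos, out_ch, in_ch * kernel * kernel))
        self.bias = torch.nn.Parameter(torch.zeros(n_pos, out_ch))

    def forward(self, x):
        # (B, in_ch*k*k, L): every kernel-sized patch, L = out_size**2
        patches = F.unfold(x, self.kernel, stride=self.stride)
        patches = patches.transpose(1, 2)          # (B, L, in_ch*k*k)
        # Per-position matmul: location l applies its own filters.
        out = torch.einsum("blc,loc->blo", patches, self.weight) + self.bias
        return out.transpose(1, 2).reshape(
            x.shape[0], -1, self.out_size, self.out_size)


# Example: warp a batch of images, then apply one unshared "map" layer.
imgs = torch.randn(4, 3, 224, 224)
warped = retinal_warp(imgs, out_size=128)          # (4, 3, 128, 128)
map_layer = LocallyConnected2d(3, 16, in_size=128, kernel=9, stride=4)
responses = map_layer(warped)                      # (4, 16, 30, 30)
```

Because the weight tensor indexes filters by spatial position, inspecting it after ImageNet-scale training is what would expose the position-dependent tuning the abstract reports: more Gaussian-blob filters near the fovea, and positional biases in edge-filter orientation toward the periphery.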
