September 2017
Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   August 2017
Comparing response properties of V1 neurons to those of units in the early layers of a convolutional neural net
Author Affiliations
  • Dean Pospisil
    Department of Biological Structure, University of Washington
  • Wyeth Bair
    Department of Biological Structure, University of Washington
Journal of Vision August 2017, Vol.17, 804. doi:10.1167/17.10.804

      Dean Pospisil, Wyeth Bair; Comparing response properties of V1 neurons to those of units in the early layers of a convolutional neural net. Journal of Vision 2017;17(10):804. doi: 10.1167/17.10.804.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Deep convolutional neural networks (CNNs) trained for object recognition contain units in their later layers that reflect an encoding somewhat similar to that in cortical areas V4 and IT. If the earlier stages in these CNNs also reflect response properties commonly observed in V1, then CNNs could offer a compelling image-computable model for understanding computations in the ventral stream. To test this, we measured the responses of the CNN known as AlexNet (Krizhevsky et al., 2012) to sinusoidal grating stimuli like those used extensively to characterize V1. We evaluated tuning for orientation, spatial frequency (SF), and color, as well as the F1/F0 ratio and cross-orientation suppression. In a complementary approach, we directly analyzed the weights (rather than the responses) of AlexNet for these properties. We found that the early layers contain even coverage of orientation and SF, with bandwidths similar to those in V1, and that the second layer contains only complex cells. The first layer primarily consists of a group of luminance filters and a smaller group of chromatic filters, with the former tending to prefer higher spatial frequencies. Consistent with cross-orientation suppression, the second-layer weights have a sinusoidal relationship with the preferred orientation of their first-layer inputs. We also tested an untrained network and found that nearly all of these V1-like properties were absent. We conclude that a CNN can approximate several V1 response properties, and that optimizing for object recognition is sufficient to achieve these properties. Our results support the use of CNNs as a tool to understand how sophisticated cortical representations, for which we do not yet have good biologically plausible image-computable models, may arise from early cortical representations, for which a richer set of models already exists.
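
To make the F1/F0 measure mentioned above concrete, the following is a minimal sketch and not the authors' code. It drifts a one-dimensional sinusoidal grating through two toy units and computes the ratio of the first-harmonic response amplitude (F1) to the mean response (F0). Everything in it is an assumption made for illustration: the 1-D Gabor filter stands in for a learned kernel, `simple_unit` (linear filter plus rectification) and `complex_unit` (a quadrature energy model) are hypothetical stand-ins for network units, and the constants `N`, `CYCLES`, and `SIGMA` are arbitrary. The abstract does not state a classification criterion; a commonly used convention in the V1 literature treats F1/F0 above 1 as simple-like and below 1 as complex-like.

```python
import math

# Illustrative constants (assumptions, not taken from the abstract).
N = 64        # samples across the stimulus patch
CYCLES = 4    # grating cycles across the patch
SIGMA = 8.0   # width of the Gabor envelope, in samples


def grating(phase):
    """1-D sinusoidal luminance profile at the given phase (radians)."""
    return [math.sin(2 * math.pi * CYCLES * i / N + phase) for i in range(N)]


def gabor(phase):
    """1-D Gabor filter centered on the patch; a stand-in for a learned kernel."""
    c = (N - 1) / 2
    return [
        math.exp(-((i - c) ** 2) / (2 * SIGMA ** 2))
        * math.cos(2 * math.pi * CYCLES * (i - c) / N + phase)
        for i in range(N)
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def simple_unit(stim):
    """Hypothetical simple-like unit: linear filter, then half-wave rectification."""
    return max(0.0, dot(stim, gabor(0.0)))


def complex_unit(stim):
    """Hypothetical complex-like unit: energy of a quadrature filter pair."""
    even = dot(stim, gabor(0.0))
    odd = dot(stim, gabor(math.pi / 2))
    return even ** 2 + odd ** 2


def f1_f0(unit, steps=32):
    """F1/F0 of a unit's response over one full drift cycle of the grating."""
    phases = [2 * math.pi * k / steps for k in range(steps)]
    resp = [unit(grating(p)) for p in phases]
    f0 = sum(resp) / steps  # mean response
    # Amplitude of the first Fourier harmonic of the response.
    re = 2 / steps * sum(r * math.cos(p) for r, p in zip(resp, phases))
    im = 2 / steps * sum(r * math.sin(p) for r, p in zip(resp, phases))
    f1 = math.hypot(re, im)
    return f1 / f0 if f0 > 0 else float("nan")


if __name__ == "__main__":
    print("simple-like  F1/F0:", round(f1_f0(simple_unit), 3))
    print("complex-like F1/F0:", round(f1_f0(complex_unit), 3))
```

For an ideal half-wave-rectified sinusoid the ratio is π/2, so the rectified unit should land well above 1, while the phase-invariant energy unit should land near 0. Applying the same measure to a real network would mean replacing the toy units with the activation of a chosen unit in response to two-dimensional drifting gratings.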

Meeting abstract presented at VSS 2017
