December 2022
Volume 22, Issue 14
Open Access
Vision Sciences Society Annual Meeting Abstract  |   December 2022
How many non-linear computations are required for CNNs to account for the response properties of the primary visual cortex (V1)?
Author Affiliations & Notes
  • Hui-Yuan Miao
    Department of Psychology, Vanderbilt University
  • Hojin Jang
    Department of Psychology, Vanderbilt University
  • Frank Tong
    Department of Psychology, Vanderbilt University
    Vanderbilt Vision Research Center
  • Footnotes
    Acknowledgements  Supported by NIH R01EY029278 grant to FT.
Journal of Vision December 2022, Vol.22, 4172. doi:https://doi.org/10.1167/jov.22.14.4172

      Hui-Yuan Miao, Hojin Jang, Frank Tong; How many non-linear computations are required for CNNs to account for the response properties of the primary visual cortex (V1)?. Journal of Vision 2022;22(14):4172. https://doi.org/10.1167/jov.22.14.4172.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

While the primary visual cortex (V1) is arguably the best understood visual area, we still do not fully understand its computational mechanisms. Traditional models propose that V1 neurons behave like Gabor filters, exhibiting selectivity for orientation and spatial frequency, followed by additional non-linear processes such as half-wave rectification, suppression, and/or divisive normalization. Convolutional neural networks (CNNs), which can simulate complex tuning functions, provide an alternative way to fit V1 data without relying on hand-designed filters. A recent study by Cadena et al. (2019) used the layer-wise activity of VGG-19 to predict V1 neuronal responses of monkeys that viewed thousands of natural and synthesized images. Surprisingly, the best V1 predictions were not obtained in the lowest layers, but rather after multiple convolutional and max-pooling operations, leading the authors to conclude that V1 relies on far more non-linear computations than previously thought. However, a potential concern is that the lower layers of VGG-19 have small convolutional filters, whereas the images used to evaluate VGG-19 performance were comparatively large. Thus, we suspected that the poor performance of the lower layers of VGG-19 may have been driven by the large input size relative to these filters. To address this issue, we evaluated the performance of AlexNet, which has much larger receptive fields in its lower layers. In contrast to VGG-19, we found that the first convolutional layer of AlexNet best predicted V1 responses. A control analysis revealed that the best-performing layer of VGG-19 shifted systematically to lower layers after the input images were rescaled to a smaller size. We further showed that a modified version of AlexNet could match the predictive performance of VGG-19 after just a few non-linear computations.
Overall, our findings demonstrate that the response properties of V1 neurons can be well explained by relatively few non-linear computations while confirming that CNNs outperform traditional V1 Gabor filter models.
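
The receptive-field argument above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis code: it assumes the standard published kernel sizes and strides for the early layers of VGG-19 and AlexNet (the abstract itself does not list them), and the function and variable names are hypothetical. It computes the theoretical receptive field, in input pixels, of a unit at each layer.

```python
def receptive_fields(layers):
    """Return [(layer_name, receptive_field_px)] for a stack of conv/pool layers.

    Each layer is (name, kernel_size, stride). Padding is ignored because it
    does not change the theoretical receptive field size; dilation is assumed
    to be 1 throughout.
    """
    rf, jump = 1, 1  # receptive field and cumulative stride, in input pixels
    result = []
    for name, kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra kernel tap spans `jump` input pixels
        jump *= stride             # striding widens the spacing between units
        result.append((name, rf))
    return result


# Assumed layer specs from the standard architectures (not given in the abstract).
VGG19_EARLY = [
    ("conv1_1", 3, 1), ("conv1_2", 3, 1), ("pool1", 2, 2),
    ("conv2_1", 3, 1), ("conv2_2", 3, 1), ("pool2", 2, 2),
    ("conv3_1", 3, 1),
]
ALEXNET_EARLY = [("conv1", 11, 4), ("pool1", 3, 2), ("conv2", 5, 1)]

if __name__ == "__main__":
    for label, spec in (("VGG-19", VGG19_EARLY), ("AlexNet", ALEXNET_EARLY)):
        for name, rf in receptive_fields(spec):
            print(f"{label:8s}{name:8s} receptive field = {rf} x {rf} px")
```

Under these assumptions, a unit in AlexNet's first convolutional layer already covers an 11 x 11 pixel patch, whereas a VGG-19 unit needs several convolution and pooling stages to cover a comparable area. This is consistent with the abstract's observation that shrinking the input images shifts the best-performing VGG-19 layer toward lower layers.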
