September 2024
Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2024
Out-of-Distribution generalization behavior of DNN-based encoding models for the visual cortex
Author Affiliations & Notes
  • Spandan Madan
    Harvard University
    Boston Children's Hospital
  • Mingran Cao
    The Francis Crick Institute
  • Will Xiao
    Harvard University
  • Hanspeter Pfister
    Harvard University
  • Gabriel Kreiman
    Harvard University
    Boston Children's Hospital
  • Footnotes
    Acknowledgements  This work has been partially supported by NSF grant IIS-1901030.
Journal of Vision September 2024, Vol.24, 1148. doi:https://doi.org/10.1167/jov.24.10.1148
Citation: Spandan Madan, Mingran Cao, Will Xiao, Hanspeter Pfister, Gabriel Kreiman; Out-of-Distribution generalization behavior of DNN-based encoding models for the visual cortex. Journal of Vision 2024;24(10):1148. https://doi.org/10.1167/jov.24.10.1148.

© ARVO (1962-2015); The Authors (2016-present)

Abstract

Deep Neural Networks (DNNs) trained for object classification develop internal feature representations that are remarkably similar to neural representations in the primate ventral visual stream. This has led to the widespread use of encoding models of the visual cortex built from linear combinations of pre-trained DNN unit activities. However, DNNs struggle to generalize under distribution shifts, particularly when faced with out-of-distribution (OOD) samples: they excel at interpolating between training data points but perform poorly when extrapolating beyond the bounds of the training data (e.g., Hasson et al., 2020). We characterized the generalization capabilities of DNN-based encoding models when predicting neuronal responses from the primate ventral visual stream. Using a large-scale dataset of responses from macaque inferior temporal cortex neurons to over 100,000 images, we simulated OOD prediction of neural activity by dividing the images into multiple training and test sets, holding out subsets of the data to introduce different OOD domain shifts. These shifts included OOD low-level image features such as contrast, hue, and size; OOD high-level features such as animate vs. inanimate, food vs. non-food, and different semantic object categories; and OOD K-means clusters in the distributed representations of ResNet features and of the neural data. For each feature, an OOD test set was constructed by parameterizing that feature and withholding a subset of its possible values from training. Overall, models performed much worse when predicting responses to out-of-distribution images than under standard cross-validation. Prediction on an IID test set with no distribution shift yielded r^2 = 0.5, while OOD prediction ranged from r^2 = 0.48 (images with an OOD contrast shift) to as low as r^2 = 0.1 (images with OOD hue). This indicates a deep problem in modern models of the visual cortex: the promise of current image-computable models remains limited to the training image distribution.
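To make the evaluation recipe concrete, the Python sketch below illustrates one way to test a linear encoding model under such a shift: fit a regularized linear readout of pretrained-DNN unit activities to neuronal responses, withhold a range of one image feature (hue, in this illustration) from training, and compare r^2 on an IID test set against the held-out OOD set. The arrays dnn_features, neural_responses, and image_hue are random placeholders standing in for the real stimulus-derived features and recorded responses, and the ridge regularizer and the 20% hue cutoff are illustrative assumptions, not the authors' exact pipeline.

# Minimal sketch of an OOD evaluation for a DNN-based encoding model.
# All data arrays are random placeholders; in practice dnn_features would be
# pretrained ResNet activations for the stimuli, neural_responses would be
# recorded IT responses, and image_hue a per-image feature value.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_units, n_neurons = 5000, 2048, 100

dnn_features = rng.normal(size=(n_images, n_units))        # (images, DNN units)
neural_responses = rng.normal(size=(n_images, n_neurons))  # (images, neurons)
image_hue = rng.uniform(0.0, 1.0, size=n_images)           # parametric feature in [0, 1)

# OOD split on the chosen feature: withhold one range of hue values entirely.
ood_mask = image_hue > 0.8                                  # e.g., hold out the top 20% of hues
X_in, y_in = dnn_features[~ood_mask], neural_responses[~ood_mask]
X_ood, y_ood = dnn_features[ood_mask], neural_responses[ood_mask]

# Within the in-distribution images, keep a standard IID test set for comparison.
X_train, X_iid_test, y_train, y_iid_test = train_test_split(
    X_in, y_in, test_size=0.2, random_state=0
)

# Linear encoding model: regularized linear readout of DNN unit activities.
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)

# Compare generalization with and without the distribution shift.
r2_iid = r2_score(y_iid_test, model.predict(X_iid_test))
r2_ood = r2_score(y_ood, model.predict(X_ood))
print(f"IID r^2: {r2_iid:.3f}   OOD (held-out hue range) r^2: {r2_ood:.3f}")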
