December 2022
Volume 22, Issue 14
Open Access
Vision Sciences Society Annual Meeting Abstract
Not so fast: Limited validity of deep convolutional neural networks as in silico models for human naturalistic face processing
Author Affiliations & Notes
  • Guo Jiahui
    Center for Cognitive Neuroscience, Dartmouth College, NH, USA 03755
  • Ma Feilong
    Center for Cognitive Neuroscience, Dartmouth College, NH, USA 03755
  • Matteo Visconti di Oleggio Castello
    Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA 94720
  • Samuel A. Nastase
    Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA 08544
  • James V. Haxby
    Center for Cognitive Neuroscience, Dartmouth College, NH, USA 03755
  • M. Ida Gobbini
    Cognitive Science, Dartmouth College, NH, USA 03755
    Dipartimento di Medicina Specialistica, Diagnostica e Sperimentale, Università di Bologna, Bologna, Italy 40138
  • Footnotes
    Acknowledgements: This work was supported by NSF grants 1607845 (J.V.H.) and 1835200 (M.I.G.).
Journal of Vision December 2022, Vol.22, 3714. doi:https://doi.org/10.1167/jov.22.14.3714



      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Deep convolutional neural networks (DCNNs) trained for face identification can rival and even exceed human-level performance. The relationships between internal representations learned by DCNNs and those of the primate face processing system are not well understood, especially in naturalistic settings. We developed the largest naturalistic dynamic face stimulus set in human neuroimaging research (700+ naturalistic video clips of unfamiliar faces) and used representational similarity analysis to investigate how well the representations learned by high-performing DCNNs match human brain representations across the entire distributed face processing system. DCNN representational geometries were strikingly consistent across diverse architectures and captured meaningful variance among faces. Similarly, representational geometries throughout the human face network were highly consistent across subjects. Nonetheless, correlations between DCNN and neural representations were very weak overall—DCNNs captured 3% of variance in the neural representational geometries at best. Intermediate DCNN layers better matched visual and face-selective cortices than the final fully-connected layers. Behavioral ratings of face similarity were highly correlated with intermediate layers of DCNNs, but also failed to capture representational geometry in the human brain. Our results suggest that the correspondence between intermediate DCNN layers and neural representations of naturalistic human face processing is weak at best, and diverges even further in the later fully-connected layers. This poor correspondence can be attributed, at least in part, to the dynamic and cognitive information that plays an essential role in human face processing but is not modeled by DCNNs. These mismatches indicate that current DCNNs have limited validity as in silico models of dynamic, naturalistic face processing in humans.
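As a rough illustration of the representational similarity analysis described above, the sketch below builds a representational dissimilarity matrix (RDM) for two systems and correlates their off-diagonal entries. It is not the authors' pipeline: the correlation-distance metric, the use of Pearson correlation to compare RDMs, the function names, and the synthetic "model" and "brain" features are all assumptions made for illustration. If variance explained is read as the squared RDM correlation, a value near 3% would correspond to a correlation of roughly 0.17.

```python
import numpy as np


def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson r between stimulus rows."""
    return 1.0 - np.corrcoef(features)


def upper_triangle(matrix):
    """Vectorize the off-diagonal upper triangle of a square matrix."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def rdm_similarity(features_a, features_b):
    """Pearson correlation between the RDMs of two systems (upper triangles only)."""
    a = upper_triangle(rdm(features_a))
    b = upper_triangle(rdm(features_b))
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic data only (hypothetical): rows are stimuli, columns are units or voxels.
rng = np.random.default_rng(0)
n_stimuli = 40
shared = rng.standard_normal((n_stimuli, 20))  # latent structure common to both systems

# Two different noisy linear readouts of the shared structure.
model_features = shared @ rng.standard_normal((20, 100)) + rng.standard_normal((n_stimuli, 100))
brain_features = shared @ rng.standard_normal((20, 60)) + 5.0 * rng.standard_normal((n_stimuli, 60))

r = rdm_similarity(model_features, brain_features)
variance_explained = r ** 2  # squared correlation, one common reading of "variance captured"
print(f"RDM correlation r = {r:.3f}, variance explained = {variance_explained:.3f}")
```

Comparing RDMs rather than raw features sidesteps the fact that network units and cortical voxels have no one-to-one mapping: only the pairwise geometry among stimuli is compared.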
