October 2020
Volume 20, Issue 11
Open Access
Vision Sciences Society Annual Meeting Abstract  |   October 2020
Using task-optimized neural networks to understand why brains have specialized processing for faces
Author Affiliations & Notes
  • Katharina Dobs
    Massachusetts Institute of Technology
  • Alexander JE Kell
    Columbia University
  • Julio Martinez
    Massachusetts Institute of Technology
  • Michael Cohen
    Massachusetts Institute of Technology
    Amherst College
  • Nancy Kanwisher
    Massachusetts Institute of Technology
  • Footnotes
    Acknowledgements  This work was supported by a Feodor Lynen Fellowship of the Humboldt Foundation to K.D., NIH grant DP1HD091947 to N.K., and the National Science Foundation Science and Technology Center for Brains, Minds, and Machines.
Journal of Vision October 2020, Vol.20, 660. doi:https://doi.org/10.1167/jov.20.11.660

      Katharina Dobs, Alexander JE Kell, Julio Martinez, Michael Cohen, Nancy Kanwisher; Using task-optimized neural networks to understand why brains have specialized processing for faces. Journal of Vision 2020;20(11):660. doi: https://doi.org/10.1167/jov.20.11.660.

      © ARVO (1962-2015); The Authors (2016-present)


Category-selective regions are a prominent feature of the ventral visual pathway. Why is there specialization for some categories (e.g., faces, scenes), but not others (e.g., food, cars)? And why does functional specialization arise in the first place? Here, we used deep convolutional neural networks (CNNs) to test the hypothesis that face-specific regions are segregated from object regions because face and object recognition require different representations and computations. We trained two separate AlexNet networks to categorize either faces or objects. The face-trained CNN significantly outperformed the object-trained CNN on face categorization on held-out identities and vice versa, demonstrating that the representations optimized for one task are suboptimal for the other. To determine whether representations could be learned to simultaneously support both tasks, we trained dual-task CNNs with a branched architecture, varying the number of layers that were shared between face and object tasks (i.e., early vs. late branch points; Kell et al., 2018). We found that dual-task networks sharing late layers performed worse than CNNs trained on only faces or only objects. However, dual-task networks sharing only early processing stages, presumably like the primate visual system, showed no cost of sharing. Do these results generalize to architectures with larger capacity? We trained VGG16 networks on the same tasks. Surprisingly, in this case, even the fully-shared dual-task CNN performed as well as the separate networks. However, lesion experiments showed that segregation of face and object processing had emerged spontaneously in the dual-task network. Critically, a dual-task network optimized for food and object categorization showed less task segregation. 
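The branched dual-task setup described above can be sketched in a few lines. This is an illustrative toy model, not the authors' code: the networks in the study were AlexNet and VGG16 trained on images, whereas here the layers are random weight matrices and `branch_point` simply controls how many layers the face and object pathways share before splitting.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def make_layers(sizes):
    """Random weight matrices standing in for trained layers."""
    return [rng.standard_normal((m, n)) * 0.1 for m, n in zip(sizes, sizes[1:])]

class BranchedNet:
    """Toy dual-task network: layers up to `branch_point` are shared,
    the remaining layers are duplicated into face- and object-specific
    branches (early branch point = little sharing, late = much sharing)."""
    def __init__(self, sizes, branch_point):
        self.shared = make_layers(sizes[: branch_point + 1])
        self.face_branch = make_layers(sizes[branch_point:])
        self.obj_branch = make_layers(sizes[branch_point:])

    def forward(self, x):
        for w in self.shared:          # common early processing
            x = relu(x @ w)
        f, o = x, x                    # pathways split here
        for w in self.face_branch:
            f = relu(f @ w)
        for w in self.obj_branch:
            o = relu(o @ w)
        return f, o                    # face logits, object logits

sizes = [64, 32, 32, 16, 10]                # illustrative layer widths
net = BranchedNet(sizes, branch_point=2)    # share the first two layers
face_out, obj_out = net.forward(rng.standard_normal(64))
```

Sweeping `branch_point` from early to late layers reproduces the manipulation in the abstract: each setting yields a different division of shared versus task-specific capacity, which can then be compared on held-out face and object categorization performance.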
These results suggest that functional specialization in the brain exists for faces but not for food because food and object categorization can be performed by relying on common representations while face and object recognition rely on inherently different computations.
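The lesion logic behind the segregation claim can be sketched as follows: silence a subset of units and compare the resulting performance drop on each task; if removing one set of units impairs face but not object recognition, the two computations have segregated. In this toy version (an assumption for illustration, not the study's analysis), face and object information is placed in disjoint unit groups by construction, and "performance" is an arbitrary linear readout:

```python
import numpy as np

rng = np.random.default_rng(1)

n_units = 20
# Toy hidden layer: by construction, units 0-9 carry face information
# and units 10-19 carry object information.
w_face = np.zeros(n_units); w_face[:10] = 1.0
w_obj = np.zeros(n_units); w_obj[10:] = 1.0
hidden = np.abs(rng.standard_normal(n_units))  # nonnegative activations

def score(mask, readout):
    """Stand-in for task performance after lesioning (zeroing) masked units."""
    return float((hidden * mask) @ readout)

intact = np.ones(n_units)
lesion_face_units = intact.copy()
lesion_face_units[:10] = 0.0  # silence the putatively face-selective units

face_drop = score(intact, w_face) - score(lesion_face_units, w_face)
obj_drop = score(intact, w_obj) - score(lesion_face_units, w_obj)
# Lesioning the face-selective units hurts the face readout but leaves the
# object readout untouched -- the signature of task segregation.
```

In the study, the analogous double dissociation emerged spontaneously in the fully shared VGG16 trained on faces and objects, but was weaker in the network trained on food and objects.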

