Vision Sciences Society Annual Meeting Abstract | September 2019
Open Access
Deep networks trained to recognize facial expressions spontaneously develop representations of face identity
Author Affiliations & Notes
  • Kathryn C O’Nell
    Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
  • Rebecca Saxe
    Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
  • Stefano Anzellotti
    Department of Psychology, Boston College
Journal of Vision September 2019, Vol. 19, 262. doi: https://doi.org/10.1167/19.10.262
Kathryn C O’Nell, Rebecca Saxe, Stefano Anzellotti; Deep networks trained to recognize facial expressions spontaneously develop representations of face identity. Journal of Vision 2019;19(10):262. https://doi.org/10.1167/19.10.262.

© ARVO (1962-2015); The Authors (2016-present)

Abstract

According to the dominant account of face processing, recognition of emotional expressions is implemented by the superior temporal sulcus (STS), while recognition of face identity is implemented by inferior temporal cortex (IT) (Haxby et al., 2000). However, recent patient and imaging studies (Fox et al., 2011; Anzellotti et al., 2017) found that the STS also encodes information about identity. Jointly representing expression and identity might be computationally advantageous: learning to recognize expressions could lead to the emergence of representations that support identity recognition. To test this hypothesis, we trained a deep densely connected convolutional network (DenseNet; Huang et al., 2017) to classify face images from the fer2013 dataset as angry, disgusted, afraid, happy, sad, surprised, or neutral. We then froze the weights of the DenseNet and trained linear layers attached to progressively deeper layers of the network to classify either emotion or identity using a subset of the Karolinska (KDEF) dataset. Finally, we tested emotion and identity classification on held-out KDEF images that were not used for training. Classification accuracy for emotions in the KDEF dataset increased from early to late layers of the DenseNet, indicating successful transfer across datasets. Critically, classification accuracy for identity also increased from early to late layers, even though the network had never been trained to classify identity. A linear layer trained on the DenseNet features vastly outperformed a linear layer trained on raw pixels (98.8% vs. 68.7%), demonstrating that the high accuracy obtained with the DenseNet features cannot be explained by low-level confounds. These results show that learning to recognize facial expressions can lead to the spontaneous emergence of representations that support the recognition of identity, thus offering a principled computational account for the discovery of expression and identity representations within the same portion of the STS.
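
The core analysis described above is a linear-probe procedure. The sketch below (PyTorch; a hypothetical illustration, not the authors' code) shows the idea under stated assumptions: freeze a DenseNet already trained on fer2013 expressions, capture activations at a chosen layer with a forward hook, and fit only a linear readout on those frozen features. The names expression_densenet and kdef_loader are assumed placeholders for the trained network and a KDEF data loader.

    # Minimal sketch of the linear-probe analysis described in the abstract.
    # Assumptions (not from the original): `expression_densenet` is a DenseNet
    # already trained on fer2013 expressions, and `kdef_loader` yields
    # (image, label) batches from KDEF with emotion or identity labels.
    import torch
    import torch.nn as nn

    def train_linear_probe(frozen_net, feature_layer, loader, num_classes, epochs=10):
        # Freeze the expression-trained network; only the probe will learn.
        for p in frozen_net.parameters():
            p.requires_grad = False
        frozen_net.eval()

        # Capture activations at the chosen depth with a forward hook.
        feats = {}
        handle = feature_layer.register_forward_hook(
            lambda module, inputs, output: feats.update(out=output.flatten(1)))

        probe, opt = None, None
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for images, labels in loader:
                with torch.no_grad():
                    frozen_net(images)       # fills feats["out"] via the hook
                x = feats["out"]
                if probe is None:            # size the linear readout from the features
                    probe = nn.Linear(x.shape[1], num_classes)
                    opt = torch.optim.SGD(probe.parameters(), lr=1e-2)
                loss = loss_fn(probe(x), labels)
                opt.zero_grad()
                loss.backward()
                opt.step()
        handle.remove()
        return probe

    # e.g., probe a mid-level block for identity (hypothetical names; KDEF
    # contains 70 posers, hence num_classes=70):
    # probe = train_linear_probe(expression_densenet,
    #                            expression_densenet.features.denseblock2,
    #                            kdef_loader, num_classes=70)

Repeating this with feature_layer set to successively deeper blocks yields the layer-by-layer accuracy curves the abstract reports for both emotion and identity.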
