August 2023, Volume 23, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract
Deep convolutional neural networks are sensitive to configural properties of faces
Author Affiliations & Notes
  • Virginia Strehle
    The University of Texas at Dallas
  • Natalie Bendiksen
    The University of Texas at Dallas
  • Alice O'Toole
    The University of Texas at Dallas
  • Footnotes
    Acknowledgements: Funding provided by National Eye Institute Grant R01EY029692-04 to AOT and CDC.
Journal of Vision August 2023, Vol.23, 5560. doi:https://doi.org/10.1167/jov.23.9.5560
Abstract

Deep convolutional neural networks (DCNNs) have reached human-level accuracy in face recognition (Phillips et al., 2018), but less is known about whether these networks represent faces in ways comparable to humans. People represent configural properties of faces more saliently than feature-based properties (e.g., Maurer et al., 2002). Although DCNNs may not be sensitive to configural properties of objects (Geirhos et al., 2018; Baker & Elder, 2022), this has not been demonstrated for faces. To test this, we compared the similarity of DCNN-generated representations of faces in which configural or feature-based information was altered. This method has been used in the psychological literature to illustrate the influence of configural and feature-based face processing (Tanaka & Sengco, 1997; Freire et al., 2000). Configural manipulations (changing the distance between the eyes, or the distance between the mouth and nose) and feature-based manipulations (replacing the eyes or mouth with another face's eyes/mouth) were implemented in 48 faces, yielding 384 manipulated face images. Using Inception-ResNet-v1 trained with VGGFace2 (Szegedy et al., 2017) and a high-performing DCNN trained for face identification (All-in-One, Ranjan et al., 2017), we measured the similarity of DCNN-generated representations by computing cosine similarity between inverse versions of each manipulation (e.g., a face with eye distance increased vs. decreased). Representations from both DCNNs were altered more by configural than by feature-based manipulations: Inception: configural M = 0.83, feature-based M = 0.93, p < .001, partial eta-squared = 0.37; All-in-One: configural M = 0.85, feature-based M = 0.94, p < .001, partial eta-squared = 0.46. In contrast to object-recognition DCNNs, face DCNNs represent facial configuration, a finding consistent with the psychological literature. We speculate that the within-category nature of face-identity training, versus the category-based training of object DCNNs, may account for this difference in whether configural information is represented.
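
The comparison described above amounts to taking the cosine similarity between a network's embeddings of the two inverse versions of a manipulation. Below is a minimal Python sketch of that computation, not the authors' code: the embedding-extraction step is assumed (random placeholder vectors stand in for a hypothetical get_embedding(image) helper applied to a face-identification DCNN such as Inception-ResNet-v1 trained on VGGFace2), and only the similarity comparison is shown.

```python
# Minimal sketch of the similarity measure described in the abstract (not the authors' code).
# Embeddings are random placeholders standing in for a hypothetical get_embedding(image)
# helper that would extract a face-identification DCNN's representation of each image.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)

# Inverse versions of a configural manipulation on one face
# (e.g., inter-eye distance increased vs. decreased).
emb_eyes_wider = rng.standard_normal(512)
emb_eyes_closer = rng.standard_normal(512)

# Inverse versions of a feature-based manipulation on the same face
# (e.g., eyes swapped in from two different donor faces).
emb_eyes_donor_a = rng.standard_normal(512)
emb_eyes_donor_b = rng.standard_normal(512)

sim_configural = cosine_similarity(emb_eyes_wider, emb_eyes_closer)
sim_feature = cosine_similarity(emb_eyes_donor_a, emb_eyes_donor_b)

# Lower similarity for the configural pair than for the feature-based pair would
# indicate that the representation is altered more by configural changes.
print(f"configural pair:    {sim_configural:.3f}")
print(f"feature-based pair: {sim_feature:.3f}")
```

In the study, this pairwise similarity would be computed for each of the 48 faces and each manipulation type, and the resulting values compared across the configural and feature-based conditions.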
