September 2024
Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2024
Training deep learning algorithms for face recognition with large datasets improves performance but reduces similarity to human representations
Author Affiliations & Notes
  • Nitzan Guy
    Tel Aviv University
  • Mandy Rosemblaum
    Tel Aviv University
  • Adva Shoham
    Tel Aviv University
  • Galit Yovel
    Tel Aviv University
  • Footnotes
    Acknowledgements  ISF 917/21
Journal of Vision September 2024, Vol.24, 1404. doi:https://doi.org/10.1167/jov.24.10.1404

      Nitzan Guy, Mandy Rosemblaum, Adva Shoham, Galit Yovel; Training deep learning algorithms for face recognition with large datasets improves performance but reduces similarity to human representations. Journal of Vision 2024;24(10):1404. https://doi.org/10.1167/jov.24.10.1404.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

The perceptual representation of facial identity is influenced by factors such as familiarity and experience variability. Yet the impact of overall experience with faces (specifically, the number of identities, the amount of exposure to each identity, and the variability of head pose) on the nature of face representations remains unknown. In this work, we used face-trained deep neural networks (DNNs) to answer these questions by manipulating the number of identities, the exposure to each identity, and the variation in head pose during model training. Model quality was evaluated by assessing accuracy on a standard face benchmark (the Labeled Faces in the Wild dataset), testing for human-like face effects (e.g., the inversion effect), and examining the correlation between model and human similarity representations. Our findings reveal that the number of identities and the number of images per identity significantly influenced model performance. Intriguingly, while increased experience improved accuracy, correlations with human representations were higher for models trained with limited experience (e.g., models trained on only 500 identities and 300 images per identity) than for models with extensive experience (e.g., CLIP and VGG16 trained on the VGGFace2 dataset, which includes more than 8,000 identities and hundreds of images per identity). With respect to head pose, we limited training to poses that varied between frontal and 20 degrees (frontal-only model) and compared it to a model trained on poses of 25-45 degrees (three-quarter-only model). Whereas both models reached similar performance levels, the frontal model was more similar to human representations than the three-quarter model. Taken together, our findings show that similarity between face-trained DNNs and human representations does not correspond to model performance, and may not require the extensive training on large face datasets that is commonly used for deep learning models.
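The correlation between model and human similarity representations described above is typically computed via representational similarity analysis: pairwise similarities between face embeddings are compared against human pairwise similarity ratings. A minimal sketch, assuming cosine similarity over DNN embeddings; the function and variable names (`rsa_correlation`, `model_embeddings`, `human_sim`) are illustrative and not taken from the abstract:

```python
import numpy as np

def rsa_correlation(model_embeddings, human_sim):
    """Correlate a model's pairwise face similarities with human ratings.

    model_embeddings: (n_faces, d) array of DNN face embeddings.
    human_sim: (n_faces, n_faces) symmetric matrix of human similarity ratings.
    Returns the Pearson correlation over unique off-diagonal face pairs.
    """
    # Cosine similarity between every pair of embeddings.
    norm = model_embeddings / np.linalg.norm(model_embeddings, axis=1, keepdims=True)
    model_sim = norm @ norm.T
    # Compare only the unique (upper-triangular, off-diagonal) pairs.
    iu = np.triu_indices(human_sim.shape[0], k=1)
    return np.corrcoef(model_sim[iu], human_sim[iu])[0, 1]
```

A rank-based correlation (e.g., `scipy.stats.spearmanr`) is also common in representational similarity analysis when human ratings are ordinal.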
