August 2023
Volume 23, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract
Comparing Humans and Deep Neural Networks on face recognition under various distance and rotation viewing conditions
Author Affiliations & Notes
  • Michal Fux
    MIT
  • Suayb S Arslan
    MIT
  • Hojin Jang
    MIT
  • Xavier Boix
    MIT
  • Avi Cooper
    MIT
  • Matt J Groth
    MIT
  • Pawan Sinha
    MIT
  • Footnotes
Acknowledgements  This research is supported by ODNI and IARPA. The views are those of the authors and should not be interpreted as representing the official policies of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.
Journal of Vision August 2023, Vol. 23, 5916. doi: https://doi.org/10.1167/jov.23.9.5916
Abstract

Humans possess impressive skills for recognizing faces even under challenging viewing conditions, such as long ranges, non-frontal regard, variable lighting, and atmospheric turbulence. We sought to characterize the effects of such viewing conditions on human face recognition performance and to compare the results with those of deep neural networks (DNNs). In an online verification-task study, we used a 100-identity face database with images captured at five distances (2 m, 5 m, 300 m, 650 m, and 1000 m), three pitch values (0°, i.e., straight ahead, and ±30°), and three levels of yaw (0°, 45°, and 90°). Participants were presented with 175 trials (5 distances × 7 yaw-pitch combinations × 5 repetitions). Each trial included a query image from a particular combination of range, yaw, and pitch, along with five options, all frontal short-range (2 m) faces. One option was of the same identity as the query; the rest were the most similar identities, chosen according to a DNN-derived similarity matrix. Participants ranked the top three options most similar to the query image. The collected data reveal the functional relationship between human performance and multiple viewing parameters. Nine state-of-the-art pre-trained DNNs were tested for their face recognition performance on precisely the same stimulus set. Strikingly, DNN performance was significantly diminished by variations in range and by rotated viewpoints. Even the best-performing network achieved below 65% accuracy at the closest distance with a profile view of faces, with results dropping to near chance at longer ranges. The confusion matrices of the DNNs were largely consistent across networks, indicating systematic errors induced by the viewing parameters. Taken together, these data not only help characterize human performance as a function of key, ecologically important viewing parameters, but also enable a direct comparison of humans and DNNs in this parameter regime.
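As a rough illustration of the distractor-selection step described above (not the study's actual pipeline), the sketch below picks the most similar foil identities from a similarity matrix derived from face embeddings. The embedding source, dimensionality, and function names are assumptions for illustration; any pretrained face-recognition DNN producing L2-normalized identity embeddings would fit this pattern.

import numpy as np

def pick_distractors(gallery: np.ndarray, query_id: int, n_foils: int = 4) -> np.ndarray:
    """Return indices of the n_foils identities most similar to query_id.

    gallery: (n_identities, dim) array of L2-normalized embeddings of the
    frontal, short-range (2 m) gallery images (hypothetical setup).
    """
    sims = gallery @ gallery[query_id]        # cosine similarity to the query identity
    sims[query_id] = -np.inf                  # exclude the query identity itself
    return np.argsort(sims)[::-1][:n_foils]   # most similar identities first

# Example with 100 identities (matching the database size reported above)
# and an assumed 512-dimensional embedding:
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 512))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
foils = pick_distractors(emb, query_id=7)
print(foils)  # four foil identities; together with identity 7, one trial's five options

Selecting foils this way makes the task hard in proportion to the embedding model's own confusions, which is presumably why a DNN-derived similarity matrix was used to construct the five-alternative trials.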
