Abstract
Human face perception, one of our seemingly most automatic cognitive abilities, has long been of interest to computer vision scientists. Most comparisons of humans and computer vision algorithms to date have focused on accuracy; few have examined the mechanisms underlying this ability. The current study explores the relationship between human judgments of face similarity and algorithmic estimates of perceptual similarity.
Participants (N = 141, mean age = 27.2, 73 female) completed a face-similarity judgment task (same and different pairs presented in randomized order) as well as a battery of standard face perception tasks. They were asked to rate the similarity of pairs of faces from 0 (very dissimilar) to 100 (very similar) and to indicate whether the pair showed the same person or different people. We observed a medium-sized correlation between similarity judgments given by humans and algorithms (r = 0.44, p < 0.01). More interestingly, participants' deviation from algorithmic similarity was a significant predictor of their performance on a variety of other traditional face-recognition measures (F(2, 138) = 2.682, p < 0.05, R2 = 0.063). Furthermore, performance was significantly predicted only by the deviation of similarity judgments from algorithmically derived similarity (r = -0.22, p < 0.05), not by the overall variability of individuals' responses (r = -0.12, p = 0.20) or their response times (r = 0.05, p = 0.59).
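The deviation measure above can be illustrated with a minimal sketch. This is not the authors' analysis code: the variable names (`human_ratings`, `algo_similarity`) are hypothetical, the data are simulated, and the deviation is computed here as each participant's mean absolute difference from the algorithm's 0–100 similarity scores, one plausible operationalization under these assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_participants, n_pairs = 141, 50  # sample size from the study; pair count is illustrative

# Hypothetical algorithm-derived similarity (0-100) for each face pair
algo_similarity = rng.uniform(0, 100, n_pairs)

# Simulated human ratings: algorithm scores plus participant noise, clipped to the scale
noise = rng.normal(0, 15, (n_participants, n_pairs))
human_ratings = np.clip(algo_similarity + noise, 0, 100)

# Per-participant deviation from algorithmic similarity:
# mean absolute difference across all rated pairs
deviation = np.abs(human_ratings - algo_similarity).mean(axis=1)

# Per-participant correlation with the algorithm's scores
# (a group-level analogue of the reported r = 0.44)
r_per_participant = np.array([
    np.corrcoef(human_ratings[i], algo_similarity)[0, 1]
    for i in range(n_participants)
])
```

A participant's `deviation` score could then be entered as a predictor of their scores on the standard face tests, mirroring the regression reported above.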
This finding suggests that the ability to objectively assess the similarity of two faces may be a crucial underpinning of the face-recognition mechanisms captured by a variety of established face tests. Furthermore, similarities derived from deep neural network algorithms seemingly capture an important aspect of face similarity that humans rely on across a variety of tasks. We discuss potential implications for future research directions, particularly for explaining atypical mechanisms of high-level perception in autism or developmental prosopagnosia.