Abstract
Certain face images are considered a better “likeness” of an identity than others. However, it remains unclear whether perceived likeness is driven by: a) the proximity of a face image to an averaged prototype comprising all instances of an identity (i.e., a “central prototype”), or b) how closely a face image approximates the most common appearances of the identity (i.e., the “density” of known instances). Convolutional neural networks (CNNs) trained for face recognition can be used to instantiate face spaces that model both within- and between-identity variability. These networks provide a testbed for investigating human perceptions of face likeness. We compared human-assigned likeness ratings with likeness ratings based on identity prototypes and local image density from a face-identification CNN. Participants (n=50) viewed 20 face images simultaneously (5 viewpoints × 4 illumination conditions) of each of 72 identities and adjusted a slider bar to indicate whether each image was a “good likeness” of the identity being shown. Responses were collapsed across participants to generate a single likeness rating per image. Next, we used face-image descriptors from a CNN to generate likeness ratings based on either the proximity of a descriptor to the “prototype” created by averaging all corresponding same-identity descriptors (prototype-proximity likeness) or the density of same-identity descriptors located around a given descriptor in the output space generated by the CNN (density likeness). For all three likeness measures (human-assigned ratings, prototype proximity, and density), likeness differed across viewpoint (p < 0.0001) and illumination (p < 0.001). However, only the CNN density-based likeness ratings mirrored the pattern of human likeness ratings. These results demonstrate that density within an identity-specific face space is a better model of human-assigned perceived-likeness ratings than distance to an identity prototype. In addition, these results show that viewpoint and illumination influence perceived-likeness ratings for face images.
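
To make the two CNN-based measures concrete, the following is a minimal illustrative sketch, not the procedure used in the study. It assumes cosine similarity as the proximity measure and estimates density as the mean similarity to the k nearest same-identity descriptors; the abstract specifies neither choice, and the function names, the parameter k, and the toy 2-D descriptors are all hypothetical.

```python
import numpy as np

def _unit_rows(descriptors):
    # L2-normalize each descriptor so that dot products equal cosine similarities.
    return descriptors / np.linalg.norm(descriptors, axis=1, keepdims=True)

def prototype_proximity(descriptors):
    """Cosine similarity of each descriptor to the averaged same-identity prototype.

    Assumption: the prototype is the mean of the unit-normalized descriptors.
    """
    unit = _unit_rows(descriptors)
    prototype = unit.mean(axis=0)
    prototype /= np.linalg.norm(prototype)
    return unit @ prototype

def density_likeness(descriptors, k=3):
    """Mean cosine similarity of each descriptor to its k nearest same-identity neighbors.

    Assumption: this k-nearest-neighbor average stands in for local density.
    """
    unit = _unit_rows(descriptors)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)          # exclude each descriptor's similarity to itself
    nearest = np.sort(sims, axis=1)[:, -k:]  # k largest similarities per row
    return nearest.mean(axis=1)

# Toy identity: two tight clusters of descriptors plus one isolated descriptor
# (index 10) that lies midway between the clusters.
angles = np.deg2rad([-31, -30.5, -30, -29.5, -29, 29, 29.5, 30, 30.5, 31, 0])
descriptors = np.column_stack([np.cos(angles), np.sin(angles)])

proto = prototype_proximity(descriptors)
dens = density_likeness(descriptors, k=3)
```

In this toy case the isolated descriptor is the closest of all to the prototype but has the lowest density, which shows how the two measures can disagree about which image is the best likeness.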