Abstract
The human perceptual system can make complex inferences on faces, ranging from the objective evaluations regarding gender, ethnicity, expression, age, identity, etc. to subjective judgments on facial attractiveness, trustworthiness, sociability, friendliness, etc. Whereas the objective aspects have been extensively studied, less attention has been paid to modeling the subjective perception of faces. Here, we adapt 6 state-of-the-art neural networks pretrained on various image tasks (object classification, face identification, face localization) to predict human ratings on 40 social judgments of faces in the 10k US Adult Face Database. Supervised ridge regression on PCA of the conv5 2 layer in VGG-16 network gives best predictions on the average human ratings. Human group agreement was evaluated by repeatedly randomly splitting the raters into two halves for each face, and calculating the Pearson correlation between the two sets of averaged ratings. Due to this methodology, the models correlations with the average human ratings can exceed this score. We find that 1) model performance grows as the consensus on a face trait increases, and 2) model correlations are always higher than human correlations with each other. These results illustrate the learnability of the subjective perception of faces, especially when there is consensus, and the striking versatility and transferability of representations learned for object recognition. This work has strong applications to social robotics, allowing robots to infer human judgments of each other.
Meeting abstract presented at VSS 2017