Linjie Li, Amanda Song, Vicente Malave, Garrison Cottrell, Angela Yu; Extracting Human Face Similarity Judgments: Pairs or Triplets?. Journal of Vision 2016;16(12):719. doi: 10.1167/16.12.719.
© ARVO (1962-2015); The Authors (2016-present)
Understanding how humans assess facial similarity is an important problem for both machine learning applications and cognitive science. Two classical techniques for extracting human similarity judgments are in broad use: pairwise rating and triplet ranking. While the choice of method has obvious algorithmic consequences, there has been little explicit comparison of the informational utility of the two. Here, we present face similarity judgment data, in both pairwise and triplet forms, collected on Amazon Mechanical Turk. We demonstrate that triplet data are more informative about both individual judgments and heterogeneity among individuals. In the experiment, we present seven faces to each subject in both formats, for a total of 35 triplets and 21 pairs, repeated 4 times. We use the 10k US Adult Faces database provided by Aude Oliva's group at MIT. We convert the pairwise data into equivalent triplet data, and then use identical measures to compare the two. We first compute the cross-correlation over repeated responses for each triplet, and find that self-consistency is higher for triplet data. Moreover, cross-correlation of responses across subjects for the same triplets suggests distinct subgroups of individuals, and this multi-cluster pattern is less evident in the equivalent pairwise data. We further propose a statistical model to quantify the information gain of both methods, and our results suggest that triplets indeed provide more information than pairs. Overall, triplet ranking is more informative than pairwise rating for eliciting facial similarity judgments from humans. It has often been observed that humans give more self-consistent responses when reporting relative preferences than when assigning numeric values to individual items, especially in complex judgments involving high-dimensional inputs.
Apparently, forcing humans to assign numerical values to complex judgments can not only make them appear less consistent but can also corrupt the information available in simpler relative ranking responses.
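The abstract does not say how pairwise ratings were converted into equivalent triplet data, so the following is only a minimal sketch under stated assumptions. It assumes that a triplet response names the most similar pair among three faces, that higher ratings mean greater similarity, and that a tie in the ratings yields no response. The function name `pairs_to_triplets` and the toy ratings are hypothetical and are not taken from the study.

```python
from itertools import combinations


def pairs_to_triplets(ratings, n_items):
    """Derive one triplet response per unordered triple from pairwise ratings.

    ratings: dict mapping frozenset({i, j}) -> similarity rating
             (assumption: higher means more similar).
    Returns a dict mapping each triple (i, j, k) to the pair rated most
    similar within it, or None when the top rating is tied.
    """
    triplets = {}
    for triple in combinations(range(n_items), 3):
        pairs = [frozenset(p) for p in combinations(triple, 2)]
        scores = [ratings[p] for p in pairs]
        best = max(scores)
        winners = [p for p, s in zip(pairs, scores) if s == best]
        # Assumption: a tie carries no ranking information, so record None.
        triplets[triple] = tuple(sorted(winners[0])) if len(winners) == 1 else None
    return triplets


# Toy ratings for seven items (made-up values, not data from the study).
n_faces = 7
ratings = {
    frozenset(p): 10 - abs(p[0] - p[1])
    for p in combinations(range(n_faces), 2)
}
triplets = pairs_to_triplets(ratings, n_faces)

print(len(ratings))   # 21 pairs from seven faces
print(len(triplets))  # 35 triplets from seven faces
```

The counts match the design described above: seven faces give 21 unordered pairs and 35 unordered triples. Once both formats are expressed as triplet responses in this way, the same consistency measure can be applied to each.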
Meeting abstract presented at VSS 2016