Abstract
Given two quantitative models for the perceptual discriminability of stimuli that differ in some attribute, how can we determine which model is better? A direct method is to compare the model predictions with subjective evaluations over a large number of pre-selected examples from the stimulus space, choosing the model that best accounts for the subjective data. Not only is this a time-consuming and expensive endeavor, but for stimulus spaces of very high dimensionality (e.g., the pixels of visual images), it is impossible to make enough measurements to adequately cover the space (a problem commonly known as the “curse of dimensionality”).
Here we describe a methodology, Maximum Differentiation (MAD) Competition, for efficient comparison of two such models. Instead of being pre-selected, the stimuli are synthesized to optimally distinguish the models. We first synthesize a pair of stimuli, one maximizing and one minimizing the response of one model, while holding the response of the other model fixed. We then repeat this procedure with the roles of the two models reversed. Subjective testing on pairs of such synthesized stimuli provides a strong indication of the relative strengths and weaknesses of the two models. Specifically, if a pair of stimuli synthesized with one model held fixed and the other maximized/minimized is judged to be very different in subjective discriminability, then the fixed model must be failing to capture some important aspect of discriminability that is captured by the other model. Careful study of the synthesized stimuli may, in turn, suggest potential ways to improve a model or to combine aspects of multiple models.
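As a concrete illustration of this procedure, the following is a minimal numerical sketch in Python/NumPy. It assumes both models are differentiable scalar functions of the image with gradients supplied by the caller; the function names, the projected-gradient update with a first-order level-set correction, and all parameter values are our own illustrative choices, not a specification from the paper (whose actual constrained optimizer is described in its body).

```python
import numpy as np

def synthesize_mad_pair(x0, f_fixed, f_varied, grad_fixed, grad_varied,
                        step=0.1, n_iters=200):
    """Synthesize two stimuli from initial image x0: one maximizing and one
    minimizing f_varied, while holding f_fixed near f_fixed(x0). Repeating
    with the two models' roles swapped yields the full MAD stimulus set."""
    target = f_fixed(x0)                        # level set to stay on
    results = []
    for sign in (+1.0, -1.0):                   # +1: ascend, -1: descend
        x = x0.copy()
        for _ in range(n_iters):
            gv = grad_varied(x).ravel()
            gf = grad_fixed(x).ravel()
            # Project out the component along grad f_fixed, so the step
            # leaves f_fixed unchanged to first order.
            gp = gv - (gv @ gf) / (gf @ gf + 1e-12) * gf
            x = x + sign * step * gp.reshape(x.shape)
            # First-order corrective step back onto the f_fixed level set.
            gf = grad_fixed(x).ravel()
            err = f_fixed(x) - target
            x = x - (err / (gf @ gf + 1e-12)) * gf.reshape(x.shape)
        results.append(x)
    return results                              # [maximizer, minimizer]

# Toy usage (our own example): hold mean squared error fixed while driving
# mean absolute error up and down.
ref = np.zeros((8, 8))
x0 = ref + 0.5 * np.random.randn(8, 8)
mse  = lambda x: np.mean((x - ref) ** 2)
dmse = lambda x: 2.0 * (x - ref) / x.size
mae  = lambda x: np.mean(np.abs(x - ref))
dmae = lambda x: np.sign(x - ref) / x.size
maximizer, minimizer = synthesize_mad_pair(x0, mse, mae, dmse, dmae)
```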
To demonstrate the idea, we apply the methodology to several perceptual image quality measures. A constrained gradient ascent/descent algorithm is used to search for the optimal stimuli in the space of all images. We also demonstrate how these synthesized stimuli lead us to improve an existing model: the structural similarity (SSIM) index [Wang, Bovik, Sheikh & Simoncelli, IEEE Trans. Image Processing 13(4), 2004].
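For reference, the SSIM index cited above can be sketched as follows. This single-window version computes global image statistics; the published index instead evaluates the same expression over a sliding local window and averages the resulting map, so the simplification here is ours. The constants K1 = 0.01 and K2 = 0.03, with C1 = (K1 L)^2 and C2 = (K2 L)^2, follow the cited paper.

```python
import numpy as np

def ssim_global(x, y, L=255.0, K1=0.01, K2=0.03):
    """Single-window SSIM between images x and y with dynamic range L.
    (The published index applies this locally and averages the map.)"""
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```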