Abstract
Perceptual properties in the world vary wildly. Despite this variation, people consistently make linguistic decisions to describe perceptual properties, generalizing across many possible values and contexts. We propose a model of color and the ways people describe it. We used Randall Munroe's crowdsourced corpus of human color judgments to model grounded representations of color labels. Participants were presented with a uniformly sampled color patch and allowed to label it freely. After controlling for factors such as nonsense and spam labels, the corpus consists of 100,000 participants, 829 color labels, and 1.6 million pairs of color values and labels. In keeping with semantic accounts of color vagueness (Williamson, 1996; Barker, 2002), we treat a color judgment as a function of the label's possible boundaries. In other words, there is a boundary that distinguishes subjectively true instances of a color label, but where that boundary lies in a given context is uncertain. For example, it is uncertain where green ends and blue begins. However, once a blue-green color is labeled green, color values greener than it are definitely green. Further, we model the uncertainty in label meaning as a problem of Bayesian inference, combining the prior expectation of a color label with how subjectively true the label is of a color patch. We support the model by comparing it against two alternative models that make different assumptions about the underlying color representation. The first alternative assumes that each label maintains a Gaussian distribution over possible color values. The second assumes that labels are memorized in a histogram over the color value space. Our model outperformed both alternatives. In conclusion, our model captures how people label color and, further, offers a way to ground representations of linguistic meaning in the perceptual domain.
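The boundary-based account described above can be illustrated with a minimal Python sketch. It is not the model reported in the abstract. It assumes a single hue axis in degrees, two independent uncertain boundaries per label, and a logistic distribution over each boundary's location. The function names (`boundary_cdf`, `applicability`, `label_posterior`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def boundary_cdf(x, loc, scale):
    """P(boundary <= x) for a boundary with logistic uncertainty (an assumption).

    Written with tanh so that extreme arguments do not overflow.
    """
    return 0.5 * (1.0 + math.tanh((x - loc) / (2.0 * scale)))


def applicability(x, lower, upper):
    """Probability that the label is subjectively true of color value x.

    x must lie above the uncertain lower boundary and below the uncertain
    upper boundary; the two boundaries are assumed to be independent.
    """
    p_above_lower = boundary_cdf(x, *lower)
    p_below_upper = 1.0 - boundary_cdf(x, *upper)
    return p_above_lower * p_below_upper


def label_posterior(x, labels):
    """Combine each label's prior with its applicability to x, then normalize."""
    scores = {
        name: spec["prior"] * applicability(x, spec["lower"], spec["upper"])
        for name, spec in labels.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        # No label applies at all; return zeros rather than dividing by zero.
        return {name: 0.0 for name in labels}
    return {name: score / total for name, score in scores.items()}


# Made-up parameters: each boundary is a (location, scale) pair in hue degrees.
LABELS = {
    "green": {"prior": 0.5, "lower": (80.0, 10.0), "upper": (170.0, 15.0)},
    "blue": {"prior": 0.5, "lower": (180.0, 15.0), "upper": (260.0, 10.0)},
}

if __name__ == "__main__":
    for hue in (120.0, 175.0, 220.0):
        print(hue, label_posterior(hue, LABELS))
```

In this sketch, a hue near the middle of the hypothetical green range is labeled green with high probability, while a hue between the two ranges is split roughly evenly between green and blue, which mirrors the uncertainty about where green ends and blue begins.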
Meeting abstract presented at VSS 2014