The colors observers estimated for the dress stripes and the illumination precisely reflect the particular properties of the photo.
Figure 6a,b illustrates the average adjustments for the dress body (blue dots) and the dress lace (brown dots) by simultaneous matching and by memory, respectively. When the adjustments of the body and the lace are pooled, the first principal component (red line) explained 91% and 89% of the variance of the adjustments, respectively. The main variation captured by the principal component closely follows the variation of daylight (colored curve in Figure 6a). Consequently, the variation of perceived dress colors across observers has the same pattern as the color distribution in the photo. This is in line with previous findings (Gegenfurtner et al., 2015; Lafer-Sousa et al., 2015; Winkler et al., 2015).
However, the alignment of the principal component with the daylight locus is a necessary implication of the fact that the colors of the body and the lace differ along the daylight locus. What is important for the individual differences in perception is whether the observers' variation in perception of each of the stripes may be represented by one principal component that varies along the daylight locus. In other words, the question is: Do the adjustments of the blue stripes covary with the adjustments of the brown stripes? To answer this, we need to consider the adjustments of each of the stripes separately and independently of each other. So, we calculated a six-dimensional principal component decomposition, that is, with three dimensions (L*, u*, v*) for each stripe (body and lace). The first principal component still explained 62% of the variance for both the simultaneous matches and the adjustments by memory. These principal components loaded on all six dimensions of the matchings (loadings for L*, u*, and v* of lace: [0.25, 0.22, 0.33]; body: [0.15, 0.15, 0.86]) and of the adjustments by memory (lace: [0.31, 0.25, 0.55]; body: [0.18, 0.16, 0.69]). This shows that body and lace adjustments covary with 62% common variance along a single dimension in that six-dimensional space.
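The six-dimensional decomposition described above can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: the data are synthetic, the column order (body L*, u*, v*, then lace L*, u*, v*) and the generating weights are assumptions, and the leading component is found by power iteration only to keep the example free of dependencies.

```python
import math
import random


def covariance_matrix(rows):
    """Unbiased sample covariance of observations (rows) by dimensions (columns)."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_principal_component(cov, iterations=500):
    """Leading eigenvector of cov by power iteration, plus its share of variance.

    Assumes the uniform start vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue; the trace is the total variance.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


# Synthetic observers (NOT the study's data): one shared shift per observer
# moves all six coordinates together, plus independent noise per coordinate.
rng = random.Random(0)
assumed_weights = [0.2, 0.2, 0.8, 0.3, 0.2, 0.4]  # hypothetical generating weights
rows = []
for _ in range(32):
    shift = rng.gauss(0, 10)
    rows.append([shift * w + rng.gauss(0, 2) for w in assumed_weights])

loadings, explained = first_principal_component(covariance_matrix(rows))
print("loadings:", [round(x, 2) for x in loadings])
print("share of variance explained:", round(explained, 2))
```

If body and lace adjustments did not covary, no single component could load on both triplets of coordinates while explaining a large share of the total variance.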
This six-dimensional principal component is illustrated by the thin black lines in Figure 6a,b. These lines are actually two parts of the same principal component. Because the adjustments of both the lace and the body are done in the same color space (CIELUV), we can plot the second and third dimensions (u* and v* of the body) and the fifth and sixth dimensions (u* and v* of the lace) in one graph, resulting in the two black lines of the six-dimensional principal component. These two black lines nicely follow the curvature of the daylight locus. The location of colors along the daylight locus is captured by the correlated color temperature: A high correlated color temperature implies colors toward the blue (“lower”) end and a low correlated color temperature toward the yellow end of the daylight locus. Hence, the higher the correlated color temperature of the body adjustments, the higher the correlated color temperature of the lace adjustments. In other words, the observers' perception of the whole dress shifts along the daylight locus.
Further analyses of the simultaneous matches clarify the common pattern of adjustments that is represented by that first principal component. The lightness adjustments (L* axis) of the (blue/white) dress body were strongly correlated with all dimensions of the adjustments of the (black/gold) lace, r(30) = 0.62, p = 0.0002; r(30) = 0.54, p = 0.001; r(30) = 0.66, p < 0.0001. Moreover, the yellow-blue adjustments (v* axis) of the dress body were strongly correlated with the lightness adjustment (L*) of the lace, r(30) = 0.62, p = 0.0002. The important role of lightness in the estimation of the dress colors is also in line with previous findings (Gegenfurtner et al., 2015).
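As a reading aid for the notation r(30), the sketch below converts a Pearson correlation with 30 degrees of freedom into a two-tailed p value. It assumes the conventional t test for a correlation coefficient, which the text does not state explicitly. The function names are illustrative, and the tail probability is obtained by numerically integrating the t density with Simpson's rule to avoid external libraries.

```python
import math


def t_statistic(r, df):
    """t value for a Pearson correlation r with df degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: 1 minus the probability mass between -|t| and |t|.

    The mass on [0, |t|] is integrated with Simpson's rule (steps must be even)
    and doubled, because the density is symmetric around zero.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2.0 * total * h / 3.0
    return max(0.0, 1.0 - central_mass)


# Correlations reported in the text, each with 30 degrees of freedom.
p_values = {r: two_tailed_p(t_statistic(r, 30), 30) for r in (0.62, 0.54, 0.66)}
for r, p in p_values.items():
    print(f"r(30) = {r:.2f}, t = {t_statistic(r, 30):.2f}, p = {p:.5f}")
```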
Figure 6c illustrates the average adjustments of the illumination that observers believed to reach the dress (black dots). The principal component of these adjustments (red line) explained 97% of the variance and closely followed the daylight locus (colored curve). This result clearly shows that the observers assumed a color of the illumination in this photo somewhere along the daylight locus. Consequently, the interobserver variability of the estimated colors of the dress (see above) and of its illumination directly reflects the variation of color distributions (Figure 1b) that are assumed to be the source of the ambiguity of the photo.
Figure 5c,e illustrates the relative frequencies of the illumination naming. The illumination that reached the dress was called “white” most frequently (34%), closely followed by “blue” (31%) and then “yellow” (23%). In some cases, it was also called “gray” (10%) and rarely “purple” (3%). Observers did not use any other color term to describe the color of the illumination in this task. The color terms used closely reflect the variation of color adjustments along the daylight locus, too.
Based on these observations, we recoded the naming data into a dress score that is more useful for the main analyses below. The dress score is a quasi-metric index of color naming that indicates whether observers' color naming was closer to black-blue or to white-yellow on the blue/dark versus white/light dimension, along which the perception of the dress varies (cf. Figure 5d). Hence, white-gold and blue-black were considered as the two extrema, and these combinations were coded as 2 (1 for white and 1 for gold) and −2 (−1 for blue and −1 for black). Because “gold” corresponds to 1 and “blue” to −1, the combination “blue-gold” results in a value of 0. Because purple includes bluishness, it was coded as −0.5, and because bronze shares similarity with gold, it was coded as +0.5. Similarly, because brown is relatively dark but not yet black, it was given a value of −0.5. As a result, “blue-bronze,” “purple-brown,” and “blue-brown” corresponded to values of −0.5, −1, and −1.5 (cf. Figure 5d).
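The coding scheme can be written out as a small lookup, shown here as an illustrative sketch rather than the scoring script used in the study. The dictionary and function names are hypothetical. The value of −0.5 for “purple” is the one implied by the worked example “purple-brown” = −1.

```python
# Codes for the term naming the dress body and the term naming the lace.
BODY_CODES = {"white": 1.0, "blue": -1.0, "purple": -0.5}
LACE_CODES = {"gold": 1.0, "black": -1.0, "bronze": 0.5, "brown": -0.5}


def dress_score(body_term, lace_term):
    """Sum of the body code and the lace code; ranges from -2 to +2."""
    return BODY_CODES[body_term] + LACE_CODES[lace_term]


for body, lace in [("white", "gold"), ("blue", "gold"), ("blue", "bronze"),
                   ("purple", "brown"), ("blue", "brown"), ("blue", "black")]:
    print(f"{body}-{lace}: {dress_score(body, lace):+.1f}")
```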
A similar approach was used to convert the illumination naming data into illumination scores (cf. Figure 5e). Illumination scores varied between −1 for blue and +1 for yellow. White was coded as +0.5 because it refers to bright light but not exactly to the bright yellow light in the background. Gray and purple were coded as −0.5 because they were in line with the idea that the dress is in a shadow or a bluish illumination. All intermediate values result from averaging the above values (−1, −0.5, 0.5, and 1) across the two repeated measurements.
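To make the resulting scale concrete, the sketch below enumerates every illumination score that can arise from averaging two single-response codes. It relies only on the four code values listed above and is independent of which color term carries which code. The names are illustrative.

```python
from itertools import combinations_with_replacement

# The four codes a single naming response can take, as listed in the text.
SINGLE_RESPONSE_CODES = (-1.0, -0.5, 0.5, 1.0)


def illumination_score(first_code, second_code):
    """Average of the codes from the two repeated measurements."""
    return (first_code + second_code) / 2.0


# Every distinct score that two repeated responses can produce.
attainable_scores = sorted({
    illumination_score(a, b)
    for a, b in combinations_with_replacement(SINGLE_RESPONSE_CODES, 2)
})
print(attainable_scores)
```

A score of 0 therefore does not correspond to any single response; it arises only when the two measurements fall on opposite sides of the scale.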
Finally, all gray adjustments (disk and three versions of dress) varied along an axis approximately parallel to the daylight locus. In particular, we measured the chromaticity of the generic subjective gray point through gray disk adjustments. When observers adjusted the color of the disk (Supplementary Figure S1a) to gray, the adjustments varied mainly along a principal component (86% of variance) close to the daylight locus (Figure 6d). This is in line with previous measures of generic subjective gray points (Chauhan et al., 2014; Witzel et al., 2011).
The same pattern could be observed for the three versions of the dress: The first principal component of the adjustments of the dress in the context of the photo (Supplementary Figure S1b), the dress with uniform hue (Supplementary Figure S1b and S2), and the dress without background (Supplementary Figure S1c) explained 86%, 92%, and 84% of variance across observers, respectively (cf. panel a of Supplementary Figures S5–S7). These results reconfirm those shown for gray adjustments of objects in general (figure 6 in Witzel et al., 2011) and support previous speculations that the white point of the dress is uncertain in chromaticity along the daylight locus (Gegenfurtner et al., 2015; Lafer-Sousa et al., 2015).