December 2016
Volume 16, Issue 15
Open Access
Article  |   December 2016
Uncertainty of sensory signal explains variation of color constancy
Journal of Vision December 2016, Vol.16, 8. doi:10.1167/16.15.8

      Christoph Witzel, Carlijn van Alphen, Christoph Godau, J. Kevin O'Regan; Uncertainty of sensory signal explains variation of color constancy. Journal of Vision 2016;16(15):8. doi: 10.1167/16.15.8.

      © 2017 Association for Research in Vision and Ophthalmology.

Abstract

Color constancy is the ability to recognize the color of an object (or more generally of a surface) under different illuminations. Without color constancy, surface color as a perceptual attribute would not be meaningful in the visual environment, where illumination changes all the time. Nevertheless, it is not obvious how color constancy is possible in the light of metamer mismatching. Surfaces that produce exactly the same sensory color signal under one illumination (metamerism) may produce utterly different sensory signals under another illumination (metamer mismatching). Here we show that this phenomenon explains to a large extent the variation of color constancy across different colors. For this purpose, color constancy was measured for different colors in an asymmetric matching task with photorealistic images. Color constancy performance was strongly correlated with the size of metamer mismatch volumes, which describe the uncertainty of the sensory signal due to metamer mismatching for a given color. The higher the uncertainty of the sensory signal, the lower the observers' color constancy. At the same time, sensory singularities, color categories, and cone ratios did not affect color constancy. The present findings not only provide considerable insight into the determinants of color constancy but also show that metamer mismatch volumes must be taken into account when investigating color as a perceptual property of objects and surfaces.

Introduction
Color as a perceptual attribute of objects and surfaces requires color constancy. Human observers are able to judge whether a fruit is ripe merely based on its color, no matter whether the fruit is seen in bright sunlight, in shadow, under a canopy of trees, or under the artificial light of an incandescent bulb in a kitchen. This is possible despite the fact that in these different environments the light varies strongly not only in brightness but also in color: Bright sunlight is yellowish-white, shadow bluish, light under a canopy of trees greenish, and the light of an incandescent bulb reddish yellow. Color constancy is the ability to recognize the color of an object (or more generally of a surface) under different illuminations. Color constancy is a fundamental characteristic of human color vision since it makes it possible for human beings to recognize objects based on their color in a constantly changing visual environment (Hurlbert, 2007). 
On closer inspection, color constancy turns out to be a curious phenomenon from a scientific point of view. When light is reflected off a surface it combines spectral information from the illumination (i.e., the illuminant spectrum, or illuminant) and from the surface (i.e., the reflectance of the surface). Moreover, when this light reaches the human eye it is encoded by three types of photoreceptors whose sensitivities peak in different parts of the visible spectrum, the long- (L), middle- (M), and short-wavelength (S) cones. Color vision is based on the information contained in the excitation of these three receptor types, the LMS signal or, more generally, the sensory signal. However, this sensory signal does not contain the complete information about the spectra of the impinging light. In fact, the same sensory signal may be produced by lights with different spectra (Figure 1a, b). This phenomenon is called metamerism: Lights that produce the same sensory signal are metameric. In addition, the spectrum of the light impinging on the eye may be the result of different combinations of illuminations and reflectances. As a consequence, a sensory signal arising for a specific surface under a particular illumination may also be produced by a variety of other illuminants and reflectances. 
Figure 1
 
Metamer mismatching. Panel a shows the reflectances of two example surfaces that are metameric under illuminant 1. Metamerism means that these reflectances produce exactly the same color under illuminant 1 (Panel b). However, they result in different colors under illuminant 2 (Panel c); this is called metamer mismatching. For illustrations of the effects of metamer mismatching in real life and practical contexts see the section “The uncertainty of surface colors” of the Discussion.
However, when the illumination changes, these illuminants and reflectances will not all produce the same sensory signal anymore (Figure 1c). This phenomenon is called metamer mismatching (Wyszecki & Stiles, 1982): The set of reflectances that produced only one sensory signal under a given illuminant produces a set of sensory signals under another illuminant. This set of sensory signals is called the metamer mismatch volume (Logvinenko, Funt, & Godau, 2014). 
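The relation between metamerism and metamer mismatching described above can be made concrete with a small numerical sketch. Everything in it is an illustrative assumption rather than data from the study: the cone sensitivities are Gaussian stand-ins (not real cone fundamentals), the two illuminants are simple linear tilts, and the names `bump`, `lms`, `refl_a`, and `refl_b` are hypothetical. The sketch builds a second reflectance by adding a spectral perturbation that is invisible to the three cones under illuminant 1, and then evaluates both reflectances under illuminant 2.

```python
import numpy as np

# Wavelength samples in nm, 10-nm steps across the visible range.
wl = np.arange(400, 701, 10, dtype=float)

def bump(peak, width):
    """Gaussian bump used as a stand-in spectrum (not measured data)."""
    return np.exp(-0.5 * ((wl - peak) / width) ** 2)

# ASSUMPTION: toy L, M, S sensitivities, not real cone fundamentals.
cones = np.stack([bump(565, 45), bump(540, 40), bump(445, 25)])

# ASSUMPTION: toy illuminants tilted toward long or short wavelengths.
illum1 = 1.0 + 0.5 * (wl - 550) / 150
illum2 = 1.0 - 0.5 * (wl - 550) / 150

def lms(reflectance, illuminant):
    """Sensory signal: cone-weighted sum of illuminant times reflectance."""
    return cones @ (illuminant * reflectance)

# First surface: a smooth reflectance between 0.4 and 0.6.
refl_a = 0.4 + 0.2 * bump(580, 60)

# Build a perturbation the cones cannot detect under illuminant 1 by
# removing from a probe spectrum the part visible through `sensing1`.
sensing1 = cones * illum1                    # shape (3, n_wavelengths)
probe = cones[0]
black = probe - np.linalg.pinv(sensing1) @ (sensing1 @ probe)
black *= 0.1 / np.abs(black).max()           # keep refl_b within [0, 1]

# Second surface: metameric to the first under illuminant 1.
refl_b = refl_a + black

diff_under_1 = lms(refl_a, illum1) - lms(refl_b, illum1)
diff_under_2 = lms(refl_a, illum2) - lms(refl_b, illum2)
print("LMS difference under illuminant 1:", diff_under_1)
print("LMS difference under illuminant 2:", diff_under_2)
```

Under these assumptions the first difference vanishes up to rounding error (metamerism), whereas the second does not (metamer mismatching). Repeating the construction with many different invisible perturbations would trace out a sample of the metamer mismatch volume for this toy observer.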
Metamer mismatch volumes play an important role in color constancy when we consider that in general an observer only receives LMS values and so cannot know what the reflectance or the illuminant is. Without knowledge about reflectance and illuminants, it is uncertain how the sensory signal changes from one illumination to another. Consequently, it is not obvious how color constancy is possible given the problem of metamer mismatching (Logvinenko, Funt, Mirzaei, & Tokunaga, 2015). 
Background
Despite the uncertainty of the sensory signal, observers show a certain degree of color constancy, i.e., stability in identifying a color across illuminations. There are several explanations of this stability. Adaptation to the illuminant color is a powerful mechanism that discounts shifts in the sensory signal as far as they align with shifts of the color of the illumination (Smithson, 2005). Cues such as “brightest is white,” the “gray-world assumption,” or effects of “local contrast” may also contribute to discount the color shift due to the illumination change (e.g., Hansen, Walter, & Gegenfurtner, 2007; Kraft & Brainard, 1999). 
However, these cues and effects are rather heuristic since they fail once the brightest surface in a scene is not white, the average color of the scene is not achromatic gray, and when color contrast shifts the color of the surface in a way that is not in line with color constancy (Zaidi & Bostic, 2008). Furthermore, color constancy research has observed that performance in color constancy is extremely variable, depending on the task and instructions, observers, the scene, the setup and the stimulus colors (Foster, 2011; Radonjic & Brainard, 2016). Some of the most recent studies claim that color constancy is perfect (Gegenfurtner, Bloj, & Weiß, 2015), while others wonder whether color constancy exists at all (Foster, 2003). Up until now, however, the source of the strong variability in measures of color constancy across setups and observers has not been clarified. 
Strong variability in color constancy is not surprising at all if we take metamer mismatching into account. In this case, there is no single optimal solution to the color constancy problem, but a whole metamer mismatch volume (Logvinenko et al., 2015). Since changes in the sensory signal across illuminations may vary within these volumes, observers cannot recognize a certain surface across illuminations based on its color. 
To illustrate this, imagine seeing a yellow surface under a given illumination, such as the one in Figure 1b. The yellow color under this illumination is represented as the fat black disk in the color space of Figure 2a. This yellow color under this illumination could result from any reflectance that is metameric under this illumination, for example, from a natural surface or a standard Munsell color chip. When you change the illumination (Figure 1c) the natural surface and the Munsell chip result in different sensory signals and hence different colors. This is illustrated by the red and green disks in Figure 2a. However, these two disks are just examples that correspond to only two candidate surfaces. The one color under the first illumination may result from a whole set of candidate metameric surfaces, and each of these different surfaces results in a different color when the illumination changes. The ensemble of all these different colors under the second illumination describes the metamer mismatch volume (cf. colored area in Figure 2a). 
Figure 2
 
Illustration of metamer mismatch volumes. Two examples of metamer mismatch volumes are shown in CIELUV space. Metamer mismatch volumes are projected on the u*v* plane (metamer mismatch area) since lightness was fixed in the experiment. Panel a shows the metamer mismatch area of a yellowish green surface (Munsell chip 7.5Y4/6); Panel b shows that of a purplish red surface (7.5PB5/8). The illuminant change corresponds to the change from the yellowish (D50) to the bluish (CCT of 12000K) illuminant used in the experiment. The black disk in each panel indicates the sensory signal under the original yellowish illumination. The red and the green disks correspond to the sensory signal under the blue illuminant of a Munsell chip (red) and of a prototypical natural surface (green) that are metameric under the yellow illuminant (black disk). Small white disks show average adjustments of observers, the gray line represents the first principal component of those adjustments, and the percentage in the upper right corner indicates the variance explained by the first principal component. The graphics for the other colors and conditions may be found in Figure S1 and Figure S2 of the Supplementary Material.
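The caption refers to the first principal component of the observers' adjustments and the variance it explains. As a minimal sketch of how such a summary could be computed for points in the u*v* plane, the following uses an eigendecomposition of the covariance matrix. The function name `first_principal_component` and the example coordinates are hypothetical; the study's own analysis code is not given in the text.

```python
import numpy as np

def first_principal_component(points):
    """Return the unit direction of largest variance and the share of
    total variance it explains, for an (n, 2) array of u*v* points."""
    points = np.asarray(points, dtype=float)
    cov = np.cov(points, rowvar=False)        # 2 x 2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    direction = eigvecs[:, -1]                # eigenvector of largest eigenvalue
    explained = eigvals[-1] / eigvals.sum()
    return direction, explained

# Hypothetical average adjustments (u*, v*), for illustration only.
adjustments = [(10.0, 30.0), (14.0, 36.0), (8.0, 27.0), (17.0, 41.0), (12.0, 31.0)]
direction, explained = first_principal_component(adjustments)
print("direction:", direction, "explained variance share:", explained)
```

A share close to 1 means the adjustments scatter mainly along one line, which is what the gray line in each panel summarizes.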
Now assume you see a color under a given illumination, as represented by the black disk and you have to guess how this color looks under another illumination. The problem is that, due to the way color is processed in the visual system (i.e., through trichromacy and univariance), a human observer cannot know the reflectance properties of a given surface from seeing a color. Without prior knowledge about the surface that produced the color under the first illumination (Figure 1b and black disk in Figure 2a), you cannot predict which color within the mismatch volume results when the illumination changes, and hence the change of the sensory signal is highly uncertain. The metamer mismatch volume describes the uncertainty of an observer about the color under changing illuminations if the observer does not have any prior knowledge about the candidate surfaces. 
Consequently, observers' estimations of colors across illuminations are expected to vary across the metamer mismatch volumes. Moreover, different colors correspond to different metamer mismatch volumes, and hence, to different levels of uncertainty about the color under illumination change. For example, the metamer mismatch volume of the yellow color in Figure 2a differs from the metamer mismatch volume for the purple-red color in Figure 2b. Consequently, studies that involve different colors and different illumination changes imply different levels of uncertainty. 
Yet, it is not clear whether metamer mismatching directly affects performance in color constancy. It might be that the human visual system implements certain color constancy mechanisms and heuristics that lead the visual system toward one particular solution of color constancy that is one point in the metamer mismatch volume. Such a solution works quite well in cases in which the reflectance and illumination shift end up close to that solution, but not when they shift the sensory signal to any other point in the metamer mismatch volume. To some extent, this must be the effect of adaptation and local contrast, and it is undeniable that these effects play an important role in color constancy. However, they do not completely explain color constancy, and the question arises whether color constancy performance is directly related to the uncertainty about the sensory signal due to metamer mismatching. 
Objective
To clarify the role of metamer mismatching in color constancy, we investigated whether metamer mismatching predicts the variation of performance in color constancy that cannot be attributed to adaptation and local contrast. In particular, we tested whether performance in color constancy across different colors depends on the different sizes of the metamer mismatch volumes for each color. 
Apart from metamer mismatching, sensory singularities are another candidate determinant of the variation of color constancy across colors (Philipona & O'Regan, 2006; Witzel, Cinotti, & O'Regan, 2015). Sensory singularity refers to the dimensionality of the linear function that maps the sensory signal that results from looking directly at the light of the illumination (illuminant signal), to the sensory signal that corresponds to the light reflected off a particular surface (reflected signal). When surfaces have singular reflectance properties, the reflected signals are more predictable across illumination changes because the reflection by such a surface maps the illuminant signal onto only one or two dimensions of three dimensional LMS-space (Philipona & O'Regan, 2006; Witzel et al., 2015). There is strong evidence that surfaces corresponding to typical red, yellow, green, and blue are more singular, hence producing a more predictable reflected signal across illumination changes (Philipona & O'Regan, 2006; Vazquez-Corral, O'Regan, Vanrell, & Finlayson, 2012; Witzel & O'Regan, 2014). At the same time, it is not clear whether and how “sensory singularities” are related to color constancy (Witzel et al., 2015). 
Moreover, research on the constancy of color categories across illuminations in adults and toddlers suggests a relationship between color naming and color constancy (Olkkonen, Hansen, & Gegenfurtner, 2009; Olkkonen, Witzel, Hansen, & Gegenfurtner, 2010; Witzel, Flack, & Franklin, 2013; Witzel, Sanchez-Walker, & Franklin, 2013). Color categories are the ensembles of colors that correspond to a color term, such as “red,” “yellow,” “green,” and “blue.” It has been found that the consistency of categorization across illuminations is extremely similar to the consistency of categorization across observers. However, measurements of category consistency intermingle color constancy with the strength of category membership across colors. For this reason, it is not clear whether the observed relationship is the result of different degrees of color constancy for different colors, or of linguistic categorization. To test the idea that the typical colors of categories have particularly high perceptual color constancy, color constancy and color categories need to be measured independently from each other. 
Furthermore, there is evidence that color constancy across colors can be predicted by cone ratios (Nascimento, de Almeida, Fiadeiro, & Foster, 2004; Nascimento & Foster, 1997). Cone ratios are the ratios between cone excitations produced by light reflected from a pair of surfaces (Foster & Nascimento, 1994; Foster, 2011). 
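To make the definition of cone ratios concrete, the following sketch computes, for each cone class, the ratio of excitations produced by two surfaces, once under each of two illuminants. All spectra are illustrative assumptions (Gaussian stand-ins for the cone sensitivities, linear tilts for the illuminants, and two invented reflectances), and the names `cone_excitations` and `cone_ratios` are hypothetical.

```python
import numpy as np

wl = np.arange(400, 701, 10, dtype=float)     # wavelengths in nm

def bump(peak, width):
    """Gaussian bump used as a stand-in spectrum (not measured data)."""
    return np.exp(-0.5 * ((wl - peak) / width) ** 2)

# ASSUMPTION: toy L, M, S sensitivities and toy illuminants.
cones = np.stack([bump(565, 45), bump(540, 40), bump(445, 25)])
illum_1 = 1.0 + 0.5 * (wl - 550) / 150
illum_2 = 1.0 - 0.5 * (wl - 550) / 150

# ASSUMPTION: two invented surface reflectances.
surface_a = 0.3 + 0.4 * bump(600, 50)         # reddish surface
surface_b = np.full_like(wl, 0.5)             # spectrally flat gray

def cone_excitations(reflectance, illuminant):
    """L, M, S excitations for light reflected from a surface."""
    return cones @ (illuminant * reflectance)

def cone_ratios(refl_a, refl_b, illuminant):
    """Per-cone ratio of excitations between a pair of surfaces."""
    return cone_excitations(refl_a, illuminant) / cone_excitations(refl_b, illuminant)

ratios_1 = cone_ratios(surface_a, surface_b, illum_1)
ratios_2 = cone_ratios(surface_a, surface_b, illum_2)
relative_change = np.abs(ratios_2 - ratios_1) / ratios_1
print("ratios under illuminant 1:", ratios_1)
print("ratios under illuminant 2:", ratios_2)
print("relative change per cone class:", relative_change)
```

In this toy setting the ratios change only moderately when the illuminant changes, although the excitations themselves change substantially. That relative stability is the property that makes cone ratios a candidate predictor.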
Taken together, the present study investigated whether color constancy across colors is related to metamer mismatching, sensory singularities, linguistic color categories, and cone ratios. For this purpose, we measured performance in color constancy across different colors, and compared it with the predictions based on metamer mismatch volumes, sensory singularities, color categories, and cone ratios. Finally, we also tested still other candidate predictors of color constancy, such as adaptation, and lightness and chroma of the stimulus colors, and we estimated how well the aforementioned factors together could explain the observers' color constancy performance. Preliminary results of this study have been presented in a conference paper (Van Alphen, Witzel, Godau, & O'Regan, 2015). 
Method
Color constancy was measured with an asymmetric, simultaneous matching task (Arend & Reeves, 1986). Three features of our method were fundamental for the present investigation. 
First, while adaptation is an important determinant of color constancy (e.g., Hansen et al., 2007; Kraft & Brainard, 1999), the present study focuses on determinants of color constancy beyond adaptation. This is particularly true for metamer mismatch volumes and sensory singularities because they are based on the raw LMS signal, not on cone-contrast or other approaches to account for adaptation. At the same time, we wanted to avoid effects of memory that are likely in paradigms involving consecutive presentations. For these reasons, we examined color constancy in a simultaneous matching task. In particular, the task consisted of adjusting the color of an object in one image of a scene so that it corresponds to the color of the same object in a second image of the same scene, but that was differently illuminated (cf. Figure 3). In such a task, adaptation cannot fully account for color constancy because adaptation is the same for the test and comparison scenes (Arend & Reeves, 1986). This task also avoids effects of memory through simultaneous presentation. 
Figure 3
 
Stimulus display. Participants had to adjust the patch with the black dot so that it matches the corresponding patch in the scene with the other illumination. The starting color for the adjustment was random.
Second, it was crucial that the task was a proper color constancy task, in the sense that the correspondence between colors across scenes is clearly due to a change of illumination. Artificial stimuli rendered on the computer screen, such as Mondrian-scenes or single patches on colored backgrounds, require the observer's imagination to understand that color changes are due to a change of illuminants (see also Radonjic & Brainard, 2016). In such cases, compensatory shifts in the perception of the sensory signal across simulated illuminations are not necessarily due to color constancy. Instead, they could also result from simultaneous color contrast alone. As with adaptation, it is clear that simultaneous contrast plays a role in color constancy (e.g., Kraft & Brainard, 1999), but we were interested in factors beyond local contrast. The rationale for the choice of our stimulus display is that the observer solves a task that involves a comparison of photos under different illuminations as it could occur in an everyday life situation. For this reason, we made a major effort to produce images of scenes that are photorealistic. In this way, the correspondence between surfaces across images is obvious by the fact that they appear to be photos of the same scene under different illuminations. At the same time, the images of scenes should still allow for controlled rendering of colors based on reflectances. Moreover, they should not contain color diagnostic objects because we also wanted to control for possible effects of memory colors on color constancy (Granzier & Gegenfurtner, 2012; Kanematsu & Brainard, 2013). Hence, the images used in the present study attempted to combine photorealism, the control of color rendering, and the absence of color diagnostic objects (cf. Figure 3, Figure 4). 
Figure 4
 
Types of scenes. Images were taken from Mayang's Free Textures Library (Smith & Adnin, 2015).
Finally, in order to compare measures of color constancy across colors, it is indispensable to account for variations in the measurements that were merely due to an insufficient control of color metrics or other factors unrelated to color constancy. For this reason, we added a control condition where the two scenes to be matched had identical illumination. This condition provided a measure of inherent variability in color adjustments that were not due to color constancy. By accounting for this variability in our measurements, we isolated variability that was specific to color constancy. 
Color naming was measured in two ways. In a simple color naming task, colors were shown as a disk on a uniform background without any other color patches or context. Since sensory singularities do not depend on the context, a relationship between color constancy and sensory singularities should be revealed when using this kind of simple color naming. However, it is known that color appearance and color naming can be influenced by context, such as local contrast and adaptation (Hansen et al., 2007). To more generally investigate the relationship between color constancy and color naming, we also conducted a supplementary naming task. In this task, exactly the same stimulus displays (cf. Figure 3) were used as in the color constancy task in order to maximize comparability. 
Participants
Twenty observers (15 women, average age: 24 ± 12 years) took part in the main experiment (asymmetric matching and simple color naming, cf. Procedure). Ten of these observers and 11 new observers participated in the supplementary color naming (16 women, average age: 33 ± 15 years). 
None of the observers had red-green color deficiencies, as verified by Ishihara plates (Ishihara, 2004). Participants were recruited by an email list (CNRS, 2015), and were paid for participation. All participants gave informed consent before they started the experiment. Experimental procedures were approved by the Institutional Review Board Comité d'éthique de la recherche en santé (CERES; Human Research Ethics Committee) of the Paris Descartes University (application nr. 2015/35). 
Apparatus
Stimuli were displayed on a ViewSonic PN5f+ CRT monitor driven by a NVIDIA GeForce 8400 GS graphics card (NVIDIA Corporation, Santa Clara, CA) with a color resolution of eight bits per channel, a spatial resolution of 1280 × 1024 pixels (at a size of 36.5 × 27 cm), and a refresh rate of 85 Hz. CIE1931 chromaticity coordinates and luminance of the monitor primaries were R = (0.615, 0.351, 14.4), G = (0.295, 0.600, 43.5), and B = (0.144, 0.076, 5.16). Gamma corrections without bit loss were applied based on the measured gamma curves of the monitor primaries. Observers looked at the screen from a distance of about 50 cm. 
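The reported chromaticities and luminances of the monitor primaries are enough to sketch a conversion between linear RGB and CIE XYZ. The sketch below uses the standard xyY-to-XYZ relations and assumes an additive display with independent channels and already gamma-linearized RGB values in [0, 1]. These are common simplifications, not statements about the calibration actually used. The function names are hypothetical.

```python
import numpy as np

# Primaries as (x, y, Y) with Y in cd/m^2, taken from the text above.
PRIMARIES_XYY = {
    "R": (0.615, 0.351, 14.4),
    "G": (0.295, 0.600, 43.5),
    "B": (0.144, 0.076, 5.16),
}

def xyY_to_XYZ(x, y, Y):
    """Standard conversion from chromaticity plus luminance to XYZ."""
    return np.array([x / y * Y, Y, (1.0 - x - y) / y * Y])

# Columns are the XYZ coordinates of the R, G, and B primaries at full drive.
RGB_TO_XYZ = np.column_stack([xyY_to_XYZ(*PRIMARIES_XYY[k]) for k in "RGB"])

def linear_rgb_to_xyz(rgb):
    """XYZ of a linear RGB triplet, ASSUMING additivity of the channels."""
    return RGB_TO_XYZ @ np.asarray(rgb, dtype=float)

def xyz_to_linear_rgb(xyz):
    """Linear RGB needed to display a given XYZ (may fall outside [0, 1])."""
    return np.linalg.solve(RGB_TO_XYZ, np.asarray(xyz, dtype=float))

def in_gamut(xyz):
    """True if the color can be displayed with all channels in [0, 1]."""
    rgb = xyz_to_linear_rgb(xyz)
    return bool(np.all((rgb >= 0.0) & (rgb <= 1.0)))

monitor_white = linear_rgb_to_xyz([1.0, 1.0, 1.0])
print("XYZ of monitor white:", monitor_white)
```

A gamut check of this kind is one way to verify that a rendered chip can be displayed, which becomes relevant where the chroma of the test colors is restricted by the monitor gamut.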
Stimuli
Scenes
Four photos of scenes without color-diagnostic objects were retrieved from Mayang's Free Textures Library (Smith & Adnin, 2015). Scenes consisted of a patterned gray background with 12 objects (stones, tiles, etc.) of different colors (cf. Figure 4). One of the objects showed the target color, the other 11 objects showed distractor colors. The 12 objects were colored according to the stimulus reflectances under the respective illumination. All other areas of the scene were colored according to the neutral gray chip under the respective illumination (see the sections Reflectances and Illuminants below). 
To control for particularities of the scenes, scenes were randomly assigned to each trial for each observer. In particular, for each trial (and hence each illumination and target color condition) a scene was randomly chosen among the four scenes. The 11 distractor colors and the target color for a given condition were randomly assigned to one of the 12 objects. For each participant and each session a new stimulus set was produced with individual randomization of scenes and distractors. 
The images of the two scenes in a trial were shown on a black screen. Height and width of the images were between 10° and 17° of visual angle (8.4–15.2 cm), depending on the image (cf. Figure 4). Instructions for the tasks were shown in white letters on the black screen. 
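The relation between image size on the screen and visual angle can be checked with the usual formula, angle = 2 · atan(size / (2 · distance)). The sketch below assumes the viewing distance of about 50 cm reported in the Apparatus section. The function name `visual_angle_deg` is hypothetical.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle in degrees subtended by an extent viewed head-on."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

VIEWING_DISTANCE_CM = 50.0   # "about 50 cm" according to the Apparatus section

for size_cm in (8.4, 15.2):
    angle = visual_angle_deg(size_cm, VIEWING_DISTANCE_CM)
    print(f"{size_cm} cm at {VIEWING_DISTANCE_CM} cm -> {angle:.1f} deg")
```

At 50 cm, the reported sizes correspond to roughly 9.6° and 17.3°, which agrees with the stated range of 10° to 17° given that the viewing distance was only approximate.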
Reflectances
Colors in the scenes were rendered based on the reflectances of glossy Munsell chips. These reflectances were retrieved from the database of the Joensuu Color Group (Kohonen, Parkkinen, & Jaaskelainen, 2006; Parkkinen, Hallikainen, & Jaaskelainen, 1989), which is now available via the University of Eastern Finland (http://www.uef.fi/fi/spectral). 
Test colors:
Color constancy was measured for 12 colors (“test colors”): for each of the red, yellow, green, and blue categories, one prototypical color and two colors at the category boundaries. 
For this purpose, Munsell chips were chosen as test colors based on previous measurements of color categories for German (Olkkonen et al., 2010) and English observers (Witzel, Flack et al., 2013; Witzel, Sanchez-Walker et al., 2013). Hues were determined according to prototypes and category boundaries of red, yellow, green, and blue in those studies. Munsell lightness was chosen according to the prototypes of the color categories. A lower Munsell chroma than in those studies had to be used so that the reflected light under both illuminations fell within the monitor gamut and remained sufficiently far from the gamut boundary to still allow for adjustments. 
Apart from the 12 chromatic colors, there were also three gray stimuli: a dark, middle, and light gray. The chromaticities of these gray stimuli corresponded to those of the background (see “Background” below). For this reason, they were used to explain the task in practice trials, and to double-check that observers did the task correctly during experimental trials. However, since the adjustments of these gray stimuli could be done by comparison with the background alone, and did not necessarily require a comparison across illuminations, they were excluded from the main results (additional results involving the gray stimuli are provided in the Supplementary Material, e.g., Figures S1–S3). 
The Munsell chips resulting from the above criteria are given in Table 1. The table also provides the CIE1931 chromaticity coordinates and luminance of the simulated Munsell chips under either illumination. 
Table 1
 
CIE1931 chromaticity coordinates of simulated Munsell chips under target illumination. Notes: WP = white point; BG = background.
Distractor colors:
The other 11 colored objects in a scene, the distractors, were randomly drawn from Munsell Chips that had different hues than the test colors (i.e., 2.5R, 7.5R, 10R, 5YR, 7.5YR, 5Y, 10Y, 5GY, 7.5GY, 10GY, 5G, 7.5G, 10G, 2.5BG, 5BG, 10BG, 2.5B, 5B, 7.5B, 10B, 5PB, 10PB, 2.5P, 5P, 7.5P, 10P, 2.5RP, 5RP, 7.5RP). The candidate distractors varied in lightness from Munsell Value 3 to 8. Munsell Chroma of distractors was determined to be highest at a given lightness level, while being equal across hues at that lightness level. As a consequence of this criterion, Munsell Chroma was set to 6 for chips at Munsell Value 4 to 7, and to 4 at Munsell Value 3 and 8 (for further discussion, see Witzel et al., 2015, p. 20ff). 
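The selection rule for distractors can be summarized in a short sketch. The hue list and the chroma rule are taken from the paragraph above. The rest is assumed for illustration: integer Munsell Values from 3 to 8, uniform sampling without replacement from all hue and value combinations, and the names `distractor_chroma` and `draw_distractors`. The text does not specify how the random draw was actually implemented.

```python
import random

# Candidate distractor hues as listed in the text.
DISTRACTOR_HUES = [
    "2.5R", "7.5R", "10R", "5YR", "7.5YR", "5Y", "10Y", "5GY", "7.5GY",
    "10GY", "5G", "7.5G", "10G", "2.5BG", "5BG", "10BG", "2.5B", "5B",
    "7.5B", "10B", "5PB", "10PB", "2.5P", "5P", "7.5P", "10P", "2.5RP",
    "5RP", "7.5RP",
]

def distractor_chroma(value):
    """Munsell Chroma for a given Munsell Value, following the stated rule."""
    if value in (4, 5, 6, 7):
        return 6
    if value in (3, 8):
        return 4
    raise ValueError("distractor Munsell Value must be between 3 and 8")

def draw_distractors(n=11, rng=None):
    """Draw n distinct (hue, value, chroma) chips.

    ASSUMPTION: uniform sampling without replacement over integer Values.
    """
    rng = rng or random.Random()
    candidates = [(hue, value, distractor_chroma(value))
                  for hue in DISTRACTOR_HUES
                  for value in range(3, 9)]
    return rng.sample(candidates, n)

for hue, value, chroma in draw_distractors(rng=random.Random(0)):
    print(f"{hue} {value}/{chroma}")
```

Passing a seeded `random.Random` instance makes a stimulus set reproducible, which could be convenient when a new set is generated per participant and session.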
Background:
In all scenes, the color of the background (“BG” in Table 1) was set to the color of the middle gray Munsell chip N5 under the respective illumination. 
Illuminants
Metamer mismatch volumes also depend on the illuminant spectra. For the purpose of the present study, we assumed that observers have prior knowledge about how colors change during changes of light in their natural environment, i.e., along the daylight locus. For this reason, we used illuminants with spectra that correspond to daylight. 
Figure 5 illustrates the illuminants. The first illuminant was CIE illuminant D50, which simulates natural daylight at a correlated color temperature (CCT) of 5000 K, following Judd and colleagues (Judd et al., 1964). The second illuminant was a black body simulator at a CCT of 12,000 K. 
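For orientation, the chromaticity of a point on the daylight locus can be computed from its CCT with the standard CIE daylight formula. The sketch below implements that general formula and is not the procedure used to generate the stimuli. The function name `daylight_xy` is hypothetical, and the 12,000 K value is evaluated only to show where a daylight of that CCT would lie, since the text describes the second illuminant separately.

```python
def daylight_xy(cct):
    """CIE 1931 chromaticity (x, y) of CIE daylight for a CCT in kelvin."""
    if 4000.0 <= cct <= 7000.0:
        x = (-4.6070e9 / cct**3 + 2.9678e6 / cct**2
             + 0.09911e3 / cct + 0.244063)
    elif 7000.0 < cct <= 25000.0:
        x = (-2.0064e9 / cct**3 + 1.9018e6 / cct**2
             + 0.24748e3 / cct + 0.237040)
    else:
        raise ValueError("formula is defined for 4000 K to 25000 K")
    y = -3.000 * x * x + 2.870 * x - 0.275
    return x, y

# D50 has a CCT of about 5003 K on the current temperature scale.
print("about D50:", daylight_xy(5003.0))
print("12000 K  :", daylight_xy(12000.0))
```

The lower x and y values at 12,000 K reflect the shift toward blue relative to D50.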
Figure 5
 
Illuminations. Panel a shows the spectral power distribution of the two illuminants. Panel b illustrates the location of the chromaticities of the two illuminants (yellow and blue disks) relative to the daylight locus (thick line). Note that the chromaticities of the illuminants were very close to the daylight locus.
Procedure
The main experiment involved the simultaneous matching task and the simple color naming task. Each session began with an oral overview of the task given by the experimenter. Detailed standardized instructions were then presented on the computer screen. After reading the instructions, participants completed practice trials of the matching task, followed by the main trials of the matching task, and then the color naming task. Reading instructions and completing practice trials allowed for preliminary adaptation to the colors of the stimulus images. Each session took overall about 1 hr, of which the matching task lasted about 30 to 45 min and the color naming task less than 5 min. In order to measure consistencies across repeated measurements, each observer completed two such sessions on different days. Supplementary color naming was conducted in a third session after completion of the main experiment. 
Simultaneous matching task
The simultaneous matching task consisted of adjusting the color of an object in one image of a scene so that it corresponded to the color of the same object in a second image of the same scene, but that was differently illuminated (color constancy condition). In the control condition the two scenes to be matched had identical illumination. 
Figure 3 illustrates the stimulus display of the matching task when the images showed the scene under different illuminations. Observers were asked to adjust the color of the test surface in one of the scenes (initially marked by a blinking round dot), up to the point where it—given the differences in illumination—was perceived as the same color as the target surface in the other scene (initially marked by a blinking square dot). The general instructions in the introduction of the experiment were: 
 

“During this experiment two photographs will simultaneously be shown on the screen. The photographs can either be identical or differ in their lighting. In one of the photographs, the color of an object is changed. Your task is to change it back to its original color. It is hereby important that the object seems to fit in the scene with respect to the lighting.”

 
Color adjustments:
To adjust the color of the test patch, observers could press one of four cursor keys to add yellow, blue, green, and red to the test color. The changes in color were translated into polar coordinates (azimuth and radius) in CIELUV space. CIELUV was chosen because it is coarsely equidistant across space, allowing colors to be changed more homogeneously. Since the color patches in the scene were three-dimensional objects, their color distributions included small variations in lightness and chroma due to fine-grained shading. For this reason, the color changes were implemented through a polar adjustment technique (Hansen, Olkkonen, Walter, & Gegenfurtner, 2006). 
The precise implementation of this technique followed the one described in detail by Witzel, Valkova, Hansen, and Gegenfurtner (2011). In this technique, color adjustments are converted into rotations in azimuth and expansion (or compression) along the radius of the color coordinates. In this way, there exists an achromatic point at which all the colors in the color distribution are achromatic and the object is completely gray scale. Moving away from this achromatic point is implemented by multiplying the radius of all the colors and shifting them in the hue direction of the adjustment. 
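As a rough illustration of the polar adjustment described above, the following sketch rotates and scales a distribution of chromatic coordinates around the achromatic point. It is not the authors' implementation: the function name, the parameters `hue_shift_deg` and `radius_gain`, and the convention that the achromatic point sits at (0, 0) in the (u*, v*) plane are all assumptions made for this example.

```python
import math

def adjust_object_colors(chroma_coords, hue_shift_deg, radius_gain):
    """Rotate and scale (u*, v*) coordinates around the achromatic point.

    chroma_coords: list of (u, v) pairs, assumed relative to an
                   achromatic point at (0, 0).
    hue_shift_deg: rotation in azimuth, in degrees (hypothetical name).
    radius_gain:   multiplicative factor on the radius (hypothetical name).
    """
    adjusted = []
    for u, v in chroma_coords:
        radius = math.hypot(u, v)      # chroma of this pixel
        azimuth = math.atan2(v, u)     # hue angle of this pixel
        new_radius = radius * radius_gain
        new_azimuth = azimuth + math.radians(hue_shift_deg)
        adjusted.append((new_radius * math.cos(new_azimuth),
                         new_radius * math.sin(new_azimuth)))
    return adjusted
```

With `radius_gain` set to 0 every color in the distribution collapses onto the achromatic point, which corresponds to the gray-scale state mentioned in the text. Lightness is left untouched, so the shading of the object is preserved.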
The cursor keys were used for the color adjustments (left arrow: more green, right arrow: more red; up arrow: more yellow, down arrow: more blue). Participants could alternate between coarse and fine adjustments using the control key for coarse and the space bar for fine adjustments. During coarse adjustments colors were changed by four CIELUV units and observers could surf continuously along a color dimension by keeping the respective key pressed down. In fine adjustment mode, colors were changed by one CIELUV unit for each single keypress and no “surfing” was possible. Pressing return confirmed an adjustment and continued with the next trial. This was only possible if participants had done a fine adjustment before confirming the adjustment. 
Practice trials:
The color matching task began with three practice trials. The experimenter was present during instructions and practice trials to respond to any questions and to check whether the observer understood the task. In particular, only gray test colors were used in these practice trials. For gray colors, the adjusted chromaticity of the patches should be the same as the background of the respective scene when the task was done correctly. This allowed the experimenter to check whether observers understood the task and to prevent them from making light instead of surface matches due to a misunderstanding of the task. 
Main trials:
In the main part, observers adjusted each of the 15 test colors under each of four conditions of illumination. The conditions of illumination consisted of two control conditions, in which illuminations were the same (yellow-yellow, and blue-blue), and two color constancy conditions, in which illuminations were different (yellow test and blue comparison display, and vice versa). The resulting 60 trials were presented in random order and divided in three blocks of 20 trials each. Participants could take a short break in between two blocks if desired. For each of the two sessions a participant completed, a different set of randomized scenes was produced. This was the only difference between the two sessions. 
Simple naming task
The observer was asked to name the test colors using the numeric keypad whose keys corresponded to the eleven basic color terms (red, orange, yellow, green, blue, purple, pink, brown, white, gray, and black). There were two blocks of naming, each one showing colors as rendered under one of the two illuminations. The presentation order of the blocks was randomized. 
Stimulus colors were presented as uniformly colored disks in the center of the screen. The background was uniformly colored in middle gray (N5) rendered under the respective illuminant. Stimulus colors were the same as the test colors in the matching task, excluding middle gray (N5) as it would have the same color as the background. Each of these 14 stimulus colors was presented three times in random order in each of the two blocks, leading to a total of 84 trials, 42 for each illumination. 
Supplementary naming task
Observers were presented the stimulus displays as used in the color constancy task, and a blinking black dot indicated which color was to be named in which scene (cf. Figure 3). The response mode was the same as for the simple color naming task previously noted. 
There were six blocks involving two random stimulus sets with three repetitions each. Within each block all 60 stimulus displays (15 colors × 4 illumination conditions; see matching task) of a stimulus set were presented once in random order. When observers had participated in the two sessions of the color matching task, the two stimulus sets from the color matching sessions were used in the supplementary naming. 
Computational models
Simulations of natural surfaces
In order to assess how surfaces in the natural environment change due to the illuminant change, we simulated prototypical natural surfaces. For this purpose, we determined the first three principal components of the database of 404 natural reflectances of Westland, Owens, and Shaw (2000). These three principal components explained 98% of the variance across the 404 reflectance spectra (cf. Figure 6). Moreover, these three principal components were almost the same (r = 0.97–0.99) as those calculated for another database with natural reflectances (Castellarin, 2000). We took this as an indication that the first three principal components can be used as basis functions to roughly approximate natural reflectances. 
Figure 6
 
Principal components of the natural reflectances measured by Westland et al. (2000). The three curves show the first (blue), second (green), and third (red) principal components. The percentage (98%) reports the variance of the natural reflectances explained by the three principal components.
We obtained prototypical natural reflectances by weighting the principal components so that the resulting reflectances produced the LMS signals of our test colors for each of the two illuminants. Those simulated natural reflectances were used to predict how the LMS signal changes from one to the other illumination. The green disks in Figure 2 show the colors predicted based on natural reflectances. The colors predicted by the natural reflectances are very similar to those predicted by the Munsell chips (red disks), which is in line with the observation that Munsell chips can be used to roughly approximate natural surfaces (Jaaskelainen, Parkkinen, & Toyooka, 1990). 
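One way to read the weighting step is as a 3 × 3 linear system: with three basis functions and three cone signals, the weights that reproduce a given LMS triplet under one illuminant are uniquely determined. The sketch below illustrates that idea under simplifying assumptions: spectra are plain lists sampled at the same wavelengths, the LMS signal is an unnormalized sum of sensitivity × illuminant × reflectance, and all function names are hypothetical. It is not the authors' code.

```python
def sensor_signal(reflectance, illuminant, sensitivities):
    """LMS signal as a discrete sum over wavelength samples (unnormalized)."""
    return [sum(s * e * r for s, e, r in zip(sens, illuminant, reflectance))
            for sens in sensitivities]

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, b):
    """Solve m @ w = b for a 3 x 3 system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("basis functions are not independent for these sensors")
    weights = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = b[r]
        weights.append(det3(mc) / d)
    return weights

def fit_basis_weights(target_lms, basis, illuminant, sensitivities):
    """Weights of three basis functions that reproduce target_lms."""
    # Column j of the system matrix is the LMS signal of basis function j.
    cols = [sensor_signal(b, illuminant, sensitivities) for b in basis]
    matrix = [[cols[j][k] for j in range(3)] for k in range(3)]
    return solve3(matrix, target_lms)

def reflectance_from_weights(weights, basis):
    """Weighted sum of the basis functions, sample by sample."""
    return [sum(w * b[i] for w, b in zip(weights, basis))
            for i in range(len(basis[0]))]
```

Relighting then amounts to calling `sensor_signal` on the fitted reflectance with the second illuminant. Note that nothing in this sketch constrains the fitted reflectance to stay between 0 and 1.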
Calculation of metamer mismatch volumes
For an LMS signal of a surface under illuminant 1, we would like to find the hull of all possible LMS signals this surface could exhibit under illuminant 2, that is to say the surface of the metamer mismatch volume. Logvinenko et al. (2014) proposed an algorithm for finding such points. 
Given two illuminants for an observer with three sensory signals (LMS), we combine them into a single six-dimensional system. The hull of the object color solid (volume of all possible colors of reflecting surfaces) of this six-dimensional system is described by a special type of rectangular reflectance functions that have values of 0 (no reflection) or 1 (100% reflection) with a maximum of five transitions between these values, also called 5-transition reflectances (for illustration see Figure 7). The projection of this hull onto a fixed set of coordinates under illuminant 1 is the surface of the mismatch volume for these coordinates that arises when changing to illuminant 2. 
Figure 7
 
Illustration of a 5-transition reflectance. This kind of reflectance defines the hull (“outer skin”) of the metamer mismatch volumes.
The algorithm works as follows: It searches for 5-transition reflectances metameric to a given stimulus under illuminant 1, and then relights them under illuminant 2. The resulting stimulus is guaranteed to lie on the outer hull of the mismatch volume, because it also lies on the surface of the six-dimensional object color solid. The algorithm then randomly generates a large number of such reflectances to provide a reasonable approximation of the surface. 
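To make the notion of a 5-transition reflectance concrete, the sketch below builds a rectangular 0/1 reflectance from a list of transition wavelengths and computes its sensor signal under a given illuminant. It covers only these two building blocks; the search for reflectances that are metameric to a given stimulus under illuminant 1 is not shown. Function names, the sampling of spectra as plain lists, and the unnormalized sum are assumptions of this example, not the algorithm of Logvinenko et al. (2014).

```python
def step_reflectance(wavelengths, transitions, starts_high=False):
    """Rectangular reflectance with values 0 or 1.

    The reflectance flips between 0 and 1 at each transition wavelength.
    At most five transitions are allowed, as in a 5-transition reflectance.
    """
    if len(transitions) > 5:
        raise ValueError("at most five transitions are allowed")
    cuts = sorted(transitions)
    values = []
    for wl in wavelengths:
        flips = sum(1 for c in cuts if wl >= c)       # transitions passed so far
        high = starts_high ^ (flips % 2 == 1)         # odd number of flips inverts
        values.append(1.0 if high else 0.0)
    return values

def sensor_signal(reflectance, illuminant, sensitivities):
    """LMS signal as a discrete sum over wavelength samples (unnormalized)."""
    return [sum(s * e * r for s, e, r in zip(sens, illuminant, reflectance))
            for sens in sensitivities]
```

Relighting a candidate reflectance means calling `sensor_signal` a second time with the spectrum of illuminant 2 while keeping the reflectance fixed.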
To calculate the metamer mismatch volumes and areas, the resulting sample of LMS signals on the surface has been converted into CIELUV (and CIELAB, respectively). The volume of the convex hull of the resulting coordinates was calculated to obtain the metamer mismatch volume. 
Since lightness was kept constant in the color constancy measurements, we calculated what we call metamer mismatch areas for the main analyses reported as follows. Metamer mismatch areas are the size of the projection of the metamer mismatch volumes on the plane spanned by the chromatic axes. More precisely, the convex hull of the two-dimensional projection of the color coordinates (in CIELUV or CIELAB, respectively) was calculated, and its area was taken as the metamer mismatch area. Note, however, that main results are practically the same when using volumes instead of areas. 
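The area computation can be sketched as follows: drop the lightness coordinate, take the convex hull of the remaining two-dimensional points, and measure the area of that polygon. The example uses Andrew's monotone chain algorithm and the shoelace formula in plain Python; these are standard techniques chosen for illustration, and the function names and the (L*, u*, v*) tuple layout are assumptions rather than details from the paper.

```python
def convex_hull(points):
    """Convex hull of 2D points (monotone chain), counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive if o -> a -> b turns counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1]
                for i in range(n))
    return abs(twice) / 2.0

def mismatch_area(luv_points):
    """Area of the convex hull of the (u*, v*) projection of (L*, u*, v*) points."""
    chroma = [(u, v) for _, u, v in luv_points]
    return polygon_area(convex_hull(chroma))
```

For example, four points at the corners of a 4 × 3 rectangle in the chromatic plane plus one interior point yield an area of 12, regardless of their lightness values.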
Cone ratios
Similar to the approach of Nascimento et al. (2004), we estimated the target color under illumination changes based on the cone ratios. For each stimulus display of each individual and each session, we calculated cone ratios for all 11 distractors and the background. As reported by Foster and Nascimento (1994), cone ratios were approximately constant for all surfaces. Based on those 12 ratios, we determined 12 estimations of the target LMS signal. We converted these LMS values into CIELUV and averaged them to obtain a cone ratio based estimation of the target color of the respective display and stimulus set. To make sure that results do not depend on the use of CIELUV, we also did these calculations in CIELAB. 
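The cone-ratio prediction can be sketched as follows. If the ratio between the cone signals of the target and of a reference surface is assumed to stay constant across illuminations, each reference surface yields one estimate of the target's LMS signal under the second illuminant. The function name and data layout are hypothetical. The conversion to CIELUV and the averaging described above are omitted here because they require a display white point that is not specified in this section.

```python
def cone_ratio_estimates(target_lms_1, refs_lms_1, refs_lms_2):
    """One estimate of the target LMS under illuminant 2 per reference surface.

    target_lms_1: (L, M, S) of the target under illuminant 1.
    refs_lms_1:   list of (L, M, S) of reference surfaces under illuminant 1.
    refs_lms_2:   the same surfaces, in the same order, under illuminant 2.
    """
    estimates = []
    for ref_1, ref_2 in zip(refs_lms_1, refs_lms_2):
        # Constant ratio per cone class: target_2 / ref_2 == target_1 / ref_1.
        estimates.append(tuple(t * r2 / r1
                               for t, r1, r2 in zip(target_lms_1, ref_1, ref_2)))
    return estimates
```

When the illuminant change acts as a pure per-cone scaling, all estimates coincide; any spread between them reflects how far the actual surfaces depart from constant cone ratios.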
Results
Quantification of color constancy
The white disks in Figure 2 illustrate average adjustments for each observer. In classical approaches to measure color constancy the observers' color constancy performance is evaluated through a color constancy index that compares the color identified by the observer across illuminations with a target color. The target color is defined through the researcher's assumptions about a target surface. Often these assumptions consist of the reflectance properties of standard Munsell color chips (red disks in Figure 2). However, the observer cannot know the researcher's assumptions and the target color is only one possible point within the metamer mismatch volume. To account for that, we calculated three indices to assess the color constancy in the variation of color matches (see Figure S3 in Supplementary Material for illustration). 
In a first approach we evaluated color constancy in a way that is similar to classical approaches and we defined a measure relative to a target color. Given metamer mismatching all colors in the metamer mismatch volumes are sensible candidates. At the same time, it is possible that observers have knowledge about color changes based on their prior experience with illumination changes. In this case, it seems unlikely that the observers' target color would correspond to artificial surfaces, such as Munsell chips. Instead, observers are most likely to have experience with surfaces in the natural environment. Hence, a color prediction based on a typical natural surface should be the most likely target color for the adjustments. For this reason, we determined the target color based on the simulated natural reflectance under the respective illuminant. The green disks in Figure 2 show the target colors resulting from these prototypical natural reflectances. In order to quantify color constancy, we calculated the distance between the grand average adjustments and the predictions based on typical natural surfaces (cf. first column of Figure S3). This approach of assuming a target color allows for comparison to classical approaches to color constancy, which have assumed a fixed target color rather than a metamer mismatch volume of possible colors. 
However, in the light of metamer mismatching, one could take a more radical stance, and not assume any target color to measure color constancy. In the absence of prior knowledge about the reflectance, it cannot be assumed that the observer aims at reproducing a particular target color under the second illumination because the observer only knows the color under the first illumination, not the reflectance. 
In fact, the white disks in Figure 2 show that adjustments vary considerably across individual observers and do not cluster around a single target color. As illustrated by Figure 2b, the main variation of the adjustments (gray line) did not always spread between the original color under the first illumination (black disks in Figure 2) and the target color under the second illumination (green disks); for more examples see Figures S1 and S2 in the Supplementary Material.
This observation supports the idea that it might be wrong to assume a fixed target color. Instead of one fixed target color, each observer might have their own assumption about candidate reflectances and possible target colors within the metamer mismatch volume. In this case, individual differences of assumptions about target colors within the metamer mismatch volume would determine the variation of color adjustments across observers. Hence interindividual differences in matches should correlate with metamer mismatch volumes. To test this idea, we calculated the variation of color matches across individual observers as a second approach to quantify color constancy (cf. second column of Figure S3). 
Finally, we determined the intra-individual variation of color matches across repeated measurements as a third approach to quantify color constancy (cf. third column of Figure S3). This measure directly assesses how stable an observer is in identifying a color across illuminations. If observers do not have determinate prior assumptions about reflectances and target colors the variation of each individual's adjustments would indicate how uncertain observers are in their estimations of target colors. If the uncertainty of the adjustments is due to metamer mismatching, intra-individual variations should also be correlated to metamer mismatch volumes. 
To evaluate the variation of color constancy across colors, we determined the three measures for the color constancy (different illuminations) and for the control conditions (same illumination). The comparison of the color constancy condition with the control condition indicates the observer's uncertainty about corresponding colors that is specific to the change of illuminations. 
To summarize, we obtained three measures of color constancy: The deviation from the target prediction by the natural surface, the interindividual variability and the intra-individual variability across repeated measurements. The deviation from the natural surface reflects the precision of identifying a typical natural surface under changes of illumination, the interindividual variability reflects the consensus in identifying a color across illuminations, and the intra-individual variability is indicative of the stability of each observer's individual color identification across illumination change. All three measures of color constancy varied considerably across different colors (for details see section “Quantification of Color Constancy” in the Supplementary Material). This variation across colors was not the same for illumination changes from yellow (test scene) to blue (comparison scene) and from blue to yellow (cf. Table S1). We compared the variation of these three measures across colors with metamer mismatch areas, sensory singularities, color category membership, and predictions based on cone ratios. 
Metamer mismatch volumes
Like color constancy, the size of metamer mismatch areas varied strongly across colors. To test for a relationship between metamer mismatching and color constancy, we calculated the correlations of each of the three color constancy indices with the size of the metamer mismatch areas corresponding to each of the LMS signals of the test colors. There were overall 24 data points for the 12 test colors under the two illuminations. 
Figure 8a–c illustrates the main results. The size of metamer mismatch areas explains 53% [r(22) = 0.73, p < 0.0001] of the deviation of the adjustments from the prediction based on the natural surface (Figure 8a), 60% [r(22) = 0.78, p < 0.0001] of the variation of adjustments across observers (Figure 8b), and 49% [r(22) = 0.70, p = 0.0001] of the consistency of adjustments across repeated measurements (Figure 8c). These correlations indicate that the estimation of color changes across illuminations is directly related to the size of the metamer mismatch areas. 
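The percentages of explained variance reported above are the squared correlation coefficients. A minimal sketch of both computations is given below; the function names are chosen for this example. With 24 data points, the degrees of freedom of the correlation are 24 − 2 = 22, which matches the r(22) notation.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def explained_variance_percent(r):
    """Share of variance explained by a linear relationship, in percent."""
    return 100.0 * r * r
```

Squaring the rounded coefficients gives about 53%, 61%, and 49%; the small difference from the reported 60% for r = 0.78 is presumably due to rounding of r before squaring.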
Figure 8
 
Correlations between metamer mismatch areas and color constancy. The first row (Panels a–c) shows simple correlations, and the second row (Panels d–f) shows partial correlations controlling for the performance in the control condition. In all panels of the first row, the x-axis shows the size of metamer mismatch areas. The y-axis indicates Euclidean distances in CIELUV-space. In Panel a, these distances correspond with the differences from the color predicted by simulated natural surfaces; in Panel b they refer to the differences of individual adjustments from the average across individuals and in Panel c to the differences between the two repeated measurements. In the second row, the x-axis and the y-axis show the residuals from the regression that accounts for the performance in the control condition. Each symbol corresponds to one of the 12 colors under one of two illumination changes. Disks refer to illumination changes from yellow to blue, diamonds from blue to yellow. The correlations between the distances along the y-axes and the metamer mismatch volumes (x-axes) are given in the upper right corner of each panel (***p < 0.001) and are illustrated by the regression line (in gray). Note that results are practically the same when controlling for variance of adjustments without illumination change (Panels d–f), and were highly similar when doing these calculations in CIELAB instead of CIELUV space (Figure S4 of the Supplementary Material).
In order to control for variation of color adjustments that is not specific to color constancy, we calculated partial correlations that controlled for adjustments without illumination change in the control condition (cf. Figure S3d–f). These partial correlations had almost the same size as the aforementioned simple correlations and each of them explained more than 50% of the variance (Figure 8d–f). This is because none of the three measures (deviation from natural surface prediction, intra- and interindividual variability) in the control condition was correlated with metamer mismatch areas (all ps > 0.29). Moreover, correlations were similar and significant (all ps < 0.001) when representing colors in CIELAB instead of CIELUV-space (see Figure S4 and Table S2 in the Supplementary Material). These results show that the correlations do not depend on variations in discriminability due to inhomogeneities or imprecisions of CIELUV color space or to other factors unrelated to the illumination change. It is highly unlikely that we should have obtained significant correlations in all 12 tests (for summary cf. Table S2 in the Supplementary Material) purely by chance. 
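A partial correlation of this kind can be computed by regressing both variables on the control variable and correlating the residuals, which is also how the second row of Figure 8 is described. The sketch below shows that procedure for a single control variable; the function names are hypothetical, and simple least-squares regression is assumed.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def residuals(y, x):
    """Residuals of the least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def partial_correlation(x, y, control):
    """Correlation of x and y after removing the linear effect of control."""
    return pearson_r(residuals(x, control), residuals(y, control))
```

Here `x` would hold the metamer mismatch areas, `y` one of the constancy measures in the color constancy condition, and `control` the same measure in the control condition.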
Sensory singularities
In contrast to metamer mismatch areas, there was no significant correlation between our measures of color constancy and sensory singularities. We determined the singularity index as explained in detail by Witzel et al. (2015). Singularity indices are specific for a given reflectance. Hence, there is only one singularity value for changes from yellow to blue, and from blue to yellow illuminations (cf. Figure S5 in the Supplementary Material). 
This singularity index reflects the degree to which the transformation of the illuminant LMS signal to the reflected LMS signal due to the reflectance is singular, i.e., less than three-dimensional. The higher the singularities, the more reliable and predictable the LMS signal under illumination changes. If observers used this predictability in their color constancy performance, the singularity index should be negatively correlated with our three measures of color constancy. 
Table S3 in the Supplementary Material reports the correlations between singularity index and color constancy measures. None of the correlations was significant. These observations undermine the idea that singularities have a direct effect on color constancy. 
Color categories
According to previous studies (Olkkonen et al., 2009; Olkkonen et al., 2010; Witzel, Flack et al., 2013), the consistencies of color categorization across observers and across illuminations are highly positively correlated. Consistencies are calculated as the relative frequency of same classifications across observers (category consensus) and across illuminants (category constancy), respectively. Figure 9 illustrates category consensus (Panel a) and category constancy (Panel c) with the data from the supplementary naming task. We also determined the intra-individual category consistency, which is the consistency across repeated measurements averaged across observers (cf. Figure 9b). Detailed results about category consistency are provided in the section “Consistency of Color Categories” in the Supplementary Material.
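The text defines consistencies only as relative frequencies of same classifications, so the following sketch rests on one possible reading: consensus is taken as the proportion of observers who gave the most frequent name for a color, and constancy as the proportion of paired responses that are identical under the two illuminations. Both definitions and all names are assumptions of this example and may differ from the computation actually used.

```python
from collections import Counter

def category_consensus(names_by_observer):
    """Proportion of observers who chose the most frequent (modal) name."""
    counts = Counter(names_by_observer)
    modal_count = counts.most_common(1)[0][1]
    return modal_count / len(names_by_observer)

def category_constancy(names_illum_1, names_illum_2):
    """Proportion of paired responses that are the same under both illuminants."""
    same = sum(1 for a, b in zip(names_illum_1, names_illum_2) if a == b)
    return same / len(names_illum_1)
```

Under this reading, intra-individual consistency could be obtained by applying the second function to the repeated responses of one observer and averaging the result across observers.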
Figure 9
 
Color categories and constancy. In all panels the x-axis lists the 12 stimuli, and the groups of bars refer to the four color categories (red, yellow, green, and blue). The center bars in each group correspond to the typical, the other bars to the boundary colors. Panels a, b, and c show category consensus (consistency of categorization across observers), intra-individual category consistency (consistency of categorization across repeated measurements), and category constancy (consistency of categorization across illuminations) in the supplementary naming task. Panel d illustrates perceptual color constancy performance in the asymmetric matching task by showing the deviation from the natural surface prediction (for other measures of color constancy see Figures S9, S10). In all panels, error bars show standard errors of mean. Horizontal lines and symbols indicate significance of t tests across participants, that compared performance for the typical color with the average performance of the two other colors (*p < 0.05; ***p < 0.001; ns = nonsignificant). The higher the bars in Panels a, b, and c the higher category consistency, and the lower the bars in Panel d the higher color constancy (categorical pattern). Note that category consistencies (Panels a–c), but not perceptual constancy (Panel d) showed categorical patterns.
To test whether those previous observations held for our data, we calculated the correlations between the three kinds of category consistency across the 12 colors for the data of our two color naming tasks (cf. Figures S6–S8 in the Supplementary Material). There was a significant correlation between category consensus (Figure 9a) and category constancy (Figure 9c) in the supplementary naming task [r(10) = 0.64, p = 0.02; Figure S8a, b]. In the simple naming task, this correlation was not significant [r(10) = 0.35, p = 0.26; Figure S7a, b]. This might have been due to low statistical power. For this reason, we calculated correlation coefficients for the data of each individual observer in the simple naming task, converted them using Fisher's z-transform (Fisher, 1915), and tested in a one-tailed t test across the 20 observers whether they were significantly above zero, as expected based on the previous studies. Even though the effect is small, average r(10) = 0.11, it was significant, t(19) = 1.8; p = 0.04. These results replicate previous findings (Olkkonen et al., 2009; Olkkonen et al., 2010). However, these correlations are much lower than those found previously by Olkkonen and colleagues. 
A much clearer relationship was found between category constancy (Figure 9c) and intra-individual category consistency (Figure 9b). Category consistency and category constancy were highly correlated in the simple naming task [r(10) = 0.63, p = 0.03; Figure S7c, d] and in the supplementary naming task [r(10) = 0.87, p = 0.0002; Figure S8c, d]. These results indicate that categorization across illuminations reflects more generally the certainty of category membership across colors rather than specifically the consensus of categories across observers, as suggested by those previous studies (Olkkonen et al., 2009; Olkkonen et al., 2010; Witzel, Flack et al., 2013). 
We then tested the idea that category consensus and category constancy are higher for typical than for boundary colors, as suggested previously (Olkkonen et al., 2010; Witzel, Flack et al., 2013; Witzel et al., 2015). We also examined whether intra-individual category consistency was higher for typical than for boundary colors. 
Figure 9a–c illustrates consistencies in the supplementary naming task. If consistencies were higher for typical than for boundary colors, the center bars of each category (red, yellow, green, blue) should be higher than the respective two bars for the boundary colors (categorical pattern). To test this, we compared the consistency of the typical color to the average consistency of the two boundary colors in paired, two-tailed t tests across participants (cf. symbols in Figure 9a–c). 
For green and blue, category consensus, intra-individual category consistency and category constancy were higher for typical than for boundary colors [all t(20) > 5.0, all ps < 0.001]. Yellow showed this pattern for category constancy, t(20) = 2.4, p = 0.03, but not for category consensus, t(20) = 0.9, p = 0.37. Average category consensus, intra-individual category consistency and category constancy for the red category and average intra-individual consistency for the yellow category went in the right direction, but the differences were not significant (ps > 0.15). 
Similar results were found in the simple naming task (cf. Figure S6a–c). Although the red category did not show the predicted pattern at all, typical green and blue showed higher category consensus and category constancy than the respective boundary colors [all t(19) > 3.2, all ps < 0.01]; for yellow this was true for category constancy only [t(19) = 2.9, p = 0.008]. 
Repeated-measures analyses of variance across all categories confirmed that consistencies were higher for typical than for boundary colors for all three kinds of consistencies (consensus, intra-individual, and constancy) and in both naming tasks (for details see Table S4 in the Supplementary Material). These results further confirm the observation that category constancy, and consistency in general, is higher around the typical colors in the center of the categories than at the category boundaries (in particular, figure 8 in Olkkonen et al., 2010). 
The most important question of the present study was whether this categorical pattern is due to high perceptual color constancy of category prototypes. Figure 9d illustrates color constancy across colors when color constancy is measured by the deviation from the natural surface prediction in the color matching task. If color constancy is higher for typical than for boundary colors, the center bars of each category should be lower than the two bars of the boundary colors (categorical pattern). We tested the difference between center bars and the averages of the two other bars with paired t tests across participants (symbols in Figure 9d). None of these t tests was significant (ps > 0.13). The same was true for our two other measures of color constancy, i.e., interindividual and intra-individual variation of matches across illuminations. In the Supplementary Material (“Peaks of color constancy at prototypes”), we provide additional analyses to test whether color constancy is highest for the colors whose hue corresponds to category prototypes. However, color constancy did not peak at the hues of the category prototypes in those tests, either. Taken together, these results undermine the idea that color constancy increases toward the prototypes of color categories. 
Finally, we tested for the existence of a more general relationship between the consistency of categorization and our perceptual measures of color constancy from the color matching task. For this purpose, we calculated correlations across the 24 colors between, on the one hand, the three perceptual measures from the matching task, and on the other hand the measures of category consistency, namely category consensus (interindividual), category consistency (intra-individual), and category constancy (across illuminations). Details may be found in the section “Color Categories and Color Constancy” of the Supplementary Material.
If category consistency is positively related to perceptual color constancy, there should be negative correlations with our three measures of color constancy because color constancy is higher the lower these measures are. Neither the interindividual consensus nor intra-individual consistency nor the constancy of categories were correlated with interindividual variation, intra-individual variation, or the deviation from the target in the matching task, no matter which naming task was used for the measurements of categories (all ps > 0.1). The only exception was a positive correlation between interindividual variation of matches and category consensus in the supplementary naming task, r(22) = 0.43, p = 0.03. This correlation contradicts the predicted negative correlations. Moreover, this correlation is not significant after a Bonferroni correction for repeated testing involving the three measures of constancy and two kinds of naming tasks (simple and supplementary). Hence, it might be due to multiple testing. The absence of negative correlations across all three measures of color constancy indicates that there is no clear relationship between category membership and the ability to recognize colors across illuminations, as measured in the matching task. 
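The Bonferroni argument above can be illustrated with a minimal sketch. It assumes that the family of tests consists of the three constancy measures times the two naming tasks (six tests); the text does not state the family size explicitly, and the function names are hypothetical.

```python
# Bonferroni check for the single positive correlation reported in the text.
# Assumption: family size = 3 constancy measures x 2 naming tasks = 6 tests.

def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def bonferroni_adjusted_p(p: float, n_tests: int) -> float:
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p * n_tests)


n_tests = 3 * 2      # assumed family size (see lead-in)
p_reported = 0.03    # p value reported for r(22) = 0.43
threshold = bonferroni_alpha(0.05, n_tests)

# True only if the reported p value falls below the corrected threshold.
survives = p_reported < threshold
print(f"threshold = {threshold:.4f}, survives correction: {survives}")
```

Under this assumed family size the reported p value stays above the corrected threshold, in line with the statement that the correlation is not significant after correction.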
Cone ratios
Figure 10 compares the predictions of observers' adjustments through the reflectances of simulated natural surfaces (green bar), of Munsell chips (red bar), and through cone ratios (blue bar). It also shows how much observers deviated in their adjustments from their average adjustments in the control condition with the same illumination (dark gray bar) and in the constancy condition with different illuminations (light gray bar). The higher the bar, the higher the deviation of the predicted target color. Paired t tests across the 24 colors (12 in each condition) were applied to compare the distances (i.e., the bars). 
Figure 10
 
Predictions of target colors. Bars indicate the average Euclidean distance in CIELUV from a target color. “Same” refers to adjustments under the same illumination (control condition), where the target color is unambiguously defined. The other bars correspond with adjustments under different illuminations (constancy condition). “Ave” refers to the grand average of adjustments; for “Nat” and “Mun” the assumed target color is defined by the prediction based on the natural reflectances and the reflectances of Munsell chips, respectively. CR indicates average deviations from predictions through cone ratios. N reports the total number of colors (12 × 2 conditions). Error bars show standard errors of the mean across stimuli and symbols indicate significance of paired t tests across stimuli (***p < 0.001; ns = nonsignificant). Figure S11 in the Supplementary Material shows the same data, but calculated in CIELAB space. Note that deviations were similar for the predictions based on Munsell chips and for natural reflectances, and they were highest for predictions based on cone ratios.
Observers' adjustments were similarly close to predictions through Munsell chips and predictions through natural surfaces, t(23) = 0.1, p = 0.92. However, predictions based on cone ratios were further away from observers' adjustments than Munsell chips, t(23) = 4.0, p = 0.001. Results were largely the same when deviations and cone ratio predictions were determined in CIELAB instead of CIELUV space (cf. Figure S11 in the Supplementary Material). 
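The two computations behind these comparisons, a Euclidean distance in CIELUV and a paired t test across stimuli, can be sketched as follows. This is an illustration only: the function names and the example coordinates are hypothetical, and the study's own analysis code is not described in the text.

```python
import math
from statistics import mean, stdev


def cieluv_distance(a, b):
    """Euclidean distance between two (L*, u*, v*) triplets."""
    return math.dist(a, b)


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two equal-length samples.

    x and y hold one deviation per stimulus color (e.g., deviation from the
    cone ratio prediction and from the Munsell prediction).
    """
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical adjustment and predicted target, both in CIELUV.
adjustment = (50.0, 3.0, 4.0)
target = (50.0, 0.0, 0.0)
print(cieluv_distance(adjustment, target))
```

With 24 stimulus colors, `paired_t` returns 23 degrees of freedom, which corresponds to the t(23) values reported above.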
Depending on which distractor colors are on a given display, predictions through cone ratios vary. If observers estimate the target color through comparison with the distractors, the variation in cone ratios should predict the variation of observers' adjustments. We tested this idea. As an estimate of intra-individual variability, we determined for each observer and each stimulus display how strongly cone ratio estimates varied across the two different stimulus sets of each session. As an estimate of interindividual variability, we determined how much the average cone ratio estimate of each individual differed from the overall average cone ratio estimate across all individuals. As with metamer mismatch volumes, we calculated correlations across the 24 colors. 
Neither the variation of cone ratio predictions across the stimulus displays within each observer nor the variation of cone ratio predictions across observers were correlated with the intra- and interindividual variation of observers' adjustments, respectively [r(22) = 0.31, p = 0.14; r(22) = 0.13, p = 0.54]. When recalculated in CIELAB, a correlation was found for intra-individual [r(22) = 0.46, p = 0.02], but not interindividual variation [r(22) = 0.20, p = 0.35]. We also found some evidence that variation of cone ratio predictions across stimulus sets is related to metamer mismatch volumes. In CIELUV, correlations were found for the variation of cone ratio predictions across the stimulus sets of each observer [r(22) = 0.41, p = 0.046], but not across observers [r(22) = 0.29, p = 0.18], and stronger correlations were found in CIELAB [r(22) = 0.67, p = 0.0003; r(22) = 0.53, p = 0.007]. However, the correlation between metamer mismatch volumes and intra- and interobserver adjustments still holds when controlling for the variation in cone ratio predictions [r(22) = 0.66, p < 0.001; r(22) = 0.78, p < 0.0001]. 
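Controlling a correlation for a third variable, as in the last analysis above, is commonly done with a first-order partial correlation. The sketch below shows that standard formula; it is an assumption that this matches the procedure used for the reported values, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z.

    Here x could be the size of metamer mismatch volumes, y the variation of
    adjustments, and z the variation of cone ratio predictions.
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Illustrative values only, not taken from the study.
print(round(partial_r(0.6, 0.5, 0.5), 4))
```

If z is uncorrelated with both x and y, the partial correlation equals the plain correlation, so a correlation that survives this control cannot be attributed to z alone.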
Finally, according to Foster and Nascimento (1994), the more cone ratio predictions differed from Munsell chip predictions, the more observers' adjustments differed from predictions through Munsell chips. We could not find evidence for this relationship in our data, neither when calculated in CIELUV [r(22) = 0.14, p = 0.51] nor in CIELAB [r(22) = 0.06, p = 0.79]. 
Other factors
Although the correlations between metamer mismatch areas and our measures of color constancy were high, they did not completely explain the variation of the adjustments in the asymmetric matching task. If the adjustments across illuminations depended completely on the uncertainty due to metamer mismatching, then all adjustments should fall within the metamer mismatch volume. This is not the case for our data (cf. white disks in Figure 2; for all data see Figures S1 and S2). Instead, the adjustments in our measurements were shifted away from the metamer mismatch volume (colored area) towards the LMS signal under the original illumination (black dot in Figure 2, and Figures S1 and S2). Only 25% to 95% of the adjustments, depending on stimulus and condition (60% on average), lay within the metamer mismatch area (for details, see Table S6). 
The shift may be explained by partial adaptation. The simultaneous presentation of the scene under two illuminations implies that observers partially adapt to the colors of the illumination in each of the test and the comparison image. As a result of partial adaptation, observers' perception is shifted midway between the two illuminations. This shift should not depend on the test colors of adjusted patches, but only on the illuminations. 
Such a shift toward the original LMS signal increases the absolute distance of the adjusted color from a target color within the metamer mismatch volume, such as the LMS signal of the natural surface prediction. Since the differences between test color and target color vary across test colors, a constant shift due to partial adaptation should also differentially affect color adjustments and color constancy performance. 
To test this idea we determined “target-shifts,” which we defined as the distances between test colors and the corresponding natural surface predictions in CIELUV. We calculated correlations between these target-shifts, the size of metamer mismatch areas and our three measures of color constancy. Target-shifts were correlated with both metamer mismatch areas [r(22) = 0.67, p = 0.0003] and the measures of color constancy [Deviation from natural surface prediction: r(22) = 0.43, p = 0.04; interindividual variation: r(22) = 0.41, p = 0.049; intra-individual variation: r(22) = 0.53, p = 0.008]. When recalculated in CIELAB, correlations did not reach significance. For details, see subsection “Target Shifts” in the Supplementary Material.
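To see why correlations near r = 0.41 are borderline with 24 colors, a correlation can be converted to a t statistic with df = n − 2 = 22. The sketch below uses the standard conversion and a two-tailed critical value of 2.074 for df = 22 at α = 0.05, taken from standard t tables; the function names are hypothetical.

```python
import math


def r_to_t(r, df):
    """t statistic for a Pearson correlation r with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


def critical_r(t_crit, df):
    """Smallest |r| that reaches the critical t value for the given df."""
    return t_crit / math.sqrt(t_crit**2 + df)


DF = 22            # 24 colors minus 2
T_CRIT_05 = 2.074  # two-tailed critical t for df = 22, alpha = 0.05 (tabled value)

r_min = critical_r(T_CRIT_05, DF)
print(f"|r| must exceed about {r_min:.3f} to reach p < 0.05")
print(f"t for r = 0.41: {r_to_t(0.41, DF):.3f}")
```

The threshold is close to 0.40, so a correlation of 0.41 only just reaches significance.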
Furthermore, it has been found that the size of metamer mismatch volumes increases with lightness (for the lightness levels used here, i.e., Munsell values 4–8) and decreases with chroma (Zhang, Funt, & Mirzaei, 2015, 2016). We determined lightness and chroma of our stimuli as L* and as the radius in CIELUV, respectively. Then we calculated the correlations between lightness and chroma on the one hand, and metamer mismatch areas and our measures of color constancy on the other hand. Lightness and chroma were positively correlated with the size of metamer mismatch areas [r(22) = 0.53, p = 0.008 and r(22) = 0.57, p = 0.004]. While the positive correlation with lightness is in line with previous observations, the positive correlation for chroma is not (Zhang et al., 2015, 2016). Lightness was also positively correlated to deviations from the natural target predictions [r(22) = 0.56, p = 0.004] in the color constancy task. The correlation between lightness and inter- and intra-individual variation just missed significance [r(22) = 0.38, p = 0.07 and r(22) = 0.40, p = 0.06]. Chroma was positively correlated with deviations from natural surface predictions [r(22) = 0.59, p = 0.003] and interindividual variation [r(22) = 0.41, p = 0.04], and almost significantly with intra-individual variation [r(22) = 0.37, p = 0.07]. For details, see subsections “Lightness” and “Chroma” in the Supplementary Material.
However, the aforementioned correlations involving target shifts, lightness, and chroma are mainly due to the three stimuli of the yellow category (cf. Figures S12–S14 in the Supplementary Material). In any case, the correlations between metamer mismatch areas and our three measures of color constancy still occur in partial correlations that controlled for lightness, chroma, and target shifts [Deviation: r(22) = 0.43, p = 0.04; interindividual: r(22) = 0.41, p = 0.049; intra-individual: r(22) = 0.53, p = 0.008]. Similar correlations were found in CIELAB. For details, see Table S10 in the Supplementary Material.
In a multiple regression, metamer mismatch areas, lightness, chroma, and target shifts together explained 68% of the deviations of matches from natural surface predictions [F(4, 19) = 10.0, p = 0.0002], 64% of interindividual variation [F(4, 19) = 8.5, p = 0.0004], and 52% of intra-individual variation [F(4, 19) = 5.1, p = 0.006]. In contrast, adjustments in the control condition (without illumination change), singularity index, cone ratios, category constancy, category consensus, and category consistency in the supplementary naming task explained only 33% of deviations from target predictions, 15% of inter-, and 16% of intra-individual variation in multiple regressions, and these multiple regressions were not significant (all ps > 0.26). All 10 factors together (metamer mismatch areas, lightness, chroma, target shifts, control condition, singularity index, cone ratios, category constancy, consensus, and consistency) explained 80% of target deviations [F(10, 13) = 5.2, p = 0.004], 81% of inter- [F(10, 13) = 5.6, p = 0.003] and 57% of intra-individual variation [F(10, 13) = 1.7, p = 0.17]. Hence, 20%, 19%, and 43% of variance remained unexplained by the factors investigated here. Calculations in CIELAB provided similar results. For details, see subsection “Multiple Regressions” in the Supplementary Material.
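The link between explained variance and the F statistics reported above can be sketched with the standard formula for the overall F test of a multiple regression with an intercept. The function name is hypothetical, and because the percentages in the text are rounded, recomputed F values agree with the reported ones only approximately.

```python
def f_from_r2(r2, n_obs, n_pred):
    """Overall F statistic and degrees of freedom from R^2.

    r2     : proportion of explained variance (e.g., 0.68 for 68%)
    n_obs  : number of observations (here, 24 stimulus colors)
    n_pred : number of predictors in the regression
    """
    df1 = n_pred
    df2 = n_obs - n_pred - 1
    f = (r2 / df1) / ((1.0 - r2) / df2)
    return f, df1, df2


# Four-factor model: 68% explained variance across 24 colors.
f4, df1, df2 = f_from_r2(0.68, 24, 4)
print(f"F({df1}, {df2}) = {f4:.2f}")

# Ten-factor model: 80% explained variance across 24 colors.
f10, df1, df2 = f_from_r2(0.80, 24, 10)
print(f"F({df1}, {df2}) = {f10:.2f}")
```

The degrees of freedom also show why adding predictors is costly with only 24 colors: the ten-factor model leaves 13 residual degrees of freedom, compared with 19 for the four-factor model.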
Discussion
In sum, metamer mismatch areas were strongly correlated with our three measurements of color constancy, that is, (a) the deviation of color adjustments from the color shift predicted by naturalistic surfaces, (b) the consensus of adjustments across observers, and (c) the stability of adjustments across repeated measurements. These correlations were specific to the identification of colors across illumination changes, as demonstrated by the independence of the correlations from the control condition and from different color spaces. Hence, they reveal a strong relationship between color constancy and metamer mismatching. At the same time, sensory singularities and color categories were not related to our measures of color constancy. Moreover, cone ratios did not predict observers' color matches well compared with predictions based on naturalistic surfaces and Munsell chips; but evidence for a relationship between cone ratios and metamer mismatching was found that was largely unrelated to color constancy. 
Uncertainty of surface colors
Metamer mismatching implies that the colors of surfaces are highly uncertain in the absence of knowledge about reflectance properties of the surfaces. The relationship between color constancy and metamer mismatching suggests that color constancy directly depends on this uncertainty about color changes across illuminations. 
However, metamer mismatch volumes did not completely explain the variation of color adjustments across illuminations. Even including all the other factors left 19% to 43% of variance unexplained. The remaining unexplained variation of the adjustments may be due to several reasons. 
First, the uncertainty about the color change does not only depend on the size of the metamer mismatch volumes, but also on the probability density of mismatches across the volume (Logvinenko & Demidenko, 2016). We did not account for that, which might have reduced the observed correlations. It would be good to account for probability densities in future work so as to yield still higher correlations between color constancy and metamer mismatching. 
Second, it would be important to determine metamer mismatch volumes for metamers in the visual environment rather than the theoretical metamer mismatch volumes used here. Most real reflectances in the visual environment have particular properties (Koenderink, 2010), and hence some metamer mismatches within the metamer mismatch volume are much more likely to occur in the visual environment than others. 
Moreover, it has been shown that metamerism rarely occurs within a natural scene (Foster, Amano, Nascimento, & Foster, 2006). If metamerism is rare, metamer mismatching would also be rare and it would be of little use for the visual system to adapt to the uncertainties represented by the theoretical metamer mismatch volumes. However, higher levels of metamerism may exist going from one natural scene to another, implying that metamer mismatching might be much higher than suggested by the above study (Foster et al., 2006). In fact, other studies observed that real surfaces show considerable degrees of metamerism and metamer mismatching (Logvinenko et al., 2015; Zhang et al., 2015, 2016). Moreover, metamer mismatching is known to be an important problem in many practical contexts, such as printing (e.g., Fairchild & Johnson, 2004; Samadzadegan & Urban, 2013), lighting (e.g., Schanda, 2007; Viénot, Coron, & Lavédrine, 2011), photography (e.g., Belt, 2008), and art restoration (e.g., Berns, 2016). This shows that metamer mismatching for real surfaces in the visual environment is considerable. 
Nevertheless, the metamer mismatch volumes for reflectances in the visual environment are still considered to be much smaller than the complete metamer mismatch volumes used here (Zhang et al., 2016). For this reason, it would be important to test whether the correlations observed in the present study still hold for metamer mismatching that occurs in the visual environment. 
Third, the consideration of the complete metamer mismatch volumes assumes the absence of any prior knowledge about the reflectance properties of surfaces. However, observers may have some prior knowledge that reduces the possible mismatches to a subvolume of the metamer mismatch volume. This would explain why color adjustments (white disks in Figure 2) did not spread across the whole metamer mismatch volume. We did not find evidence for such prior knowledge in the present study. In particular, observers' matches were not closer to the target color predicted by the natural surfaces than to the one predicted by Munsell chips (red and green bar in Figure 10). However, reflectance properties of Munsell chips are quite similar to those of natural reflectances (Jaaskelainen et al., 1990; Witzel et al., 2015), and this may explain why Munsell chip predictions were as close to observers' adjustments as natural surface predictions. Further investigations of this question are needed. 
Fourth, the uncertainty about color changes due to metamer mismatching does not only depend on the uncertainty about reflectances, but also on the uncertainty about the spectral properties of the illumination, i.e., the illuminant (Logvinenko et al., 2014; Logvinenko et al., 2015). In this study, we assumed that observers have prior knowledge about color changes under natural illuminants. While this is a plausible assumption in general, illuminants that occur in the visual environment are not all like those investigated here, and observers may also have a certain degree of uncertainty about illuminants. Incorporating illuminants into the calculation of uncertainty due to metamer mismatching is another way to further clarify the role of metamer mismatching in color constancy. 
Finally, color matches across illuminants might be affected by the way we implemented the task on the computer monitor. There are two important aspects to note. One concerns the use of a simultaneous presentation of the scene under the two illuminations. We did this because we wanted to avoid effects of color memory and to show color constancy beyond adaptation. At the same time, this stimulus presentation implies partial adaptation to the colors from each of the two images. The correlations between color constancy measures and target shifts might reflect effects of partial adaptation on color constancy. However, target shifts may not completely capture the effects of partial adaptation and unexplained variance of our measures might still be due to partial adaptation. The question arises whether the relationship between metamer mismatching and color constancy can be shown when observers are completely adapted to the respective illuminations, as, for example, in successive color constancy. 
The second consideration about the task is the fact that we showed the scenes under different illuminations as images on the computer screen. These images must be sufficiently realistic to convince observers that the difference between the images is due to a change in illumination. We maximized the realism of these images under the constraint that they should not contain color-diagnostic objects (objects with typical colors) and that all color changes can be rendered within the monitor gamut. Moreover, we emphasized in the instructions that images are photos under different illuminations. Nevertheless, the mere fact that the task involved adjusting a color in one of these images undermines the realism of the images. As a result, some observers might have made light matches (i.e., matching the color of the light emitted by the monitor, independent of the illumination in the image) rather than surface matches (i.e., matching the color of the surface or object depending on the illumination in the image). This would also explain why adjustments were shifted to the color under the original illumination (cf. white and black disks in Figure 2 & Figures S1 & S2). In fact, a recent study has measured asymmetric matches using real surfaces and found that in this case most matches fall within the metamer mismatch volumes (Logvinenko et al., 2015). It would be important to test the correlations observed here with data for real surfaces. 
Despite all these factors that could have counteracted the relationship between color constancy and metamer mismatching in the present study, the observed correlations still explained a large part of the variance in color matches across illuminations by the size of metamer mismatch volumes. In particular, we observed an asymmetry between color constancy across colors when light changes from yellow to blue and when light changes from blue to yellow (cf. Figure S3 & Table S1). Unlike all the other factors, metamer mismatching may account for this asymmetry by the fact that the size of metamer mismatch volumes is also asymmetric in a way that predicts the asymmetry of color constancy performance. 
The observation of a strong relationship between color constancy and metamer mismatching shows that the uncertainty about the sensory signal due to metamer mismatching determines color constancy in our simultaneous matching task. This effect of uncertainty must be lower in situations in which the perceived color is strongly determined through adaptation. However, in situations in which adaptation is not or only partially possible, color constancy cannot be understood without considering the role of metamer mismatching. 
This finding suggests that observers have acquired implicit, perceptual knowledge about the uncertainty of the sensory signal through their experiences with changes in illumination. The acquisition of this knowledge might be the reason why the development of color constancy and color categorization in toddlers are correlated (Rogers, Witzel, Rhodes, & Franklin, 2016; Witzel, Sanchez-Walker et al., 2013). Both color constancy and color categorization require toddlers to understand how perceived colors in the visual environment vary, while being still identified as the same surface color or color category, respectively. The importance of knowledge about the uncertainty of colors under illumination change might also explain why color constancy is related to memory (Allen, Beilock, & Shevell, 2011, 2012; Olkkonen & Allred, 2014). 
Understanding surface color
Color must be understood, not only as a property of light, but also—if not mainly—as a property of objects and surfaces. At the same time, metamer mismatching implies that the change of the sensory signal from one to another illumination cannot be determined (Logvinenko et al., 2015). Consequently, it is impossible to define surface colors as constant across illumination changes if observers have no prior knowledge about the spectral properties of the surfaces and the illumination. If observers had perfect prior knowledge about the reflectance properties of the surfaces, adjustments would cluster around a target color defined by those properties, and there should not be a correlation with metamer mismatching. Our results show the contrary, namely that observers do not aim at a predefined target color, and are led by metamer mismatching in their performance in identifying surface colors across illuminations. 
These findings explain why the results of classical studies on color constancy have strongly varied across studies and across observers (Foster, 2011; see also figures 3 and 8 of Radonjic & Brainard, 2016). In the light of metamer mismatching, classical color constancy indices must vary—as a logical necessity. The size of these indices does not only depend on the surface colors and illuminations used in the respective studies; it also depends on the researcher's and the observers' prior assumptions about the surfaces and illuminations that define a target color or target region within the metamer mismatch volumes. 
At the same time, observations of high color constancy indices (Gegenfurtner et al., 2015; Hansen et al., 2007; Olkkonen et al., 2009; Olkkonen et al., 2010) allow for formulating new questions in the light of metamer mismatching. In particular, under certain conditions adaptation and local color contrast are very much in line with the observers' performance of color identification across illuminations (e.g., Hansen et al., 2007; Kraft & Brainard, 1999). Under illumination changes, adaptation and local contrast produce a fixed shift in color perception to a particular point in the metamer mismatch volume. The question arises of why the shift to a particular color produces high color constancy indices in those conditions, given that all other colors within the metamer mismatch volume provide a theoretically possible solution. 
The relevance of metamer mismatching to color constancy also suggests an alternative to the idea that trichromatic color vision evolved to discriminate reddish from greenish hues (Regan et al., 2001). Adaptation strongly depends on the cone-opponency of the visual system (Krauskopf & Gegenfurtner, 1992; Krauskopf, Williams, & Heeley, 1982). It may be that the visual system is shaped so that adaptation allows for color constancy under the particular conditions of the natural visual environment. In this case, the shift in color vision due to color adaptation would be a functional adaptation to the spectral properties of surfaces and illuminants in the natural visual environment. Consequently, human color vision could be the result of optimizing adaptation with respect to color constancy in the natural environment (Maloney & Wandell, 1986; Shepard, 2001). 
Finally, we observed that the uncertainty of cone ratio predictions was related to metamer mismatch volumes. Although this relationship did not explain the role of metamer mismatch volumes for observers' matches, it would be important to understand the relationship between cone ratios and metamer mismatch volumes in order to achieve a more complete understanding of the role of metamer mismatch volumes in color constancy. 
Color categorization and perception
Previous studies have observed a relationship between the consistency of linguistic color categories across observers and across illuminations. Colors at the centers of the categories are categorized more consistently than others (Olkkonen et al., 2009; Olkkonen et al., 2010; Witzel, Flack et al., 2013). Evidence from sensory singularities suggested that the reflected signal is more reliable and predictable across illumination changes for category prototypes than for other colors (Philipona & O'Regan, 2006; Vazquez-Corral et al., 2012; Witzel & O'Regan, 2014). If observers use the higher predictability due to sensory singularities to achieve color constancy, colors corresponding to category prototypes should be particularly constant across illumination changes. This would also explain why colors close to the prototypes at the center of the categories are categorized with highest consistency. If prototypes are perceived as particularly constant across illuminations, they could act as perceptual anchors that provide stable points of reference for color categorization. This particular perceptual property would qualify category prototypes as focal colors (Witzel et al., 2015). 
Our test of color constancy peaks around prototypes depended on assumptions about the typicality of the hues of our stimulus colors. These assumptions were based on prototype measurements of previous studies (Olkkonen et al., 2010; Witzel, Flack et al., 2013). However, we needed to reduce the saturation of these hues to fit them into the monitor gamut for the purpose of our adjustment task. This might have affected their typicality. Nevertheless, the fact that our selection of typical hues yielded higher consistencies than boundary hues for green, blue, and to some extent for yellow (Figure 9a–c) indicates that those colors are at least closer to the category centers than our boundary colors. This, however, seems not to be the case for the red category, which is comparatively small and strongly depends on saturation (Olkkonen et al., 2010; Witzel & Gegenfurtner, 2013, 2016). For this reason, we consider the negative results concerning the color constancy peaks at category prototypes as preliminary evidence only. 
Instead, the main results concern the correlation between the measures of color constancy, the singularity indices, and the strength of category membership, as assessed by category consensus and category consistency. However, there was neither a relationship between categories and color constancy nor a relationship between sensory singularities and color constancy. 
One reason for the absence of correlations between color categories, sensory singularities, and color matches may be the small size of our stimulus sample (only n = 12 and n = 24). Another reason may be the fact that the Munsell chips used in previous studies were more saturated around prototypical colors (Collier, 1973). Higher saturation may produce higher constancy, while categorization is highest around the centers (Witzel et al., 2015; Witzel & Franklin, 2014). However, the variation of saturation in previous studies is a particularity of the stimulus sample used in those studies, not of human color perception (Witzel & Franklin, 2014; Witzel, Maule, & Franklin, under revision). This particularity of the stimulus sample may have produced spurious correlations between categorization, constancy, and sensory singularities (Witzel et al., 2015; Witzel & Franklin, 2014). 
Arguably, small correlations between sensory singularities and color constancy might be revealed with higher statistical power by using a larger set of stimulus colors. Nevertheless, to the extent that our findings can be believed, they suggest that sensory singularities are largely irrelevant for color constancy, and they contradict the idea that the distribution of categories across color space is organized around focal colors that act as perceptual anchors due to particularly high color constancy (Witzel et al., 2015; Witzel & Franklin, 2014). 
These findings contribute to the body of research that has attempted to establish a relationship between color perception and color categorization. These studies have shown that color categories do not coincide with low-level sensory mechanisms of color vision (Malkoc, Kay, & Webster, 2005; Witzel & Gegenfurtner, 2013). Color categories are related neither to hue discrimination (Witzel & Gegenfurtner, 2011, 2013, 2014, 2015, 2016) nor to the perception of saturation (Witzel & Franklin, 2014; Witzel et al., under revision). Evidence for a certain cross-linguistic stability of categories supported a relationship between color perception and categorization (Kay & Regier, 2003; Lindsey, Brown, Brainard, & Apicella, 2015; Regier, Kay, & Cook, 2005; Regier, Kay, & Khetarpal, 2007). However, recent studies showed that this stability could be explained by the lack of control of saturation in the stimulus sample (Witzel et al., 2015; Witzel, 2016). Moreover, the sensory singularities of category prototypes disappear when using a set of more uniformly saturated Munsell chips that control for spurious effects of saturation (Witzel et al., 2015). Now, the present results undermine the idea that sensory singularities and color categories are related to color constancy. In this way, they cast further doubt on the idea that color categories are directly related to color perception. 
Nevertheless, there might still be a relationship between categorization across observers (category consensus) and across illuminations (category constancy), as observed previously (Olkkonen et al., 2009; Olkkonen et al., 2010; Witzel, Flack et al., 2013). Our results confirmed such a relationship, even though it was far weaker than that observed in those previous studies. In any case, our results from the matching task suggest that this relationship is not due to a particular stability of color perception for category prototypes. Instead, it may rather be the result of linguistic categorization. Colors at category boundaries are categorized less consistently because category membership in general is fuzzy close to the category boundaries. The fuzziness of category boundaries may be the reason why consistencies in color naming vary in a very similar way across observers and across illuminations. 
Conclusion
In the present study, we tested different high-level factors (color categories) and sensory factors (metamer mismatching, sensory singularities, and cone ratios) that are likely to affect performance in color constancy beyond what is predicted by adaptation. Results showed that a considerable degree of uncertainty (about 50%) in judging colors across illuminations is explained by the size of metamer mismatch volumes. In contrast, sensory singularities, color categories, and cone ratios are not related to the variation of color estimations across illuminations. It seems then that the relationship between color naming and color constancy cannot be due to a particular level of color constancy for category prototypes and may rather be the result of linguistic categorization. In the context of previous studies, these results raise further doubts that category prototypes have special features that qualify them as focal colors. 
Most importantly, the strong relationship between color constancy and metamer mismatch volumes highlights the importance of the uncertainty of sensory information in color constancy. In particular, relatively high-level judgements of color appearance under changes of illumination are considerably shaped by the uncertainty of the sensory signal. These findings show that observers know, probably from experience, how uncertain different sensory signals are when illumination changes. The implication of these findings would then be that color constancy is not a question of a rigid inbuilt brain mechanism, but a learnt adaptation to environmental dynamics (Philipona & O'Regan, 2006; Purves, Lotto, Williams, Nundy, & Yang, 2001; Skorupski & Chittka, 2011; Witzel et al., 2015; Witzel, Sanchez-Walker, et al., 2013). 
More generally, the present findings demonstrate the importance of metamer mismatching for color constancy, and hence suggest a paradigmatic change in the study of color constancy. Up to now, researchers have used well-defined target colors, based on reference surfaces chosen by the researchers. In the light of the current findings this approach is no longer viable. Now, metamer mismatching needs to be taken into account when (a) describing the phenomenon of color constancy, (b) investigating color constancy, and (c) more generally when considering color as a perceptual attribute of objects and surfaces. 
Acknowledgments
This research was supported by ERC Advanced Grant “FEEL” number 323674 to J. Kevin O'Regan. 
Commercial relationships: none. 
Corresponding author: Christoph Witzel. 
Email: cwitzel@daad-alumni.de. 
Address: Justus-Liebig-Universität, Gießen, Germany. 
References
Allen, E. C., Beilock, S. L., & Shevell, S. K. (2011). Working memory is related to perceptual processing: A case from color perception. Journal of Experimental Psychology: Learning, Memory, & Cognition, 37 (4), 1014–1021, doi:10.1037/a0023257.
Allen, E. C., Beilock, S. L., & Shevell, S. K. (2012). Individual differences in simultaneous color constancy are related to working memory. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 29 (2), A52–59, doi:10.1364/JOSAA.29.000A52.
Arend, L. E., & Reeves, A. (1986). Simultaneous color constancy. Journal of the Optical Society of America A, 3 (10), 1743–1751.
Belt, A. F. (2008). The elements of photography: Understanding and creating sophisticated images. Oxford, England: Elsevier.
Berns, R. S. (2016). Color science and the visual arts: A guide for conservators, curators, and the curious. Los Angeles, CA: Getty Conservation Institute.
Castellarin, I. (2000). Le proprietà statistiche della riflettanza in un campione di superfici naturali. (Unpublished thesis). University of Trieste, Trieste, Italy.
CNRS. (2015). Relais d'Information sur les Sciences de la Cognition (RISC). Centre National de la Recherche Scientifique (CNRS). Retrieved from http://www.risc.cnrs.fr/
Collier, G. A. (1973). Review of “Basic color terms: Their universality and evolution.” Language, 49 (1), 245–248.
Fairchild, M. D., & Johnson, G. M. (2004). METACOW: A public-domain, high-resolution, fully-digital, noise-free, metameric, extended-dynamic-range, spectral test target for imaging system analysis and simulation. In C. F. I. S. Munsell Color Science Laboratory (Ed.). Rochester, NY: Rochester Institute of Technology.
Fisher, R. A. (1915). Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika, 10 (4), 507–521.
Foster, D. H. (2003). Does colour constancy exist? Trends in Cognitive Sciences, 7 (10), 439–443, doi: S1364661303001980 [pii].
Foster, D. H. (2011). Color constancy. Vision Research, doi: S0042-6989(10)00440-2 [pii] 10.1016/j.visres.2010.09.006.
Foster, D. H., Amano, K., Nascimento, S. M., & Foster, M. J. (2006). Frequency of metamerism in natural scenes. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 23 (10), 2359–2372.
Foster, D. H., & Nascimento, S. M. (1994). Relational colour constancy from invariant cone-excitation ratios. Proceedings of the Royal Society B: Biological Sciences, 257 (1349), 115–121, doi:10.1098/rspb.1994.0103.
Gegenfurtner, K. R., Bloj, M. G., & Weiß, D. (2015). Real color constancy. Paper presented at the European Conference of Visual Perception, Liverpool, England.
Granzier, J. J. M., & Gegenfurtner, K. R. (2012). Effects of memory colour on colour constancy for unknown coloured objects. i-Perception, 3 (3), 190–215, doi:10.1068/i0461.
Hansen, T., Olkkonen, M., Walter, S., & Gegenfurtner, K. R. (2006). Memory modulates color appearance. Nature Neuroscience, 9 (11), 1367–1368, doi: nn1794 [pii] 10.1038/nn1794.
Hansen, T., Walter, S., & Gegenfurtner, K. R. (2007). Effects of spatial and temporal context on color categories and color constancy. Journal of Vision, 7 (4): 2, 1–15, doi:10.1167/7.4.2.
Hurlbert, A. C. (2007). Colour constancy. Current Biology, 17 (21), R906–907, doi: S0960-9822(07)01839-8 [pii] 10.1016/j.cub.2007.08.022.
Ishihara, S. (2004). Ishihara's tests for colour deficiency. Tokyo, Japan: Kanehara Trading Inc.
Jaaskelainen, T., Parkkinen, J., & Toyooka, S. (1990). Vector-subspace model for color representation. Journal of the Optical Society of America A, 7 (4), 725–730, doi:10.1364/JOSAA.7.000725.
Judd, D. B., Macadam, D. L., Wyszecki, G., Budde, H. W., Condit, H. R., Henderson, S. T., & Simonds, J. L. (1964). Spectral distribution of typical daylight as a function of correlated color temperature. Journal of the Optical Society of America, 54 (8), 1031–1036.
Kanematsu, E., & Brainard, D. H. (2013). No measured effect of a familiar contextual object on color constancy. Color Research & Application, 39 (4), 347–359, doi:10.1002/col.21805.
Kay, P., & Regier, T. (2003). Resolving the question of color naming universals. Proceedings of the National Academy of Sciences, USA, 100 (15), 9085–9089.
Koenderink, J. J. (2010). The prior statistics of object colors. Journal of the Optical Society of America A, 27 (2), 206–217.
Kohonen, O., Parkkinen, J., & Jaaskelainen, T. (2006). Databases for spectral color science. Color Research and Application, 31 (5), 381–390, doi:10.1002/col.20244.
Kraft, J. M., & Brainard, D. H. (1999). Mechanisms of color constancy under nearly natural viewing. Proceedings of the National Academy of Sciences, USA, 96 (1), 307–312.
Krauskopf, J., & Gegenfurtner, K. R. (1992). Color discrimination and adaptation. Vision Research, 32 (11), 2165–2175.
Krauskopf, J., Williams, D. R., & Heeley, D. W. (1982). Cardinal directions of color space. Vision Research, 22 (9), 1123–1131.
Lindsey, D. T., Brown, A. M., Brainard, D. H., & Apicella, C. L. (2015). Hunter-gatherer color naming provides new insight into the evolution of color terms. Current Biology, 25 (18), 2441–2446, doi:10.1016/j.cub.2015.08.006.
Logvinenko, A. D., & Demidenko, E. (2016). On counting metamers. IEEE Transactions on Image Processing, 25 (2), 770–775, doi:10.1109/TIP.2015.2504900.
Logvinenko, A. D., Funt, B., & Godau, C. (2014). Metamer mismatching. IEEE Transactions on Image Processing, 23 (1), 34–43, doi:10.1109/TIP.2013.2283148.
Logvinenko, A. D., Funt, B., Mirzaei, H., & Tokunaga, R. (2015). Rethinking colour constancy. PLoS One, 10 (9), e0135029, doi:10.1371/journal.pone.0135029.
Malkoc, G., Kay, P., & Webster, M. A. (2005). Variations in normal color vision. IV. Binary hues and hue scaling. Journal of the Optical Society of America A, 22 (10), 2154–2168.
Maloney, L. T., & Wandell, B. A. (1986). Color constancy: A method for recovering surface spectral reflectance. Journal of the Optical Society of America A, 3 (1), 29–33.
Nascimento, S. M., de Almeida, V. M., Fiadeiro, P. T., & Foster, D. H. (2004). Minimum-variance cone-excitation ratios and the limits of relational color constancy. Visual Neuroscience, 21 (3), 337–340.
Nascimento, S. M., & Foster, D. H. (1997). Detecting natural changes of cone-excitation ratios in simple and complex coloured images. Proceedings of the Royal Society B: Biological Sciences, 264 (1386), 1395–1402, doi:10.1098/rspb.1997.0194.
Olkkonen, M., & Allred, S. R. (2014). Short-term memory affects color perception in context. PLoS One, 9 (1), e86488, doi:10.1371/journal.pone.0086488.
Olkkonen, M., Hansen, T., & Gegenfurtner, K. R. (2009). Categorical color constancy for simulated surfaces. Journal of Vision, 9 (12): 6, 1–18, doi:10.1167/9.12.6.
Olkkonen, M., Witzel, C., Hansen, T., & Gegenfurtner, K. R. (2010). Categorical color constancy for real surfaces. Journal of Vision, 10 (9): 16, 1–22, doi:10.1167/10.9.16.
Parkkinen, J. P. S., Hallikainen, J., & Jaaskelainen, T. (1989). Characteristic spectra of Munsell colors. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 6 (2), 318–322, doi:10.1364/JOSAA.6.000318.
Philipona, D. L., & O'Regan, J. K. (2006). Color naming, unique hues, and hue cancellation predicted from singularities in reflection properties. Visual Neuroscience, 23 (3-4), 331–339, doi: S0952523806233182 [pii] 10.1017/S0952523806233182.
Purves, D., Lotto, R. B., Williams, S. M., Nundy, S., & Yang, Z. (2001). Why we see things the way we do: Evidence for a wholly empirical strategy of vision. Philosophical Transactions of the Royal Society B: Biological Sciences, 356 (1407), 285–297, doi:10.1098/rstb.2000.0772.
Radonjic, A., & Brainard, D. H. (2016). The nature of instructional effects in color constancy. Journal of Experimental Psychology: Human Perception & Performance, 42 (6), 847–865, doi:10.1037/xhp0000184.
Regan, B. C., Julliot, C., Simmen, B., Vienot, F., Charles-Dominique, P., & Mollon, J. D. (2001). Fruits, foliage and the evolution of primate colour vision. Philosophical Transactions of the Royal Society B: Biological Sciences, 356 (1407), 229.
Regier, T., Kay, P., & Cook, R. S. (2005). Focal colors are universal after all. Proceedings of the National Academy of Sciences, USA, 102 (23), 8386–8391, doi: 0503281102 [pii] 10.1073/pnas.0503281102.
Regier, T., Kay, P., & Khetarpal, N. (2007). Color naming reflects optimal partitions of color space. Proceedings of the National Academy of Sciences, USA, 104 (4), 1436–1441, doi: 0610341104 [pii] 10.1073/pnas.0610341104.
Rogers, M., Witzel, C., Rhodes, P., & Franklin, A. (2016). The maturity of colour constancy and colour term knowledge are positively related in early childhood. Paper presented at Progress in Colour Studies, London.
Samadzadegan, S., & Urban, P. (2013). Spatially resolved joint spectral gamut mapping and separation. Color and Imaging Conference, 2013 (1), 2–7.
Schanda, J. (2007). Color rendering of light sources. In J. Schanda (Ed.), Colorimetry (pp. 207–217). New York: John Wiley & Sons, Inc.
Shepard, R. N. (2001). Perceptual-cognitive universals as reflections of the world. Behavioral and Brain Sciences, 24 (4), 581–601; discussion 652–571.
Skorupski, P., & Chittka, L. (2011). Is colour cognitive? Optics & Laser Technology, 43 (2), 251–260.
Smith, W., & Adnin, M. M. (2015). Mayang's Free Texture Library. Retrieved from http://www.mayang.com/textures/
Smithson, H. E. (2005). Sensory, computational and cognitive components of human colour constancy. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 360 (1458), 1329–1346, doi: PX26MA7W586VQ2A7 [pii] 10.1098/rstb.2005.1633.
Van Alphen, C., Witzel, C., Godau, C., & O'Regan, J. K. (2015). Colour constancy predicted by metameric mismatch volumes. Paper presented at the ECVP 2015, Liverpool, England.
Vazquez-Corral, J., O'Regan, J. K., Vanrell, M., & Finlayson, G. D. (2012). A new spectrally sharpened sensor basis to predict color naming, unique hues, and hue cancellation. Journal of Vision, 12 (6): 7, 1–14, doi:10.1167/12.6.7.
Viénot, F., Coron, G., & Lavédrine, B. (2011). LEDs as a tool to enhance faded colours of museums artefacts. Journal of Cultural Heritage, 12 (4), 431–440, doi:10.1016/j.culher.2011.03.007.
Westland, S., Owens, H. C., & Shaw, A. J. (2000). Colour statistics of natural and man-made surfaces. Sensor Review, 20, 50–55.
Witzel, C. (2016). New insights into the evolution of color terms or an effect of saturation? i-Perception, 7 (5), 1–4, doi:10.1177/2041669516662040.
Witzel, C., Cinotti, F., & O'Regan, J. K. (2015). What determines the relationship between color naming, unique hues, and sensory singularities: Illuminations, surfaces, or photoreceptors? Journal of Vision, 15 (8): 19, 1–32, doi:10.1167/15.8.19.
Witzel, C., Flack, Z., & Franklin, A. (2013). Categorical colour constancy during colour term acquisition. Paper presented at the AIC 2013 12th international AIC Congress, Newcastle upon Tyne, UK.
Witzel, C., & Franklin, A. (2014). Do focal colors look particularly “colorful”? Journal of the Optical Society of America A, Optics, Image Science, and Vision, 31 (4), A365–374, doi:10.1364/JOSAA.31.00A365.
Witzel, C., & Gegenfurtner, K. R. (2011). Is there a lateralized category effect for color? Journal of Vision, 11 (12): 16, 1–25, doi:10.1167/11.12.16.
Witzel, C., & Gegenfurtner, K. R. (2013). Categorical sensitivity to color differences. Journal of Vision, 13 (7): 1, 1–33, doi:10.1167/13.7.1.
Witzel, C., & Gegenfurtner, K. R. (2014). Category effects on colour discrimination. In W. Anderson, C. P. Biggam, C. A. Hough, & C. J. Kay (Eds.), Colour studies: A broad spectrum (pp. 200–211). Amsterdam, the Netherlands: John Benjamins Publishing Company.
Witzel, C., & Gegenfurtner, K. R. (2015). Categorical facilitation with equally discriminable colors. Journal of Vision, 15 (8): 22, 1–33, doi:10.1167/15.8.22.
Witzel, C., & Gegenfurtner, K. R. (2016). Categorical perception for red and brown. Journal of Experimental Psychology: Human Perception & Performance, 42 (4), 540–570, doi:10.1037/xhp0000154.
Witzel, C., Maule, J., & Franklin, A. (under revision). Are red, yellow, green, and blue particularly “colorful”?
Witzel, C., & O'Regan, J. K. (2014). Color appearance and color language depend on sensory singularities in the natural environment. Perception, 43 (ECVP Abstract Supplement), 67.
Witzel, C., Sanchez-Walker, E., & Franklin, A. (2013). The development of categorical colour constancy. Perception, 42 (ECVP Abstract Supplement), 19.
Witzel, C., Valkova, H., Hansen, T., & Gegenfurtner, K. R. (2011). Object knowledge modulates colour appearance. i-Perception, 2 (1), 13–49, doi:10.1068/i0396.
Wyszecki, G., & Stiles, W. S. (1982). Color science: Concepts and methods, quantitative data and formulae (2nd ed.). New York: John Wiley & Sons.
Zaidi, Q., & Bostic, M. (2008). Color strategies for object identification. Vision Research, 48 (26), 2673–2681, doi: S0042-6989(08)00325-8 [pii] 10.1016/j.visres.2008.06.026.
Zhang, X., Funt, B., & Mirzaei, H. (2015). Metamer mismatching and its consequences for predicting how colours are affected by the illuminant. Paper presented at the 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).
Zhang, X., Funt, B., & Mirzaei, H. (2016). Metamer mismatching in practice versus theory. Journal of the Optical Society of America A, 33 (3), A238–A247, doi:10.1364/JOSAA.33.00A238.
Figure 1
 
Metamer mismatching. Panel a shows the reflectances of two example surfaces that are metameric under illuminant 1. Metamerism means that these reflectances produce exactly the same color under illuminant 1 (Panel b). However, they result in different colors under illuminant 2 (Panel c); this is called metamer mismatching. For illustrations of the effects of metamer mismatching in real life and practical contexts see the section “The uncertainty of surface colors” of the Discussion.
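The logic of Figure 1 can be sketched numerically. The following is a minimal illustration, not the authors' computation: the Gaussian sensor sensitivities, the linear-ramp illuminants, and the flat base reflectance are hypothetical stand-ins (they are neither cone fundamentals nor the illuminants of the experiment). A second reflectance is built by adding a component that the sensors cannot see under illuminant 1, so the two surfaces are metameric under illuminant 1 but differ under illuminant 2.

```python
import numpy as np

wl = np.linspace(400, 700, 31)  # wavelength samples in nm

def gauss(mu, sigma):
    """Gaussian curve over the wavelength samples."""
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Hypothetical sensor sensitivities (NOT real cone fundamentals).
sensors = np.stack([gauss(570, 40), gauss(540, 40), gauss(445, 25)])

# Hypothetical illuminants: simple spectral ramps.
illum1 = np.linspace(0.6, 1.4, wl.size)  # more long-wavelength light
illum2 = np.linspace(1.4, 0.6, wl.size)  # more short-wavelength light

def signal(reflectance, illum):
    """Sensor responses to a surface under an illuminant."""
    return sensors @ (illum * reflectance)

# Rows of vt beyond the rank (3) span the null space of sensors * illum1:
# adding such a vector to a reflectance leaves the signal under illum1 unchanged.
_, _, vt = np.linalg.svd(sensors * illum1)
null_basis = vt[3:]

# Pick the null-space direction that the first sensor sees best under illum2.
black = null_basis.T @ (null_basis @ (sensors[0] * illum2))
black /= np.abs(black).max()

refl_a = np.full(wl.size, 0.5)   # flat gray surface
refl_b = refl_a + 0.3 * black    # stays within [0.2, 0.8]

print("illuminant 1:", signal(refl_a, illum1), signal(refl_b, illum1))
print("illuminant 2:", signal(refl_a, illum2), signal(refl_b, illum2))
```

Under these assumptions the two printed signals agree for illuminant 1 (up to floating-point error) and differ for illuminant 2. The set of all signals that such metamers can produce under illuminant 2 is what a metamer mismatch volume describes.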
Figure 2
 
Illustration of metamer mismatch volumes. Two examples of metamer mismatch volumes are shown in CIELUV space. Metamer mismatch volumes are projected on the u*v* plane (metamer mismatch area) since lightness was fixed in the experiment. Panel a shows the metamer mismatch area of a yellowish green surface (Munsell chip 7.5Y4/6); Panel b shows that of a purplish red surface (7.5PB5/8). The illuminant change corresponds to the change from the yellowish (D50) to the bluish (CCT of 12000K) illuminant used in the experiment. The black disk in each panel indicates the sensory signal under the original yellowish illumination. The red and the green disks correspond to the sensory signal under the blue illuminant of a Munsell chip (red) and of a prototypical natural surface (green) that are metameric under the yellow illuminant (black disk). Small white disks show average adjustments of observers; the gray line represents the first principal component of those adjustments, and the percentage in the upper right corner indicates the variance explained by the first principal component. The graphics for the other colors and conditions may be found in Figure S1 and Figure S2 of the Supplementary Material.
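The variance explained by the first principal component, as reported in the corner of each panel, can be computed along the following lines. This is an illustrative sketch only: the function name `first_pc` and the simulated adjustment coordinates are hypothetical and are not taken from the study's data or analysis code.

```python
import numpy as np

def first_pc(points):
    """Return the first principal direction of 2-D points and the
    proportion of variance it explains."""
    cov = np.cov(points, rowvar=False)       # 2 x 2 covariance of (u*, v*)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    direction = eigvecs[:, -1]               # eigenvector of the largest eigenvalue
    explained = eigvals[-1] / eigvals.sum()
    return direction, explained

# Hypothetical observer adjustments in the u*v* plane (simulated, not real data):
# the scatter is elongated along one axis, as would be expected if uncertainty
# is larger in one chromatic direction.
rng = np.random.default_rng(1)
adjustments = rng.normal(size=(30, 2)) * np.array([8.0, 2.0]) + np.array([10.0, -20.0])

direction, explained = first_pc(adjustments)
print(f"first PC direction: {direction}, variance explained: {explained:.0%}")
```

For two-dimensional data the proportion lies between 50% (isotropic scatter) and 100% (all adjustments on one line), so values close to 100% indicate that the adjustments vary mainly along a single direction.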
Figure 3
 
Stimulus display. Participants had to adjust the patch with the black dot so that it matches the corresponding patch in the scene with the other illumination. The starting color for the adjustment was random.
Figure 4
 
Types of scenes. Images were taken from Mayang's Free Textures Library (Smith & Adnin, 2015).
Figure 5
 
Illuminations. Panel a shows the spectral power distribution of the two illuminants. Panel b illustrates the location of the chromaticities of the two illuminants (yellow and blue disks) relative to the daylight locus (thick line). Note that the chromaticities of the illuminants were very close to the daylight locus.
Figure 6
 
Principal components of the natural reflectances measured by Westland et al. (2000). The three curves show the first (blue), second (green), and third (red) principal components. The percentage (98%) reports the variance of the natural reflectances explained by the three principal components.
Figure 7
 
Illustration of a 5-transition reflectance. This kind of reflectance defines the hull (“outer skin”) of the metamer mismatch volumes.
Figure 8
 
Correlations between metamer mismatch areas and color constancy. The first row (Panels a–c) shows simple correlations, and the second row (Panels d–f) shows partial correlations controlling for the performance in the control condition. In all panels of the first row, the x-axis shows the size of metamer mismatch areas. The y-axis indicates Euclidean distances in CIELUV space. In Panel a, these distances correspond to the differences from the color predicted by simulated natural surfaces; in Panel b they refer to the differences of individual adjustments from the average across individuals and in Panel c to the differences between the two repeated measurements. In the second row, the x-axis and the y-axis show the residuals from the regression that accounts for the performance in the control condition. Each symbol corresponds to one of the 12 colors under one of two illumination changes. Disks refer to illumination changes from yellow to blue, diamonds from blue to yellow. The correlations between the distances along the y-axes and the metamer mismatch volumes (x-axes) are given in the upper right corner of each panel (***p < 0.001) and are illustrated by the regression line (in gray). Note that results are practically the same when controlling for variance of adjustments without illumination change (Panels d–f), and are highly similar when doing these calculations in CIELAB instead of CIELUV space (Figure S4 of the Supplementary Material).
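The partial correlations in the second row correlate the residuals that remain after regressing each variable on the control-condition performance. A minimal sketch of that computation is given below. The helper names (`residuals`, `partial_corr`) and the simulated values are hypothetical; only the sample size of 24 (12 colors under two illumination changes) is taken from the caption.

```python
import numpy as np

def residuals(v, control):
    """Residuals of v after a linear regression on control (with intercept)."""
    X = np.column_stack([np.ones_like(control), control])
    beta, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ beta

def partial_corr(x, y, control):
    """Correlation between x and y after removing the linear effect of control."""
    return np.corrcoef(residuals(x, control), residuals(y, control))[0, 1]

# Simulated values for illustration only (not the study's data).
rng = np.random.default_rng(0)
n = 24                                           # 12 colors x 2 illumination changes
control = rng.normal(size=n)                     # performance in the control condition
area = rng.normal(size=n) + 0.5 * control        # metamer mismatch area
distance = 0.8 * area + 0.5 * control + rng.normal(scale=0.5, size=n)

r_simple = np.corrcoef(area, distance)[0, 1]
r_partial = partial_corr(area, distance, control)
print(f"simple r = {r_simple:.2f}, partial r = {r_partial:.2f}")
```

If the simple and the partial correlation are similar, as the caption reports for Panels a–c versus d–f, the relationship cannot be attributed to variation that is already present without an illumination change.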
Figure 9
 
Color categories and constancy. In all panels the x-axis lists the 12 stimuli, and the groups of bars refer to the four color categories (red, yellow, green, and blue). The center bars in each group correspond to the typical colors, the other bars to the boundary colors. Panels a, b, and c show category consensus (consistency of categorization across observers), intra-individual category consistency (consistency of categorization across repeated measurements), and category constancy (consistency of categorization across illuminations) in the supplementary naming task. Panel d illustrates perceptual color constancy performance in the asymmetric matching task by showing the deviation from the natural surface prediction (for other measures of color constancy see Figures S9, S10). In all panels, error bars show standard errors of the mean. Horizontal lines and symbols indicate significance of t tests across participants, which compared performance for the typical color with the average performance of the two other colors (*p < 0.05; ***p < 0.001; ns = nonsignificant). The higher the bars in Panels a, b, and c, the higher category consistency, and the lower the bars in Panel d, the higher color constancy (categorical pattern). Note that category consistencies (Panels a–c), but not perceptual constancy (Panel d), showed categorical patterns.
Figure 10
 
Predictions of target colors. Bars indicate the average Euclidean distance in CIELUV from a target color. “Same” refers to adjustments under the same illumination (control condition), where the target color is unambiguously defined. The other bars correspond with adjustments under different illuminations (constancy condition). “Ave” refers to the grand average of adjustments; for “Nat” and “Mun” the assumed target color is defined by the prediction based on the natural reflectances and the reflectances of Munsell chips, respectively. CR indicates average deviations from predictions through cone ratios. N reports the total number of colors (12 × 2 conditions). Error bars show standard errors of the mean across stimuli and symbols indicate significance of paired t tests across stimuli (***p < 0.001; ns = nonsignificant). Figure S11 in the Supplementary Material shows the same data, but calculated in CIELAB space. Note that deviations were similar for the predictions based on Munsell chips and for natural reflectances, and they were highest for predictions based on cone ratios.
Table 1
 
CIE1931 chromaticity coordinates of simulated Munsell chips under target illumination. Notes: WP = white point; BG = background.