**Abstract**
Color constancy is usually measured by achromatic setting, asymmetric matching, or color naming paradigms, whose results are interpreted in terms of indices and models that arguably do not capture the full complexity of the phenomenon. Here we propose a new paradigm, *chromatic setting*, which allows a more comprehensive characterization of color constancy through the measurement of multiple points in color space under immersive adaptation. We demonstrated its feasibility by assessing the consistency of subjects' responses over time. The paradigm was applied to two-dimensional (2-D) Mondrian stimuli under three different illuminants, and the results were used to fit a set of linear color constancy models. The use of multiple colors improved the precision of more complex linear models compared to the popular diagonal model computed from gray. Our results show that a diagonal plus translation matrix that models mechanisms other than cone gain might be best suited to explain the phenomenon. Additionally, we calculated a number of color constancy indices for several points in color space, and our results suggest that interrelations among colors are not as uniform as previously believed. To account for this variability, we developed a new structural color constancy index that takes into account the magnitude and orientation of the chromatic shift in addition to the interrelations among colors and memory effects.

*Bounding Cylinder*, represented by a red circle). These were gray, green, blue, purple, pink, red, brown, orange, and yellow (Berlin & Kay, 1991). The squares within the red circle in Figure 1 symbolize the colors selected during this first step, which we called the *reference session*. We termed these colors *Selected Representatives* (SRs). In the second step, which we called the *regular session*, the same subjects were asked to reproduce these SRs under different conditions of background and illumination. The squares outside the red circle in Figure 1 correspond to these colors, and the arrow represents the change in adaptation state. Since the new paradigm can be seen as an extension of the achromatic setting paradigm to multiple colors, we named it *Chromatic Setting*.

*ColorCal* colorimeter (Konica Minolta, Tokyo, Japan) and CRS software. We used the COLORLAB (Malo & Luque, 2002) toolbox for the required color space conversions. Subjects modified the test stimuli by navigating the CIELab color space using six different buttons on a commercial gamepad, two for each color space dimension. The reference white point was D65, luminance = 100 cd/m².

cd/m² were: 11.25, 14.54, 18.42, 22.93, 28.12, 34.05 and 40.75. Its mean was 22.66 cd/m².

cd/m², mean = 25.11 cd/m².

cd/m², mean = 24.35 cd/m².

*D65*, *greenish* and *yellowish*), whose CIE *xy* chromaticities are shown in Table 1. The luminance range in cd/m² for the illuminated stimuli was between 11.25 and 40.74 for the D65 illuminant, between 11.24 and 40.73 for the greenish illuminant, and between 11.20 and 40.56 for the yellowish illuminant. The mean values in cd/m² were 24.04, 23.7, and 24.37, respectively.

| Illuminant | x | y |
| --- | --- | --- |
| D65 | 0.312 | 0.329 |
| Greenish | 0.296 | 0.453 |
| Yellowish | 0.453 | 0.434 |

*reference*, *regular*, and *repeatability tests*. Figure 2 shows the time sequence of the experiment. First there was a *training* period followed by the *reference session*, after which the main body of the experiment started. It consisted of nine *regular sessions* and three interleaved repeatability tests (occurring at the beginning, halfway, and at the end of the regular sessions) whose aim was to track variations in subjects' responses. Subjects completed all experiments in less than three weeks, and no more than two sessions per day were allowed. Details of the different sessions were as follows:

cd/m²) followed by 180 s of adaptation to a Mondrian under the same simulated illumination to be used later in the session. After that, subjects were prompted auditorily and visually (by a word written in black at the bottom of the screen) with the color category requested, and they manipulated the gamepad to either select or reproduce the colors according to their instructions. Each trial ended by pressing a “next trial” button on the gamepad, which was followed by re-adaptation to a geometrically randomized version of the original Mondrian and illuminant for 10 s before proceeding to the next trial. There were 44 trials: In the first four, subjects were asked to produce “gray,” and in the following 40, they were asked to produce the other eight colors five times each in random order. Test patches occurred simultaneously at multiple random locations in the Mondrian and were adjusted by the observer with no time constraints. They were spatially distributed in a random manner in every trial with the aim of forcing subjects to average test locations, thus reducing local chromatic induction effects (Otazu, Parraga, & Vanrell, 2010; Shevell & Wei, 2000). The number of test patches was determined according to the following constraints: (a) the total area occupied by the test patches was between 4% and 7% of the display and (b) the pixel average chromaticity of the screen prior to illumination was equal to D65. This resulted in different numbers and sizes of test patches for Type 0 backgrounds (where the pixel average was already neutral) and for Type I and II backgrounds. As a consequence, the number of test patches followed a normal distribution around 25 (2.4 *SD*) for the Type 0 backgrounds and 4.1 (0.75 *SD*) for the Type I and II backgrounds.

*basic starting rule* (Brainard, 1998). In all other cases, the starting value of the test patches was randomly distributed around each subject's selected “gray.” To obtain a single measure of an SR color, we averaged the adjustments of its individual trials. Each trial lasted approximately 30 s and each session approximately 25 min.

*Tukey–Kramer* method (Hochberg & Tamhane, 1987), which returns a set of pairwise comparison results. For example, to assess the repeatability of subject XO's “red” settings we considered data from the first column (rows B, C, and D) in Figure 4. These consist of three groups of five points each in the three CIELab dimensions. We applied our tests to each dimension separately, obtaining the values of *F*(2, 12) = 2.25 with *p* = 0.15 for a*, *F*(2, 12) = 0.77 with *p* = 0.48 for b*, and *F*(2, 12) = 18.8 with *p* = 0.0002 for L*. After applying the Tukey–Kramer post-hoc comparison we obtained three sets (one for each dimension) of values showing whether the “red” measures in panels B, C, and D are significantly different from each other. To assess whether “red” was well remembered we computed, across all CIELab dimensions, the percentage of cases that were significantly different (in the example, observer XO could remember “red” in 78% of the cases). We repeated this procedure for all color categories, obtaining an average of 15% significantly different measures for all subjects. There were 17% significantly different measurements for red, green, and orange, and less than 16% significantly different measurements for the other colors. The mean distance among chromatic settings within the same category was 1.79 ΔE* for all observers and categories considered (see Table 2 below).

| Session | Red | Green | Blue | Yellow | Neutral | Purple | Pink | Orange | Brown | Mean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| First session (D65) | 2.04 | 1.82 | 1.64 | 1.54 | - | 1.39 | 2.29 | 1.96 | 1.66 | 1.79 |
| Second session (Greenish) | 4.74 | 3.82 | 3.26 | 4.72 | 4.89 | 4.21 | 4.86 | 3.33 | 4.01 | 4.21 |
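The per-dimension repeatability test described above (three groups of five settings, giving *F*(2, 12)) can be illustrated with a small sketch. This is only an illustration of the one-way ANOVA *F* statistic, not the authors' analysis code: the function name and the a* values are hypothetical, and the *p*-value and the Tukey–Kramer post-hoc step are left to a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the individual settings around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical a* settings for one color in three repeatability tests (5 trials each)
test_b = [52.1, 50.3, 51.7, 49.8, 50.9]
test_c = [51.0, 52.4, 50.2, 51.9, 50.6]
test_d = [49.5, 51.2, 50.8, 52.0, 50.1]
f_value, df1, df2 = one_way_anova_f([test_b, test_c, test_d])
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

With three groups of five settings the degrees of freedom are 2 and 12, matching the *F*(2, 12) values reported in the text; the same function would be applied separately to the L*, a*, and b* dimensions.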

*variability* (*δ*) within each group of five trials we computed the average CIELab ΔE* distance between each SR trial and the mean SR. As a white point for our calculations we used the corresponding chromaticity of each illuminant (see Table 1) at 100 cd/m². Since there were differences in the dispersion of data around the mean depending on each subject and color category, we summarized *δ* in Table 3, where each value corresponds to the average variability over illuminant-background combinations. The average *δ* value was 2.09 ΔE* (1 *SD*) for the reference sessions and 4.60 ΔE* (2.06 *SD*) for regular sessions. The difference between these values is likely to result from the Bounding Cylinder. According to our estimations, the precision of our method is consistent with that of achromatic setting studies (Brainard, 1998), where accuracies between 4 and 5 ΔE* are common.

| Subject | Red | Green | Blue | Yellow | Neutral | Purple | Pink | Orange | Brown | Mean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| JRV | 2.14 | 3.82 | 3.44 | 3.42 | 2.35 | 3.87 | 3.26 | 3.42 | 2.60 | 3.18 |
| CAP | 4.40 | 3.56 | 3.53 | 3.45 | 3.81 | 5.96 | 4.66 | 5.75 | 4.96 | 4.45 |
| MV | 3.51 | 4.31 | 5.60 | 5.10 | 2.91 | 6.62 | 4.66 | 5.25 | 4.34 | 4.70 |
| MS | 2.44 | 4.05 | 5.82 | 4.65 | 3.39 | 3.78 | 5.43 | 3.37 | 2.78 | 3.97 |
| XO | 2.98 | 3.87 | 3.33 | 3.28 | 3.00 | 4.38 | 3.27 | 4.02 | 3.22 | 3.48 |
| RB | 3.65 | 3.51 | 3.61 | 3.42 | 7.39 | 4.44 | 3.62 | 4.71 | 6.70 | 4.56 |
| LC | 5.94 | 6.55 | 5.06 | 7.61 | 4.71 | 5.76 | 5.87 | 4.85 | 7.11 | 5.94 |
| AB | 4.29 | 5.66 | 5.23 | 4.34 | 5.09 | 6.55 | 5.55 | 5.21 | 4.47 | 5.15 |
| RBV | 5.17 | 5.20 | 4.93 | 5.06 | 3.08 | 4.87 | 4.94 | 6.71 | 5.14 | 5.01 |
| JC | 5.54 | 4.45 | 5.67 | 4.94 | 4.81 | 6.63 | 6.19 | 6.92 | 4.79 | 5.55 |
| Mean | 4.03 | 4.50 | 4.62 | 4.53 | 4.05 | 5.29 | 4.74 | 5.02 | 4.61 | 4.60 |
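As a rough illustration of how a variability value *δ* of this kind can be obtained, the sketch below averages the Euclidean CIELab distance (taken here as the CIE 1976 ΔE*ab, which is an assumption about the exact ΔE* formula) between each trial of an SR and the mean of its trials. The function name and the trial values are hypothetical.

```python
import math


def variability_delta(trials):
    """Average CIELab distance between each (L*, a*, b*) trial and the mean of all trials."""
    n = len(trials)
    mean = [sum(t[d] for t in trials) / n for d in range(3)]
    return sum(math.dist(t, mean) for t in trials) / n


# Hypothetical group of five settings for one color under one condition
red_trials = [
    (45.0, 60.0, 40.0),
    (46.5, 58.0, 41.0),
    (44.0, 61.5, 38.5),
    (45.5, 59.0, 42.0),
    (44.5, 60.5, 39.5),
]
print(f"delta = {variability_delta(red_trials):.2f} dE*")
```

Averaging this quantity over illuminant-background combinations for each subject and category would give entries of the kind listed in Table 3.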

*δ* for each color category. Some color categories such as red (mean = 4.03, 1.25 *SD*) and gray (mean = 4.05, 1.49 *SD*) have on average a smaller *δ* value than others, e.g., purple (mean = 5.29, 1.15 *SD*) and orange (mean = 5.02, 1.22 *SD*). These tendencies are similar across background types. However, different illuminants arguably influenced the *δ* value of our measures: the D65 illuminant has the lowest *δ* value (mean = 3.83, 1.52 *SD*), followed by the greenish (mean = 4.81, 2.08 *SD*) and yellowish (mean = 5.16, 2.27 *SD*) illuminants.

*SD*) for the reference sessions and 20.7 s (6.2 *SD*) for the regular sessions. Gray took the longest to adjust (mean = 25.1, 7.6 *SD*), followed by brown (mean = 22.1, 5.6 *SD*), which in turn took longer than blue (mean = 18.6, 5.7 *SD*), purple (mean = 18.7, 4.5 *SD*), and pink (mean = 17.9, 4.7 *SD*). Red (mean = 21.1, 7.1 *SD*) and yellow (mean = 20.7, 6.2 *SD*) took longer than pink, which was the fastest to adjust.

*Constancy Index* (CI; Arend, Reeves, Schirillo, & Goldstein, 1991), the *Color Constancy Index* (CCI; Ling & Hurlbert, 2008) and the *Brunswik ratio* (BR; Hansen et al., 2007; Smithson & Zaidi, 2004; Yang & Shevell, 2002), which takes into account the adaptation under the reference illumination. Equation 1 shows an example of how this was implemented for the case of BR.

*c* under illumination *i* (1 corresponds to D65, 2 to greenish, and 3 to yellowish). Also, $b_c^i$ are the chromaticity coordinates of the corresponding $a_c^1$ when the illuminant *i* was applied. The numerator computes the perceptual shift, i.e., the difference between SRs chosen under the D65 illuminant and under the greenish/yellowish illuminants. The denominator computes the physical shift, i.e., the difference between SRs chosen under D65 and their chromatic coordinates when illuminated by the greenish/yellowish illuminants. Following this arrangement, a value of one indicates perfect color constancy and zero no color constancy.
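The ratio described above can be sketched as follows. This is a minimal illustration that assumes Euclidean distances between two-dimensional chromaticity coordinates; the function and argument names are hypothetical, and the coordinates are made up to show the two limiting cases.

```python
import math


def brunswik_ratio(a_ref, a_test, b_test):
    """Perceptual shift divided by physical shift for one color category.

    a_ref  : setting chosen under the reference illuminant (D65)
    a_test : setting chosen under the test illuminant
    b_test : coordinates of a_ref once the test illuminant is applied
    """
    perceptual_shift = math.dist(a_test, a_ref)
    physical_shift = math.dist(b_test, a_ref)
    return perceptual_shift / physical_shift


a_ref = (0.0, 0.0)   # hypothetical reference setting
b_test = (3.0, 4.0)  # hypothetical physical rendering under the test illuminant

print(brunswik_ratio(a_ref, (3.0, 4.0), b_test))  # setting follows the illuminant fully
print(brunswik_ratio(a_ref, (0.0, 0.0), b_test))  # setting does not move at all
print(brunswik_ratio(a_ref, (1.5, 2.0), b_test))  # setting moves halfway
```

Because only the lengths of the two shifts enter the ratio, a setting displaced by the right amount in the wrong direction would also score one, which is the limitation the structural index discussed later is meant to address.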

| Category | CI (Greenish) | CI (Yellowish) | $\overline{BR}$ (Greenish) | $\overline{BR}$ (Yellowish) | CCI (Greenish) | CCI (Yellowish) | Mean |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Red | 0.37 | 0.63 | 0.69 | 0.65 | 0.82 | 0.76 | 0.65 |
| Green | 0.73 | 0.68 | 0.61 | 0.58 | 0.89 | 0.88 | 0.73 |
| Blue | 0.53 | 0.55 | 0.64 | 0.65 | 0.68 | 0.68 | 0.62 |
| Yellow | 0.71 | 0.76 | 0.51 | 0.49 | 0.72 | 0.75 | 0.66 |
| Gray | 0.55 | 0.56 | 0.62 | 0.63 | 0.61 | 0.62 | 0.60 |
| Purple | 0.49 | 0.58 | 0.72 | 0.78 | 0.77 | 0.79 | 0.69 |
| Pink | 0.55 | 0.64 | 0.54 | 0.58 | 0.64 | 0.68 | 0.60 |
| Orange | 0.62 | 0.75 | 0.53 | 0.51 | 0.73 | 0.76 | 0.65 |
| Brown | 0.50 | 0.70 | 0.75 | 0.57 | 0.96 | 0.82 | 0.72 |
| Mean | 0.56 | 0.65 | 0.62 | 0.60 | 0.76 | 0.75 | 0.66 |

**x** and **y** are the LMS cone excitations produced by the light reaching the observer from the CRT monitor: **x** corresponds to the reference illuminant (D65) and **y** corresponds to the test illuminant (greenish or yellowish).

**M**, which can take one of several possible forms according to its nonzero coefficients. These can also be understood in terms of models of visual mechanisms:

The diagonal model ($m_{i,j} = 0$ if $i \neq j$) has only three free parameters. This model only allows for multiplicative gain changes that are specific to each one of the three cone classes. It is often referred to as Von Kries adaptation (Brainard & Wandell, 1992; Von Kries, 1905/1970).

The linear model ($m_{i,j} = 0$ if $j = 4$) has nine free parameters. This model allows signals from each cone type to be modulated independently and can describe multiplicative gain changes both at the receptor level and after an opponent transformation (Brainard & Wandell, 1992).

In the affine model, the first three columns of **M** include the linear model and the fourth column represents an additive process. This model can be thought of as an instance of the two-process model proposed by Jameson and Hurvich (1964) (see also Brainard & Wandell, 1992).

The diagonal plus translation model ($m_{i,j} = 0$ if $i \neq j$ and $j < 4$) has six free parameters and can be seen as a simplification of the affine model. The first three columns allow only for multiplicative gains for each cone class and the last column allows a further additive process.

**X** contains the LMS coordinates of *n* colors $\mathbf{x}_i$ under the reference illuminant and matrix **Y** contains the settings of those same colors, $\mathbf{y}_i$, under the test illuminant.

*N* predictions and the data points. The function to minimize is described by Equation 6, where *φ* is an operator that translates from LMS to CIELab coordinates.

$F_N$ was minimized using the Matlab Optimization Toolbox. Model precision was evaluated by computing the average ΔE* difference between the whole set of nine chromatic settings and their predictions computed from the matrix **M**.

*N* ≥ 1, diagonal plus translation admits *N* ≥ 2 data points, linear admits *N* ≥ 3 data points, and affine *N* ≥ 4 data points. This is also valid for Equation 6.

*y*-axis shows the prediction error (in ΔE* units) associated with each model as a function of the number of chromatic settings used to fit it. Following the approach of Brainard et al. (1997), we used the chromaticity coordinates of the corresponding illuminant as a reference white point in each case. The function specified in Equation 6 was minimized to fit chromatic settings **x** (corresponding to D65) and **y** (corresponding to greenish or yellowish illuminants) keeping the same background type. Take for instance panel A in Figure 8, where each point is the average model prediction error from all possible combinations of elements of ℋ that contain the number of colors specified in the *x*-axis, across backgrounds and subjects. Consider the case when the nine SRs were measured both under D65 and greenish illumination using the same background type. We fitted the diagonal model to only one correspondence pair from the nine chromatic settings available and used the same parameters to predict the positions of all nine corresponding pairs. We repeated this for all the other pairs and calculated the average CIELab ΔE* distance between predicted and measured points for the nine chromatic settings pairs. We extended this to all subjects and backgrounds. The result of these calculations (average from 270 model predictions) is shown in panel A as the leftmost filled circle in the plot. To calculate the second leftmost circle in the plot, we fitted the diagonal model to two correspondence pairs from the nine chromatic settings available and predicted the positions of all nine pairs (36 possible combinations). This point represents the average across subjects and backgrounds (1,080 model predictions). The other circles were calculated similarly by fitting the diagonal model to increasingly more data points. The same reasoning was applied to the other models, shown as triangles and squares in Figure 8. Since the results of the minimization process in Equation 6 depend on the initial seed, we used 100 random seeds (for larger values results tend to stabilize) and the solution to the linear system specified by Equation 5 (Brainard & Wandell, 1992) as a complementary seed. We selected the minimum optimization value of all seeds.

*Akaike Information Criterion* (Burnham & Anderson, 2002). This criterion measures the relative goodness of fit of a model in terms of the information lost when it is used to describe data (see Appendix B). The results show that the best models in Figure 8 are the simplest, Diagonal and Diagonal plus Translation, implying that the Linear and the Affine models are possibly over-fitting the data. In particular, the *Akaike weights*, which indicate the plausibility of each model being the best, are equal to zero for the Linear and Affine models. The results also show a clear tendency for the Diagonal plus Translation model to become the best in terms of number of free parameters and prediction error as we add more data points (*Akaike weights* increase with increasing number of data points for the Diagonal plus Translation model).

*δ*), adjustment time, and constancy index and to summarize the behavior of the whole set of chromatic settings. Interestingly, we have found that subjects' abilities to adjust gray and red are similar, closely followed by many other categories. Also, gray is the color that takes the longest to adjust, maybe because subjects can discriminate more finely near the achromatic locus (Boynton & Olson, 1987). Furthermore, we expected color constancy indices for gray to be near the average, but Table 4 shows that they are generally low and, in the case of the CCI index, the lowest. Previous work found higher color constancy for gray than for chromatic stimuli (Olkkonen et al., 2010; Speigle & Brainard, 1999); the discrepancy is perhaps due to the fact that we used simulated surfaces and illuminants instead of real surfaces. We also found high color constancy values (0.66 on average), which is in accordance with similar studies (Foster, 2011; Hansen et al., 2007; Ling & Hurlbert, 2008; Murray, Daugirdiene, Vaitkevicius, Kulikowski, & Stanikunas, 2006; Olkkonen et al., 2009; Olkkonen et al., 2010), a fact that is supported by visual inspection of the plots in Figure 6, where interdistances among measured colors are largely preserved. This supports the finding that the categorical structure of color space is largely preserved under illuminant changes (Hansen et al., 2007; Olkkonen et al., 2009; Olkkonen et al., 2010).

*magnitude*) among the colors of the test surface, the ideal match and the observer match. Examples of these are the CI (Arend et al., 1991), the BR (Troost & de Weert, 1991), and the BR *ϕ*, which incorporates the direction (*orientation*) between the perceptual and physical color shifts (Foster, 2011). Several improvements have been suggested. For instance, Ling and Hurlbert (2008) proposed a new index, CCI, that incorporates the matching error in the absence of illumination change (*memory* shift), and Brainard (1998) proposed to use the Equivalent Illuminant (EI) instead of the measured adaptation point, which is calculated from different measured points and thus captures the inter-distances among the colors considered under a given adaptation state (*structural*).

*Structural Constancy Index* (SCI), which captures all the features stated in Table 5. The new index is defined in terms of matrix norms, which are extensions of the notion of vector norms applied to matrices. As Equation 7 shows, the norm of a matrix **A** is obtained from the norm of vectors **x** and **Ax** and describes the maximum relative vector magnitude change under the linear transformation **A**.
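The induced norm described here is the largest value of ‖**Ax**‖ / ‖**x**‖ over all nonzero **x**. As an illustration only, the sketch below estimates the 2-norm version of this quantity by power iteration on $A^T A$; the function name and the example matrices are hypothetical, and a numerical library would normally be used instead.

```python
import math


def spectral_norm(a, iterations=200):
    """Estimate max ||Ax|| / ||x|| (the induced 2-norm) for a matrix given as nested lists.

    Power iteration on A^T A; it assumes the all-ones starting vector is not
    orthogonal to the dominant right singular vector.
    """
    rows, cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(a[i][j] * av[i] for i in range(rows)) for j in range(cols)]  # A^T (A v)
        length = math.sqrt(sum(c * c for c in w))
        if length == 0.0:
            return 0.0
        v = [c / length for c in w]
    av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(c * c for c in av))  # ||Av|| with ||v|| = 1


expansion = [[2.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]
rotation = [[0.0, -1.0], [1.0, 0.0]]
print(spectral_norm(expansion))  # largest stretch factor of the diagonal matrix
print(spectral_norm(rotation))   # a pure rotation leaves vector lengths unchanged
```

A norm above one indicates that some direction is stretched, a norm below one that every direction is shrunk, and a pure rotation has norm one.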

| Property/Index | CI, BR, $\overline{BR}$ | EI | CCI | SCI |
| --- | --- | --- | --- | --- |
| Magnitude | Yes | Yes | Yes | Yes |
| Orientation | No | No | Yes | Yes |
| Memory | No | Yes | Yes | Yes |
| Structure | No | Yes | No | Yes |

$\mathbf{A}_{percep}$ and $\mathbf{A}_{phys}$ to be affine matrices, with the last column of $\mathbf{A}_{percep}$ specifying the translation vector **r** and the last column of $\mathbf{A}_{phys}$ specifying the translation vector **s**. Notice that here we are not using an affine matrix to model the data as in previous sections, but to quantify two different aspects of color constancy: magnitude and orientation.

$\mathbf{A}_{percep}$ are determined from pairs of corresponding chromatic settings under reference and test illuminants and can be obtained following the approach described in the modeling subsection above (Equation 3). Likewise, the coefficients of $\mathbf{A}_{phys}$ are determined from correspondences between the chromatic settings made under the reference illuminant and physical simulations of the same colors under a test illuminant. In this formulation, if matrices $\mathbf{A}_{percep}$ and $\mathbf{A}_{phys}$ are equal, then color constancy is perfect. Finally, memory effects like those discussed by Ling and Hurlbert (2008) are neutralized since our measurements were obtained from direct comparisons under reference and test illuminants.

$\mathbf{A}_{percep}$ and $\mathbf{A}_{phys}$. The latter case happens when observers correct for the illuminant more than they should. Panel B describes the contribution of the second term of Equation 8, i.e., a weighting factor to penalize for angular deviations from the direction of the simulated illuminant shift. As **r** and **s** become more perpendicular, their product **rs** becomes closer to zero. Although negative values are possible in theory, in practice this weighting factor should be positive assuming that **r** and **s** are far from perpendicular. Structural information of the color constancy phenomenon is implicitly embedded in the affine matrix. Other indices such as BR would produce the same value for all hypothetical settings located around the half circumference defined by the broken line, since it only compares the magnitudes of both shifts. For the hypothetical cases described by $\mathbf{s}_1$ and $\mathbf{s}_3$, CI would have a value of one, since it compares the magnitude of the vector defined by the settings and their expected location to the magnitude of the illuminant shift. Panels C and D illustrate how structural information is summarized into a single positive number. Popular indices such as CI, BR, and CCI do not convey this information since they are usually computed over the achromatic setting. Panel C illustrates the case when there is no translation (i.e., the last column of the affine matrix is null) and the matrix can be interpreted in terms of expansion ($\|A\|_2 > 1$), retraction ($\|A\|_2 < 1$), or rotation ($\|A\|_2 = 1$). Panel D illustrates the case when only the translation part is operative and the value of the norm reflects this translation. Panel E shows an exemplary case when the spatial relationships among measurements are disrupted by just one chromatic setting *outlier*. This structural disruption would be embedded in the affine matrix, which represents a compromise solution between the prediction error of the outlier and that of the rest of the measurements. Notice that in this particular example there is only one outlier but there could be more, with effects such as those described in Panel C (contraction, expansion, and rotation), leading to more complex outcomes. As Table 4 shows, each index produces a different value for a different color category; SCI deals with this variability by summarizing the measurements for all categories into an affine matrix. This situation is illustrated by panel E, whose chromatic setting outlier would have produced values of CI, BR, or SCI notably different from the rest, thus making the quantification of color constancy dependent on the selected color category.

$\mathbf{A}_{percep}$ is slightly larger than the norm of $\mathbf{A}_{phys}$, making the first term of Equation 8 slightly larger than one. The previous analysis implies that perfect color constancy is achieved when SCI is equal to one and different values indicate either lack of constancy (SCI < 1) or overcompensation (SCI > 1). In our case, we expected values close to one due to the long adaptation period of immersive illumination.

| Index/Illuminant | Greenish | Yellowish |
| --- | --- | --- |
| $\overline{BR}$ | 0.62 | 0.61 |
| EI | 0.58 | 0.59 |
| CCI | 0.76 | 0.75 |
| SCI | 1.03 | 0.85 |

*magnitude* and *orientation* contributions revealed that these differences originated in the norm of the perceptual matrix as explained in Panels C, D, and E of Figure 10. In the previous modeling subsection, we found lower prediction errors for the greenish illuminant (see Figure 8), indicating that such data are better captured by the fitting of linear models, a process similar to the computation of SCI values. This explains why chromatic settings under the yellowish illuminant have a higher degree of dispersion when compared to chromatic settings under D65 than in the greenish case. These differences manifest in Figure 6 as subtle variations in the location of the yellow, orange, brown, red, and pink data points, which may account for the 18% difference between both illuminants in Table 6. We could hypothesize that this dispersion arises because greenish-illuminated colors fall inside the broad green category, whereas yellowish-illuminated colors fall into several categories, and that this initial (first milliseconds) categorical perception may influence the subject's adaptation and subsequent chromatic settings. However, settling this question requires further experiments.

*Chromatic Setting*) to study color constancy, which measures several points in color space under extended periods of adaptation to the illumination. We have shown the paradigm to be feasible in terms of memory and consistency of subjects' responses over time. No remarkable differences were found between the role played by gray and the rest of the chromatic categories tested for this task. Our results show that linear models, in particular the Diagonal plus Translation, succeed in capturing the color constancy phenomenon. They also show that including more colors does improve model precision. A quantification of the phenomenon in terms of commonly used color constancy indices reveals substantial differences when applied to individual colors. In addition to our paradigm, we developed a more comprehensive color constancy index (the *Structural Constancy Index*), which accounts for changes in magnitude, orientation, and structure, as well as memory effects. When applied to our measures, our index indicates nearly full constancy for the greenish illuminant and slightly less constancy for the yellowish illuminant tested. Our results do not show any quantitative difference regarding the types of colored background tested.

*Generalitat de Catalunya*.

*, 3(10), 1743–1751. [CrossRef]*

*Journal of the Optical Society of America A: Optics, Image Science, & Vision**, 8(4), 661–672. [CrossRef]*

*Journal of the Optical Society of America A: Optics, Image Science, & Vision**, 38(Suppl.), 36.*

*Perception**. Berkeley, CA: University of California Press. (Original work published 1969.)*

*Basic color terms: Their universality and evolution**, 12(2), 94–105. [CrossRef]*

*Color Research & Application**, 15(2), 307–325. [CrossRef]*

*Journal of the Optical Society of America A: Optics, Image Science, & Vision**, 14(9), 2091–2110. [CrossRef]*

*Journal of the Optical Society of America A: Optics, Image Science, & Vision**, 3(10), 1651–1661. [CrossRef]*

*Journal of the Optical Society of America A: Optics, Image Science, & Vision**, 9(9), 1433–1448. [CrossRef]*

*Journal of the Optical Society of America A: -Optics, Image Science, & Vision**, 7(11), 844–849. [CrossRef] [PubMed]*

*Current Biology**, 310(1), 1–26. [CrossRef]*

*Journal of the Franklin Institute: Engineering & Applied Mathematics**. New York: Springer.*

*Model selection and multimodel inference: A practical information-theoretic approach*(2nd ed.)*, 47(1), 35–42. [CrossRef]*

*Journal of the Optical Society of America**, 39(20), 3444–3458. [CrossRef] [PubMed]*

*Vision Research**, 4(9):8, 764–778, http://www.journalofvision.org/content/4/9/8, doi:10.1167/4.9.8. [PubMed] [Article] [CrossRef]*

*Journal of Vision**The Farnsworth-Munsell 100 hue test for the examination of colour discrimination*. Baltimore, MD: Munsell Color Co. Inc.

*, 225(1), 155–170. [CrossRef]*

*Monthly Notices of the Royal Astronomical Society**, 7(10), 439–443. [CrossRef] [PubMed]*

*Trends in Cognitive Sciences**, 257(1349), 115–121. [CrossRef]*

*Proceedings of the Royal Society of London B**, 415(6872), 637–640. [CrossRef] [PubMed]*

*Nature**, 9(11), 1367–1368. [CrossRef] [PubMed]*

*Nature Neuroscience**, 7(4):2, 1–15, http://www.journalofvision.org/content/7/4/2, doi:10.1167/7.4.2. [PubMed] [Article] [CrossRef] [PubMed]*

*Journal of Vision**, 93(1), 10–20. [CrossRef] [PubMed]*

*Journal of Experimental Psychology**. New York: Wiley.*

*Multiple comparison procedures**, 31(4), 303–314. [CrossRef]*

*Color Research & Application**. Tokyo, Kyoto: Kannehara Shuppan Co., Ltd.*

*Tests for colour-blindness**, 45(7), 546–552. [CrossRef]*

*Journal of the Optical Society of America**, 4(1–2), 135–154. [CrossRef] [PubMed]*

*Vision Research**, 13(10), 1981–1991. [CrossRef]*

*Journal of the Optical Society of America A**, 106(20), 8140–8145. [PubMed]*

*Proceedings of the National Academy of Sciences of the USA**, 97(1), 25–35. [CrossRef]*

*Acta Psychologica*(*Amst*)*, 52, 247–264.*

*American Scientist**, 61(1), 1–11. [CrossRef] [PubMed]*

*Journal of the Optical Society of America**, 25(6), 1215–1226. [CrossRef]*

*Journal of the Optical Society of America A: Optics, Image Science, & Vision**, 25, 2918–2924. [CrossRef]*

*Journal of the Optical Society of America A: Optics, Image Science, & Vision**, 3(1), 29–33. [CrossRef]*

*Journal of the Optical Society of America A: Optics, Image Science, & Vision**, 46(19), 3067–3078. [CrossRef] [PubMed]*

*Vision Research**, 30(5), 594–601. [CrossRef]*

*Ophthalmic & Physiological Optics**, 9(12):6, 1–18, http://www.journalofvision.org/content/9/12/6, doi:10.1167/9.12.6. [PubMed] [Article] [CrossRef] [PubMed]*

*Journal of Vision**, 10(9):16, 1–22, http://www.journalofvision.org/content/10/9/16, doi:10.1167/10.9.16. [PubMed] [Article] [CrossRef] [PubMed]*

*Journal of Vision**, 10(12):5, 1–24, http://www.journalofvision.org/content/10/12/5, doi:10.1167/10.12.5. [PubMed] [Article] [CrossRef] [PubMed]*

*Journal of Vision**, 202(2), 615–627. [CrossRef]*

*Monthly Notices of the Royal Astronomical Society**, 23, 52–54. [CrossRef]*

*Color Research & Application**, 6(10):10, 1102–1116, http://www.journalofvision.org/content/6/10/10, doi:10.1167/6.10.10. [PubMed] [Article] [CrossRef]*

*Journal of Vision**, 40(23), 3173–3180. [CrossRef] [PubMed]*

*Vision Research**, 15, 161–171. [CrossRef] [PubMed]*

*Vision Research**, 4(9):3, 693–710, http://www.journalofvision.org/content/4/9/3, doi:10.1167/4.9.3. [PubMed] [Article] [CrossRef]*

*Journal of Vision**, 360(1458), 1329–1346. [CrossRef]*

*Philosophical Transactions of the Royal Society B: Biological Sciences**, pp. 167–172. Springfield, VA: Society for Imaging Science and Technology.*

*Proceedings of the 4th Information Science and Technology/Society for Information Display Color Imaging Conference: Color Science, Systems, and Applications*

*Journal of the Optical Society of America A: -Optics, Image Science, & Vision**,*16(10), 2370–2376. [CrossRef]

*, 50(6), 591–602. [CrossRef] [PubMed]*

*Perception & Psychophysics**, pp. 145–148. Cambridge, MA: MIT Press. (Originally published in Festschrift der Albrecht-Ludwigs-Universitat [1902]).*

*Sources of Color Vision**(pp. 29–54). Philadelphia: J. Benjamins Publishing Co.*

*Anthropology of color: Interdisciplinary multilevel modeling**. Chichester, New York: Wiley.*

*Color science: Concepts and methods, quantitative data and formulae*(2nd ed.)*, 42(16), 1979–1989. [CrossRef] [PubMed]*

The *Akaike Information Criterion* (AIC) is based on information theory and is widely used for model selection: given several candidate models, it selects the one that minimizes the loss of information when approximating reality. To test our four models we used the AIC version adapted to small samples ($AIC_c$), computed from the residual sum of squares ($RSS$) as detailed in Equation 9, where $n$ is the number of data points and $k$ is the number of variables plus the error term (Burnham & Anderson, 2002).

$$AIC_c = n \ln\left(\frac{RSS}{n}\right) + 2k + \frac{2k(k+1)}{n-k-1} \qquad (9)$$
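The computation can be sketched in a few lines of Python. This is a minimal illustration, assuming that Equation 9 takes the usual least-squares small-sample form $n\ln(RSS/n) + 2k + 2k(k+1)/(n-k-1)$; the function name `aicc_least_squares` is hypothetical and not part of the original work. Under that assumption, the RSS reported for the Diagonal model under the greenish illuminant with 15 data values gives a score that agrees with the tabulated $AIC_c$ to within rounding.

```python
import math


def aicc_least_squares(rss: float, n: int, k: int) -> float:
    """Small-sample AIC for a least-squares fit (assumed form of Equation 9).

    rss: residual sum of squares of the fit
    n:   number of data values
    k:   number of fitted variables plus one for the error term
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = n * math.log(rss / n) + 2 * k          # plain AIC for least squares
    correction = 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
    return aic + correction


# Diagonal model, greenish illuminant, 15 data values (RSS taken from the table).
aicc_d = aicc_least_squares(rss=293.18, n=15, k=4)
print(round(aicc_d, 2))
```

The guard on `n - k - 1` reflects that the correction term diverges as the number of parameters approaches the number of data values.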

Applied to the multivariate system resulting from Equation 5, the $AIC_c$ formula depends exclusively on the dimensions of that system. Because this approach does not reflect the number of free parameters in our tested models, we rearranged the system into an equivalent univariate system. In order to apply the $AIC_c$, we assumed that our prediction errors followed a normal distribution.

We also computed (a) the AIC differences ($\Delta_i = AIC_i - \min_r(AIC_r)$) and (b) the *Akaike weights*, which quantify the plausibility of each model being the best ($w_i = \exp(-0.5\Delta_i) / \sum_{r=1}^{R} \exp(-0.5\Delta_r)$). As a rule of thumb, $\Delta_i < 2$ suggests substantial evidence for the model, values between 3 and 7 indicate that the model has considerably less support, and $\Delta_i > 10$ indicates that the model is very unlikely (Burnham & Anderson, 2002).
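The two quantities just defined can be computed directly from a list of AIC scores. The sketch below is illustrative only: the helper name `akaike_weights` is hypothetical, and the input values are the $AIC_c$ scores listed in the table for the greenish illuminant with 21 data values (models D, DT, L, and A).

```python
import math


def akaike_weights(aic_values):
    """Return (deltas, weights) for a list of AIC or AICc scores."""
    best = min(aic_values)
    deltas = [a - best for a in aic_values]               # AIC differences
    rel_likelihood = [math.exp(-0.5 * d) for d in deltas]
    total = sum(rel_likelihood)                           # sum over all R models
    weights = [r / total for r in rel_likelihood]         # weights sum to 1
    return deltas, weights


# AICc scores for D, DT, L, A (greenish illuminant, 21 data values).
deltas, weights = akaike_weights([74.43, 74.00, 91.20, 115.26])
for name, d, w in zip(["D", "DT", "L", "A"], deltas, weights):
    print(f"{name}: delta={d:.2f} weight={w:.2f}")
```

Because the weights are normalized over the candidate set, they are only meaningful relative to the models actually compared.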

Table B.1 reports the $RSS$, $AIC_c$, $\Delta_i$, and $w_i$ values obtained for our data, by model, number of fitting points, and illumination. Note that the reported RSS values do not correspond to the minimization values in Figure 6, because we took as RSS the accumulated error of only those fitting points that participated in the minimization process. In practice, the RSS values were obtained not by linear regression but from the minimization process described in Equation 6; the target value of the minimization is nevertheless equivalent. In addition, the RSS values used in Table B.1 are averages over all subjects and backgrounds.

The $\Delta_i$ and $w_i$ values in Table 6 indicate that the Diagonal (D) and Diagonal plus Translation (DT) models best describe the data, and that the Linear (L) and Affine (A) models significantly over-fit them. The small differences in $\Delta_i$ between D and DT are not conclusive about which model is best; however, there is a clear tendency for DT to gain support relative to D as more fitting points are added. Under the greenish illuminant DT overtakes D from 21 data values onward, and under the yellowish illuminant the gap between them narrows steadily. With one to four fitting points, the AIC selects D, DT, L, and A, respectively, as expected given that the number of fitting points then matches the number of free parameters.
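One way to see why the larger models are penalized so heavily with few data values is to isolate the complexity part of the score. The sketch below assumes the standard small-sample form, in which the penalty is $2k + 2k(k+1)/(n-k-1)$; the function name `aicc_penalty` is hypothetical, and the $k$ values are those listed in the table.

```python
def aicc_penalty(n: int, k: int) -> float:
    """Complexity part of the assumed AICc: 2k + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("penalty is undefined unless n > k + 1")
    return 2 * k + 2 * k * (k + 1) / (n - k - 1)


# k per model as listed in the table (free parameters plus the error term).
K_BY_MODEL = {"D": 4, "DT": 7, "L": 10, "A": 13}

for n in (15, 27):
    row = {m: round(aicc_penalty(n, k), 2) for m, k in K_BY_MODEL.items()}
    print(n, row)
```

With 15 data values the Affine model leaves only one residual degree of freedom in the denominator, so its penalty dominates any reduction in RSS; the penalty shrinks quickly as data values are added.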

| Model | 3n | k | RSS (Greenish) | $AIC_c$ (Greenish) | $\Delta_i$ (Greenish) | $w_i$ (Greenish) | RSS (Yellowish) | $AIC_c$ (Yellowish) | $\Delta_i$ (Yellowish) | $w_i$ (Yellowish) |
|---|---|---|---|---|---|---|---|---|---|---|
| D | 15 | 4 | 293.18 | 56.59 | 0 | 0.96 | 535.1 | 65.62 | 0 | 0.99 |
| DT | 15 | 7 | 134.79 | 62.93 | 6.34 | 0.04 | 298.2 | 74.84 | 9.23 | 0.01 |
| L | 15 | 10 | 109.74 | 104.85 | 48.26 | 0 | 218.6 | 115.18 | 49.57 | 0 |
| A | 15 | 13 | 40.96 | 405.01 | 348.48 | 0 | 81.1 | 415.31 | 349.69 | 0 |
| D | 18 | 4 | 367.05 | 65.35 | 0 | 0.77 | 671.6 | 76.22 | 0 | 0.95 |
| DT | 18 | 7 | 187.90 | 67.42 | 2.07 | 0.26 | 425.9 | 82.15 | 5.92 | 0.05 |
| L | 18 | 10 | 164.36 | 91.24 | 25.89 | 0 | 328.4 | 103.70 | 27.47 | 0 |
| A | 18 | 13 | 82.27 | 144.36 | 79.00 | 0 | 170.5 | 157.47 | 81.25 | 0 |
| D | 21 | 4 | 440.95 | 74.43 | 0.44 | 0.45 | 808.3 | 87.16 | 0 | 0.88 |
| DT | 21 | 7 | 242.55 | 74.00 | 0 | 0.55 | 547.2 | 91.08 | 3.92 | 0.12 |
| L | 21 | 10 | 218.61 | 91.20 | 17.20 | 0 | 434.7 | 105.63 | 18.47 | 0 |
| A | 21 | 13 | 123.84 | 115.26 | 41.27 | 0 | 270.1 | 131.63 | 44.48 | 0 |
| D | 24 | 4 | 514.87 | 83.69 | 2.27 | 0.24 | 945 | 98.26 | 0 | 0.83 |
| DT | 24 | 7 | 297.47 | 81.41 | 0 | 0.76 | 685 | 101.45 | 3.19 | 0.17 |
| L | 24 | 10 | 273.25 | 95.30 | 13.88 | 0 | 550 | 112.07 | 13.81 | 0 |
| A | 24 | 13 | 169.47 | 109.31 | 27.90 | 0 | 381.9 | 128.81 | 30.55 | 0 |
| D | 27 | 4 | 588.79 | 93.04 | 3.72 | 0.13 | 1081.8 | 109.46 | 0 | 0.70 |
| DT | 27 | 7 | 353.17 | 89.31 | 0 | 0.86 | 794 | 111.19 | 1.72 | 0.29 |
| L | 27 | 10 | 327.65 | 101.14 | 11.83 | 0 | 617 | 118.23 | 8.77 | 0.01 |
| A | 27 | 13 | 226.17 | 111.39 | 22.07 | 0 | 495.3 | 132.55 | 23.10 | 0 |

| Index | New observers (2), Greenish | New observers (2), Yellowish | Original observers (10), Greenish | Original observers (10), Yellowish |
|---|---|---|---|---|
| $\overline{BR}$ | 0.70 | 0.80 | 0.62 | 0.61 |
| CCI | 0.74 | 0.87 | 0.76 | 0.75 |
| SCI | 1.15 | 0.84 | 1.03 | 0.85 |