The ability to perceptually identify distinct surfaces in natural scenes by virtue of their color depends not only on the relative frequency of surface colors but also on the probabilistic nature of observer judgments. Previous methods of estimating the number of discriminable surface colors, whether based on theoretical color gamuts or recorded from real scenes, have taken a deterministic approach. Thus, a three-dimensional representation of the gamut of colors is divided into elementary cells or points which are spaced at one discrimination-threshold unit intervals and which are then counted. In this study, information-theoretic methods were used to take into account both differing surface-color frequencies and observer response uncertainty. Spectral radiances were calculated from 50 hyperspectral images of natural scenes and were represented in a perceptually almost uniform color space. The average number of perceptually distinct surface colors was estimated as 7.3 × 10^{3}, much smaller than that based on counting methods. This number is also much smaller than the number of distinct points in a scene that are, in principle, available for reliable identification under illuminant changes, suggesting that color constancy, or the lack of it, does not generally determine the limit on the use of color for surface identification.

($L^{\ast}$, $a^{\ast}$, $b^{\ast}$) of CIELAB color space do not define a strictly uniform color space, that is, the Euclidean distance $\Delta E^{\ast} = [(\Delta L^{\ast})^{2} + (\Delta a^{\ast})^{2} + (\Delta b^{\ast})^{2}]^{1/2}$ between points does not correspond to a constant perceptual color difference, Linhares et al. calculated the separations between points using the CIEDE2000 color-difference formula (CIE, 2004a), which largely compensates for the non-uniformity of CIELAB. The number of discriminable colors was estimated by iteratively discarding points that differed from each other by less than one half the nominal CIEDE2000 threshold color difference $\Delta E_{00} = 0.6$. The procedure was repeated until there were no points left. Although based on a smaller population than the theoretical object-color solid used by Martínez-Verdú et al. (2007) and Pointer and Attridge (1998), the resulting estimate was almost the same, a total of about 2.3 million colors (Linhares et al., 2008, Table 1), accumulated over the 50 scenes. The estimated number of discriminable colors averaged over individual scenes was inevitably smaller, about $2.7 \times 10^{5}$ (Linhares et al., 2008, Table 1).
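A counting procedure of this kind can be sketched as a greedy thinning of the point set. The snippet below is only an illustration under stated assumptions: plain Euclidean distance stands in for the CIEDE2000 formula, and the function name `count_discriminable` and its `threshold` argument are hypothetical, not taken from Linhares et al. (2008).

```python
import math

def count_discriminable(points, threshold):
    """Greedy count: keep one point, discard every point closer than
    `threshold` to it (including itself), and repeat until none are left.

    Assumption: Euclidean distance is used here as a stand-in for a
    color-difference formula such as CIEDE2000.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    remaining = list(points)
    count = 0
    while remaining:
        seed = remaining[0]
        count += 1
        # Keep only points at least `threshold` away from the counted point.
        remaining = [p for p in remaining if math.dist(seed, p) >= threshold]
    return count

# Toy example: 11 collinear points with unit spacing.
toy_points = [(float(i), 0.0, 0.0) for i in range(11)]
toy_count = count_discriminable(toy_points, threshold=3.0)
```

The result of a greedy pass depends on the order in which points are visited, so a sketch like this gives an approximate count rather than a unique one.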

^{3}, markedly smaller than that based on counting methods. An implication of this result for estimates of color constancy in natural scenes is briefly considered in the Discussion section.

($L^{\ast}$, $a^{\ast}$, $b^{\ast}$) coordinates, and a color-difference formula such as CIEDE2000 (Luo, Cui, & Rigg, 2001) or DIN99 (Cui, Luo, Rigg, Roesler, & Witt, 2002) can then be used to correct for non-uniformities (Linhares et al., 2008). Instead, to simplify the analysis, the color of the reflected light was expressed in CIECAM02 space (CIE, 2004b), which has the advantage that perceived color differences represented as Euclidean differences in CIECAM02 coordinates ($J$, $a_{\mathrm{C}}$, $b_{\mathrm{C}}$) correspond to almost constant perceptual color differences (Luo, Cui, & Li, 2006; Melgosa, Huertas, & Berns, 2008). Since CIECAM02 is determined empirically and has a built-in chromatic-adaptation transform (CIE, 2004b), it automatically incorporates any improvements in discrimination performance found in the region of the reference white. The variable $J$ represents lightness and $a_{\mathrm{C}}$ and $b_{\mathrm{C}}$ represent the projections of chroma onto the red–green and blue–yellow hue axes, giving a hue angle $h = \tan^{-1}(b_{\mathrm{C}}/a_{\mathrm{C}})$. Along with several other color spaces, CIECAM02 may depart from uniformity with very small color differences, where CIELAB $\Delta E^{\ast} \leq 1$ (Melgosa, Huertas, & García, 2008).
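As a small illustration of the hue-angle relation, the sketch below computes the angle from a pair of chroma projections. It assumes a quadrant-aware arctangent and a result expressed in degrees in the range 0 to 360; the function name `hue_angle_degrees` is introduced here for illustration only.

```python
import math

def hue_angle_degrees(a_c, b_c):
    """Hue angle in degrees, in [0, 360), from the chroma projections.

    atan2 is used instead of a plain arctangent of b_c / a_c so that the
    quadrant is resolved and a_c == 0 does not cause a division by zero.
    """
    return math.degrees(math.atan2(b_c, a_c)) % 360.0
```

For example, `hue_angle_degrees(0.0, -1.0)` falls at 270 degrees, whereas a plain arctangent of the ratio would be undefined there.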

($J$, $a_{\mathrm{C}}$, $b_{\mathrm{C}}$) may be treated as an instance $u$ of a trivariate continuous random variable $U$ with probability density function (pdf) $f_U$, say. Likewise, the observer's perceptual response may be treated as an instance $v$ of a second trivariate continuous random variable $V$ with pdf $f_V$, say. Given two particular triplets $u$ and $u'$, the probability that an observer reports them as being different depends on the corresponding difference $v - v'$. This dependence is described by the psychometric function $\Psi$, say. The differences $w = v - u$ and $w' = v' - u'$ are, in turn, instances of a trivariate continuous random variable $W$ with pdf $f_W$, where $V = U + W$. The pdf $f_W$ is essentially the derivative of the psychometric function $\Psi$ (DeCarlo, 1998).

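$U$, $V$, and $W$ may be quantified by the differential entropy (Cover & Thomas, 1991). For example, the differential entropy $h(U)$ of $U$ is given by

$$h(U) = -\int f_U(u) \log f_U(u)\, du. \tag{1}$$

The symbol $h$ should not be confused with that for hue in CIECAM02. When the logarithm is taken to the base 2, the differential entropy is given in bits. Shannon's mutual information $I(U; V)$ between $U$ and $V$ is given (Cover & Thomas, 1991) by

$$I(U; V) = h(V) - h(V \mid U), \tag{2}$$

where $h(V \mid U)$ is the conditional differential entropy of $V$ given $U$. If $W$ is independent of $U$, then $h(V \mid U) = h(W)$, so that

$$I(U; V) = h(V) - h(W). \tag{3}$$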

$I(U; V)$ measures the amount of information that the perceived color provides about the sampled surface colors. It has a useful interpretation as the mean number $N$ of perceived surface colors that can each be identified with a distinct surface color in the scene, by virtue of the following relationship (Cover & Thomas, 1991; Shannon, 1948a, 1948b):

$$N = 2^{I(U; V)}. \tag{4}$$

The differential entropies $h(V)$ and $h(W)$ may be positive, negative, or zero, depending, not least, on the scale of measurement defined by CIECAM02, but the mutual information $I(U; V)$ is always non-negative and independent of the scale.
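The scale independence can be made explicit with a short derivation. It is a sketch that assumes the mutual information is written as the difference $h(V) - h(W)$, with logarithms to the base 2, and that all three coordinates are rescaled by a common factor $c \neq 0$, a symbol introduced here purely for illustration. Each trivariate differential entropy then shifts by the same amount, which cancels in the difference:

```latex
\begin{aligned}
h(cV) &= h(V) + 3\log_2 |c|, \\
h(cW) &= h(W) + 3\log_2 |c|, \\
I(cU; cV) &= h(cV) - h(cW) = h(V) - h(W) = I(U; V).
\end{aligned}
```

The factor 3 arises because the variables are trivariate: scaling every coordinate by $c$ multiplies volumes by $|c|^{3}$.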

$r(\lambda; x, y)$ at each wavelength $\lambda$ and position $(x, y)$. The reflectances $r(\lambda; x, y)$ were obtained by dividing the spectral radiance of the image by the spectral radiance of a small neutral (Munsell N5 or N7) reference surface embedded in the scene and then multiplying by the known spectral reflectance of the neutral surface. The effect of a particular daylight was simulated by multiplying $r(\lambda; x, y)$ at each point $(x, y)$ by a standard illuminant spectrum, here a daylight with correlated color temperature of 6500 K, fixed over all scenes so that the color gamut of each scene was not confounded by differences in illuminant. The raw spectral radiance images were actually acquired under daylights with correlated color temperatures ranging from about 4400 K to 8200 K.
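The two steps described here, estimating reflectance from a neutral reference and then relighting with a fixed illuminant, can be sketched for a single pixel as follows. All spectra below are made-up placeholder values on three wavelength samples, and the function names `reflectance` and `relight` are hypothetical; real data would use the full sampled spectrum at every pixel.

```python
def reflectance(scene_radiance, reference_radiance, reference_reflectance):
    """Per-wavelength reflectance at one pixel: scene radiance divided by the
    radiance of the neutral reference, times the reference's known reflectance."""
    return [
        s / ref * rho
        for s, ref, rho in zip(scene_radiance, reference_radiance, reference_reflectance)
    ]

def relight(pixel_reflectance, illuminant):
    """Reflected spectrum under a chosen illuminant spectrum."""
    return [r * e for r, e in zip(pixel_reflectance, illuminant)]

# Placeholder spectra (three wavelength samples), not measured values.
scene = [2.0, 3.0, 4.0]          # radiance recorded at the pixel
reference = [4.0, 6.0, 8.0]      # radiance recorded from the neutral surface
reference_rho = [0.2, 0.2, 0.2]  # known reflectance of the neutral surface
illuminant = [1.0, 0.9, 0.8]     # stand-in for a 6500 K daylight spectrum

pixel_rho = reflectance(scene, reference, reference_rho)
pixel_radiance = relight(pixel_rho, illuminant)
```

Because the scene illuminant cancels in the ratio, the same fixed illuminant can be applied to every scene regardless of the daylight under which it was recorded.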

($J$, $a_{\mathrm{C}}$, $b_{\mathrm{C}}$) were calculated at each pixel in each image of each scene according to the CIECAM02 specification with default values, including those for chromatic adaptation (CIE, 2004b). Integrations were performed numerically over 400–720 nm with the given 10-nm sampling interval.

$h(V)$ were not feasible as the values of $V$ are not directly available, but from numerical simulations, it was found that $h(U)$ and $h(V)$ were almost equal, with $h(V)$ on average about 0.1 bits larger than $h(U)$. Hence, in Equation 3, if $h(V)$ is replaced by $h(U)$, and if $\hat{h}(U)$ and $\hat{h}(W)$ are the corresponding estimates of $h(U)$ and $h(W)$, then $I(U; V)$ may be estimated by

$$\hat{I}(U; V) = \hat{h}(U) - \hat{h}(W). \tag{5}$$

$\hat{h}(U)$ is to find an estimate of the pdf $f_U$ and plug the result into Equation 1. Naïve estimates of $f_U$ may be obtained by binning, i.e., partitioning the space of triplets ($J$, $a_{\mathrm{C}}$, $b_{\mathrm{C}}$) from each scene into a finite number of cells (different from the cells of the counting methods described earlier) and counting the frequency of occurrences in each cell (Silverman, 1986). In practice, the number of cells needed for an accurate estimate of the pdf is generally large and the estimate of the frequency of responses will be biased unless the sample size is much larger still than the number of cells. If the sample is not large enough, systematic errors in estimating the entropy can occur (Kraskov, Stögbauer, & Grassberger, 2004). These errors may be minimized by introducing bias-correction terms, but the accuracy of the estimates still depends on binning.
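The binning approach can be illustrated with a minimal plug-in estimator. This is a sketch under stated assumptions, not the procedure used in the study: cells are equal-width cubes, no bias correction is applied, and the function name `binned_entropy_bits` and the `cell_width` parameter are introduced here for illustration. The test data are synthetic points drawn uniformly from the unit cube, whose differential entropy is 0 bits.

```python
import math
import random
from collections import Counter

def binned_entropy_bits(samples, cell_width):
    """Plug-in estimate of differential entropy (bits) from equal-width cells."""
    n = len(samples)
    dim = len(samples[0])
    # Count how many samples fall in each cubic cell.
    counts = Counter(
        tuple(math.floor(x / cell_width) for x in point) for point in samples
    )
    # Discrete entropy of the cell occupancies.
    discrete = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Adding the log of the cell volume converts it to a differential entropy.
    return discrete + dim * math.log2(cell_width)

random.seed(0)
cube_samples = [
    (random.random(), random.random(), random.random()) for _ in range(20000)
]
cube_estimate = binned_entropy_bits(cube_samples, cell_width=0.25)
```

With 64 cells and 20,000 samples the estimate lies close to the true value, but shrinking `cell_width` without enlarging the sample leaves many cells sparsely populated, which is the source of the bias described above.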

$k$-nearest-neighbor statistics to obtain an estimate $\hat{h}(U)$ of the differential entropy $h(U)$. The $k$-nearest-neighbor estimator due to Kozachenko and Leonenko (Goria, Leonenko, Mergel, & Novi Inverardi, 2005; Kozachenko & Leonenko, 1987) was chosen since it provides results that are efficient, adaptive, and have minimal bias (Kraskov et al., 2004). It was applied to the triplets ($J$, $a_{\mathrm{C}}$, $b_{\mathrm{C}}$) from each scene in an offset form that converges more rapidly and accurately than the original estimator (Foster, Marín-Franch, Amano, & Nascimento, 2009; Marín-Franch, 2009).
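For orientation, the standard form of the Kozachenko–Leonenko estimator can be sketched in a few lines. This is not the offset form used in the study, which is not reproduced here; it is the basic estimator with a brute-force neighbor search suitable only for small samples, and the names `kl_entropy_bits` and `digamma_int` are hypothetical. It assumes Euclidean distances and no duplicate points.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """Digamma function at a positive integer n: -gamma + sum_{j<n} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))

def kl_entropy_bits(samples, k=4):
    """Basic Kozachenko-Leonenko k-nearest-neighbor entropy estimate, in bits."""
    n = len(samples)
    d = len(samples[0])
    # Log volume of the unit ball in d dimensions.
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    sum_log_r = 0.0
    for i, p in enumerate(samples):
        # Brute-force distances from point i to all other points.
        dists = sorted(math.dist(p, q) for j, q in enumerate(samples) if j != i)
        sum_log_r += math.log(dists[k - 1])  # distance to the k-th neighbor
    nats = digamma_int(n) - digamma_int(k) + log_unit_ball + d * sum_log_r / n
    return nats / math.log(2)

# Synthetic check: trivariate standard Gaussian sample.
random.seed(1)
gauss_samples = [
    (random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)) for _ in range(400)
]
gauss_estimate = kl_entropy_bits(gauss_samples, k=4)
gauss_true = 1.5 * math.log2(2 * math.pi * math.e)
```

Unlike binning, the estimator adapts its resolution to the local density of points, since the neighbor distances shrink where samples are dense. For image-sized samples a spatial index would be needed in place of the quadratic search.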

$\hat{h}(W)$ of the differential entropy $h(W)$ associated with observer response uncertainty was obtained from the reported perceptibility of small color differences between matt paint samples (Wang, Luo, Cui, & Xu, 2009). The samples consisted of 10 reference samples and, for each reference, 30 test samples. The recorded proportions of perceptible differences ranged from 0.125 to 1.0, and, at each level, estimates were made of the differences ($\Delta J$, $\Delta a_{\mathrm{C}}$, $\Delta b_{\mathrm{C}}$) in ($J$, $a_{\mathrm{C}}$, $b_{\mathrm{C}}$) values. The resulting sample sizes (300) were, however, too small to find a reliable estimate of the differential entropy $h(W)$ by any of the empirical methods mentioned earlier. Fortunately, as the distributions of the differences $\Delta J$, $\Delta a_{\mathrm{C}}$, and $\Delta b_{\mathrm{C}}$ were each approximately Gaussian, the differential entropy $h(W)$ could be estimated by the differential entropy of a trivariate Gaussian variable; that is,

$$h(W) = \tfrac{1}{2} \log\left[(2\pi e)^{3}\, |K|\right],$$

where $|K|$ is the determinant of the covariance of $W$, which was approximately the product of the individual variances $\sigma_{J}^{2}\, \sigma_{a}^{2}\, \sigma_{b}^{2}$, since there was little correlation between $J$, $a_{\mathrm{C}}$, and $b_{\mathrm{C}}$. An estimate $\hat{h}(W)$ of $h(W)$ was also obtained from the reported acceptability of the color differences between the same samples that Wang et al. used for perceptibility judgments. Perceptibility results are presented in detail; acceptability results in summary form.
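The Gaussian approximation can be sketched numerically as follows, assuming negligible correlations so that the determinant of the covariance reduces to the product of the variances. The standard deviations below are hypothetical placeholders, not values from Wang et al. (2009), and the function name `gaussian_entropy_bits` is introduced for illustration.

```python
import math

def gaussian_entropy_bits(variances):
    """Differential entropy (bits) of a Gaussian with diagonal covariance."""
    d = len(variances)
    det_k = math.prod(variances)  # determinant when correlations are negligible
    return 0.5 * math.log2((2 * math.pi * math.e) ** d * det_k)

# Hypothetical standard deviations of the three coordinate differences.
sigmas = (0.25, 0.20, 0.22)
h_w_gaussian = gaussian_entropy_bits([s ** 2 for s in sigmas])
```

Small standard deviations give a negative entropy in these units, which is consistent with the remark that differential entropies may take either sign depending on the scale of measurement.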

$\hat{h}(W)$ of the differential entropy $h(W)$ was also obtained with a hard threshold based on the assumption of a uniform distribution of $W$ over a sphere of radius 0.3, as in Linhares et al. (2008).
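For a uniform distribution the differential entropy is simply the logarithm of the volume of the support, so the hard-threshold value follows directly from the radius. The sketch below assumes a solid three-dimensional ball and logarithms to the base 2; the function name `uniform_ball_entropy_bits` is hypothetical.

```python
import math

def uniform_ball_entropy_bits(radius, dim=3):
    """Differential entropy (bits) of a uniform distribution over a ball,
    i.e. log2 of the ball's volume."""
    log_volume = (
        (dim / 2) * math.log(math.pi)
        - math.lgamma(dim / 2 + 1)
        + dim * math.log(radius)
    )
    return log_volume / math.log(2)

h_w_uniform = uniform_ball_entropy_bits(0.3)  # about -3.14 bits
```

The value agrees with the uniform-distribution entry for $h(W)$ in Table 1.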

$\hat{h}(U)$ and correspondingly lowest and highest estimated mutual information. The difference in the color gamuts of the two images is obvious. The corresponding estimates of the number $N$ of perceptually distinct surface colors were $5.5 \times 10^{2}$ and $4.1 \times 10^{4}$.

$h(U)$ for surfaces, the differential entropy $h(W)$ for observer responses based on Gaussian perceptibility data and on a uniform distribution (equivalent to a hard threshold of 0.3 in CIECAM02), the corresponding mutual information $I(U; V)$ from Equation 5, and the number $N$ of perceptually distinct surface colors from Equation 4.

| Scenes | Surface-color entropy $h(U)$ (bits) | Observer response distribution | Observer entropy $h(W)$ (bits) | Information $I(U; V)$ (bits) | No. of colors $N$ |
|---|---|---|---|---|---|
| Average | 12.53 | Gaussian | −0.29 | 12.82 | $7.3 \times 10^{3}$ |
| Average | 12.53 | Uniform | −3.14 | 15.68 | $5.2 \times 10^{4}$ |
| Union | 15.14 | Gaussian | −0.29 | 15.43 | $4.4 \times 10^{4}$ |
| Union | 15.14 | Uniform | −3.14 | 18.28 | $3.2 \times 10^{5}$ |
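The entries of the table can be cross-checked with a few lines of arithmetic. The sketch assumes that the information is estimated as the difference between the surface-color and observer entropies, both in bits, and that the number of colors is 2 raised to that information; the function name `number_of_colors` is introduced here for illustration.

```python
def number_of_colors(h_u_bits, h_w_bits):
    """Return (I, N), with I = h(U) - h(W) in bits and N = 2**I."""
    information = h_u_bits - h_w_bits
    return information, 2.0 ** information

# (scenes, observer distribution) -> (h(U), h(W)), copied from the table.
table_rows = {
    ("Average", "Gaussian"): (12.53, -0.29),
    ("Average", "Uniform"): (12.53, -3.14),
    ("Union", "Gaussian"): (15.14, -0.29),
    ("Union", "Uniform"): (15.14, -3.14),
}

recomputed = {
    label: number_of_colors(h_u, h_w) for label, (h_u, h_w) in table_rows.items()
}
for label, (info, n_colors) in recomputed.items():
    print(label, round(info, 2), f"{n_colors:.2e}")
```

Because the tabulated entropies are rounded to two decimal places, recomputed values can differ from the published ones in the last reported digit.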

^{3} and $4.4 \times 10^{4}$ for the average and union of the scenes, respectively.

$h(W)$ was −0.49 bits, i.e., 0.20 bits less than the −0.29 bits for the perceptibility data (Table 1). The estimated number of perceptually distinct surface colors was $8.3 \times 10^{3}$ and $5.1 \times 10^{4}$ for the average and union of the scenes, respectively, an increase of 14%.

$f_U$ used to estimate surface-color entropy $h(U)$, to mirror the estimates based on uniform distributions reported by Linhares et al. (2008) and Pointer and Attridge (1998). With the same method of estimation based on the offset Kozachenko–Leonenko estimator, the differential entropy $h$(

^{5} and $3.1 \times 10^{6}$ discriminable colors, bracketing the value of $2.3 \times 10^{6}$ obtained by Linhares et al. (2008) and Pointer and Attridge (1998) for the union over scenes.

^{4} and $2.8 \times 10^{5}$ discriminable colors, also bracketing the value of $2.7 \times 10^{5}$ obtained by Linhares et al. (2008) for the average over scenes.

^{3}, more than an order of magnitude lower than the $2.7 \times 10^{5}$ discriminable colors reported by Linhares et al. (2008), with almost the same set of scenes. When these numbers were obtained not as averages over individual scenes but accumulated over the union of all 50 scenes, a similar disparity occurred. The number of perceptually distinct surface colors was $4.4 \times 10^{4}$ and the number of discriminable colors was $2.3 \times 10^{6}$ (Linhares et al., 2008).

*d*′ from signal-detection theory (Macmillan & Creelman, 2005).

$h(W)$ in Equation 3 is larger (Cover & Thomas, 1991) than $h(V \mid U)$ in Equation 2.

$E^{\ast}$ or $\Delta E_{00}$.

$f_U$ of surface colors with a particular illuminant.

^{5}. It is interesting that even with this extreme illuminant change, this number is much larger than the number $7.3 \times 10^{3}$ of perceptually distinct surface colors, suggesting that color constancy, or the lack of it, does not generally determine the extent to which surfaces may be identified by their color in natural scenes under different illuminants.