We investigate how the amount of information about colors in natural scenes available to the visual system depends on the spectral sensitivities of the three types of cones. We find that if we do not consider spatial information and low signal-to-noise situations, human cone spectral sensitivity curves do not provide the maximum possible information. This applies not only to information about all colors in natural scenes, but equally to information about colors of edible fruit. However, a significant increase in color information could only be obtained if the L-cone was sensitive to even longer wavelengths, at the expense of a reduction in spatial acuity and in the information available in dim lighting conditions.

Introduction

The color vision of humans and other Old World primates differs from that of most other animals. Primate color vision is trichromatic, in contrast to almost all other mammals, which are dichromats (Vorobyev, 2004). However, primate vision also differs from that of other animals (including some birds, fish, and insects) which have trichromatic or even tetrachromatic vision, in that two of the cones in the primate retina have a large degree of overlap in their sensitivities; the M- and L-cones have peak sensitivities at about 540 and 565 nm, respectively, whereas the S-cone sensitivity peaks at about 445 nm (Figure 1). These sensitivities show hardly any variation across all Old World primate species for which they have been measured, which is strong evidence that they are in fact an evolutionary adaptation (Deeb, Jorgensen, Battisti, Iwasaki, & Motulsky, 1994; Jacobs & Deegan, 1999).

Figure 1

Trichromacy offers advantages over dichromacy in almost any aspect of color vision—trichromats are generally better at detecting, segregating, and identifying colored objects (Mollon, 1989). However, it is generally thought to be most likely that trichromacy first arose in primates as an adaptation to finding colored food—fruit and/or edible leaves (Dominy & Lucas, 2001; Mollon, 1989; Osorio & Vorobyev, 1996). It is more difficult to understand why it should be advantageous to have a large overlap between the M- and L-cones' sensitivities. This overlap must lead to a high degree of correlation in the activities of M- and L-cones, which in turn means that much of the information in the cones' activities is redundant. It has been suggested that the spectral sensitivities of the cones are optimized for finding food (Osorio & Vorobyev, 1996), or more generally for detecting colored targets against a background of leaves (Sumner & Mollon, 2000a, 2000b). Another possibility is that the large overlap between L- and M-cones is better for spatial vision, and this advantage outweighs the disadvantage for color vision (Osorio, Ruderman, & Cronin, 1998).

To evaluate these hypotheses, it is first necessary to establish whether the assumption that the cone sensitivities do not deliver as much information about color as possible is correct. While it is true that a large overlap leads to M- and L-cone signals being correlated, to determine the effect of this on the information encoded by the retina, we need to take account of the statistical properties of the color signals in natural scenes. In the next part of the paper, we will derive estimates of the statistics of the color signal in natural scenes and describe how these statistics together with the cone sensitivity curves determine how much information about the color environment is encoded in the cone responses. Three sets of statistics are estimated, corresponding to three types of data on surface reflectances. Then, we determine how the amount of information about colors depends on the spectral tuning of each of the three cone types, for these three sets of statistics, and for various illuminants. We will find that this amount of information would be significantly increased if the L-cone were sensitive to even longer wavelengths. However, there would be a much smaller increase, if any, if the M- or S-cones were sensitive to shorter wavelengths, although this would reduce the overlap of the spectral sensitivity curves. We also show the results of a similar computation for dichromats, with only two types of cones; we find that the human L-cones are close to optimal for dichromats.

This work is an example of the use of information theory to understand the design of the early visual system. This approach has been fruitfully applied to several other aspects of the early visual system—see, for example, our earlier work on retinal and V1 receptive fields (Zhaoping, 2002) or retinal cone distributions (Lewis, Garcia, & Zhaoping, 2003); for further examples, see Sterling (2004).

Natural spectra and cone responses

To calculate how much information about colors in the environment is given by the cones, we first need to know how the light in natural scenes is distributed.

The spectra *C*(*λ*) seen by the eye depend on both the illuminant, *E*(*λ*), and the surface reflectances, *S*(*λ*), in the environment. We will assume that in any given natural scene there is only one illuminant, but of course there will be a wide range of reflectances, giving rise to a range of colors. The illuminant spectrum is therefore the simpler part to understand and to model, and for the most part we will take it to be the standard CIE daylight D65. However, to examine the effect of the illuminant on our results, we will also use CIE daylight spectra with correlated color temperature (CCT) of 4,000 K and 20,000 K (D65 has a CCT of approximately 6,500 K), and two spectra which were recorded in shade in a tropical rainforest (http://vision.psychol.cam.ac.uk/spectra/spectra.html, see Sumner & Mollon, 2000a, 2000b; Figure 2). When we calculate the information in cone responses, we will assume that the illuminant spectrum is known so that color constancy is perfect. This is because the visual system can use information from the entire visual field to determine the illuminant, hence it is not necessary to estimate it independently for every colored surface.

Figure 2

There are two types of data which can be used to estimate the distribution of reflectances in natural scenes. These are measurements of the reflectances of a variety of objects and hyperspectral images of natural scenes. As each of these has different advantages and disadvantages, we will use each type of data to obtain independent estimates of the natural statistics. We then determine how the information in cone responses depends on the wavelengths of the peak sensitivities of the cones, given those statistics.

Statistics of surface reflectances

A useful way to describe any reflectance by a small number of parameters is as a linear model (Maloney, 1986). We found the best fitting linear model for each set of reflectance data (described below) by principal component analysis and used the first three principal components. The reflectance *S*(*λ*) of any surface is then characterized by three weights, *W*_{i}, *i* = 1 … 3:

$S(\lambda) \approx \bar{S}(\lambda) + \sum_{i=1}^{3} W_i \, s^{i}(\lambda),$

(1)

where $\bar{S}$(*λ*) is the mean reflectance, and *s*^{i}(*λ*) is the *i*th principal component.
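The linear model of Equation 1 can be sketched numerically as follows. This is a minimal illustration only: the reflectance array is synthetic stand-in data, and the function name `fit_linear_model` and the toy basis used to generate the data are assumptions made for this example, not part of the original analysis.

```python
import numpy as np

def fit_linear_model(reflectances, n_components=3):
    """Fit S(lambda) ~ mean + sum_i W_i s^i(lambda) by PCA.

    reflectances: array of shape (n_surfaces, n_wavelengths).
    Returns (mean, components, weights, explained_fraction).
    """
    mean = reflectances.mean(axis=0)
    centered = reflectances - mean
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    weights = centered @ components.T          # W_i for every surface
    variances = singular_values ** 2
    explained = variances[:n_components].sum() / variances.sum()
    return mean, components, weights, explained

# Synthetic stand-in data (hypothetical, NOT measured reflectances).
rng = np.random.default_rng(0)
wavelengths = np.arange(400, 701, 10)
x = (wavelengths - 400) / 300.0
basis = np.stack([np.ones_like(x), x, np.sin(np.pi * x)])
true_weights = rng.normal(size=(200, 3)) * np.array([0.15, 0.10, 0.05])
noise = rng.normal(scale=0.005, size=(200, x.size))
reflectances = 0.5 + true_weights @ basis + noise

mean, components, weights, explained = fit_linear_model(reflectances)
reconstruction = mean + weights @ components   # Equation 1
```

Here `explained` plays the role of the fraction of variance captured by the first three principal components, and `weights` holds the three numbers that summarize each surface.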
In all three of the sets of data used (see below), the first three principal components account for approximately 98% of the variance in the data. Thus, the probability distribution of the weights *W*_{i} captures almost all of the statistical information about reflectances. It is easy to see that this distribution is not a normal distribution (see Figure 4). This is a consequence of the fact that *S*(*λ*) must lie between 0 and 1 for all values of *λ*, but values close to 0 or 1 occur with significant frequency. We therefore model the distribution of *W*_{i} as a truncated Gaussian distribution, as in Brainard and Freeman (1997), to take account of this constraint. The probability density function *p*(*W*_{1}, *W*_{2}, *W*_{3}) = 0 if *S*(*λ*) < 0 or *S*(*λ*) > 1 for any *λ*; otherwise,

$p(W_1, W_2, W_3) = N \exp\left(-\sum_{i=1}^{3} \frac{(W_i - m_i)^2}{2 v_i}\right),$

(2)

with *N* chosen so that ∫ *p*(*W*_{1}, *W*_{2}, *W*_{3}) d*W*_{1} d*W*_{2} d*W*_{3} = 1. The *m*_{i} and *v*_{i} are chosen so that the mean and variance of *W*_{i} given by the truncated distribution are equal to the mean (i.e., 0) and variance of *W*_{i} in the original data.

Three sets of data were used to obtain distributions of reflectances in this way:

- Reflectances of 170 objects, measured by Vrhel, Gershon, and Iwan (1994). In using these data, we are assuming that these objects are a representative sample of surfaces in natural scenes.
- Hyperspectral images of four natural rural scenes, from Nascimento, Ferreira, and Foster (2002). In this case, each pixel yields a reflectance, and we used 10,000 pixels chosen randomly from the images. Note that although the original image data in this case were the radiance rather than the reflectance for each pixel, these radiances were converted to reflectances in Nascimento et al. Thus, this data set is directly comparable to the other two, although it was not measured in the same way. These are rural scenes dominated by leaves and grass.
- Four hundred eighty-two reflectances of fruit eaten by primates in Uganda, measured by Sumner and Mollon (2000a, 2000b) (http://vision.psychol.cam.ac.uk/spectra/spectra.html). We use the distribution derived from these data to see whether the cone sensitivities could be optimally adapted specifically for giving information about fruit, instead of for optimal color vision for the whole environment.
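One simple way to draw samples from a truncated Gaussian of the kind defined in Equation 2 is rejection sampling: draw weights from the untruncated Gaussian and discard any draw whose reflectance leaves the interval [0, 1]. The sketch below assumes this approach for illustration; the mean reflectance, the orthonormal components, and the values of `m` and `v` are hypothetical and are not the fitted values for any of the three data sets.

```python
import numpy as np

def sample_truncated_weights(mean_refl, components, m, v, n_samples, rng):
    """Draw W from the Gaussian of Equation 2, rejecting any draw whose
    reflectance S(lambda) falls outside [0, 1] at some wavelength."""
    accepted = []
    n_accepted = 0
    while n_accepted < n_samples:
        w = rng.normal(loc=m, scale=np.sqrt(v), size=(4 * n_samples, len(m)))
        s = mean_refl + w @ components            # Equation 1
        ok = np.all((s >= 0.0) & (s <= 1.0), axis=1)
        accepted.append(w[ok])
        n_accepted += int(ok.sum())
    return np.concatenate(accepted)[:n_samples]

# Hypothetical inputs, for illustration only.
rng = np.random.default_rng(1)
wavelengths = np.arange(400, 701, 10)
x = (wavelengths - 400) / 300.0
basis = np.stack([np.ones_like(x), x, np.sin(np.pi * x)])
q, _ = np.linalg.qr(basis.T)                      # orthonormal columns
components = q.T
mean_refl = np.full(x.size, 0.3)
m = np.zeros(3)
v = np.array([1.0, 0.2, 0.05])

samples = sample_truncated_weights(mean_refl, components, m, v, 5000, rng)
```

Because the truncation is asymmetric here (the mean reflectance is closer to 0 than to 1), the mean of `samples[:, 0]` is pulled away from `m[0]`. This is the effect that makes it necessary to tune *m*_{i} and *v*_{i} so that the truncated distribution matches the mean and variance of the data.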

The first three principal components *s*^{i}(*λ*) and their variances for each set of data are shown in Figure 3 and Table 1. The three sets of data have distinctly different statistics. The reflectance data of Vrhel et al. (1994) have the largest overall variance, indicating that they include a wide variety of surfaces, whereas the fruit reflectances of Sumner and Mollon (2000a, 2000b), which include many similar objects, have lower variance. The reflectances from the hyperspectral images have even lower variance than the fruit: most of the scenes are made up of green plants, hence there is relatively little variation in color (Figure 4).

Table 1

Data set | 1st PC variance | 1st PC *m*_{i} | 1st PC *v*_{i} | 2nd PC variance | 2nd PC *m*_{i} | 2nd PC *v*_{i} | 3rd PC variance | 3rd PC *m*_{i} | 3rd PC *v*_{i}
---|---|---|---|---|---|---|---|---|---
(i) Vrhel et al. (1994) objects | 0.94 (81%) | −1.0 | 1.8 | 0.14 (12%) | −0.18 | 0.17 | 0.052 (4.4%) | −0.050 | 0.053
(ii) Nascimento et al. (2002) hyperspectral images | 0.25 (89%) | −1.3 | 0.96 | 0.021 (7.4%) | −0.004 | 0.02 | 0.006 (2.3%) | −0.022 | 0.007
(iii) Sumner and Mollon (2000a, 2000b) fruit | 0.32 (72%) | −0.06 | 0.38 | 0.1 (24%) | −0.016 | 0.11 | 0.008 (1.8%) | −0.002 | 0.008

Figure 3

Figure 4

The shapes of the principal components (Figure 3) show which part of the spectrum has the most variation in each case. For the fruit spectra, the variation is concentrated in the long wavelength part of the spectra, where all three components take large (positive or negative) values; there is very little variation at short wavelengths, where all three PCs are small. The reflectances of the wide range of objects in the data of Vrhel et al. (1994) vary in all parts of the spectrum, although the first PC has its largest values at long wavelengths. The green-plant-dominated hyperspectral images lie in between, with most variation at long wavelengths but significant variation at around 550 nm as well.

Finally, the relative variance of the three PCs is different in the three cases. For the hyperspectral images, the first PC accounts for almost 90% of the variance. Therefore, a single cone type (or a single luminance channel) would suffice for almost all of the information about spectra in those types of scenes. In contrast, for the fruit spectra, the second PC contributes almost one quarter of the variance, hence a much greater proportion of the information about the spectra is only available to a visual system with two or more cones. Further, because there is so little variance at short wavelengths, we can see that an L-cone and an S-cone alone would not be able to determine the spectra accurately because the S-cone is most sensitive to wavelengths around 450 nm, but fruit reflectances vary mainly at longer wavelengths (unless the illuminant has so much more power at short wavelengths than at long wavelengths that even the light reflected by red fruit varies mainly at short wavelengths). These results therefore illustrate why trichromatic vision is advantageous for identifying edible fruit, and hint that there might be no great advantage in trichromacy for vision in general in some environments, for animals that do not eat fruit.

Cone responses

The mean quantum catch *Q*_{i} of a cone of type *i* = 1, 2, 3 for L-, M-, and S-cones, respectively, for any stimulus *C*(*λ*), is given by

$Q_i = \int_{400}^{700} C(\lambda) R_i(\lambda) \, d\lambda,$

(3)

where *R*_{i}(*λ*) is the spectral sensitivity of the cone. We take the stimulus *C*(*λ*) to be a product of an illuminant *E*(*λ*) and a reflectance *S*(*λ*), that is, *C*(*λ*) = *E*(*λ*)*S*(*λ*).^{1} When the reflectance is given by Equation 1, the resulting quantal catches can be conveniently written in matrix form

$\vec{Q} - \bar{\vec{Q}} \equiv \begin{pmatrix} Q_1 - \bar{Q}_1 \\ Q_2 - \bar{Q}_2 \\ Q_3 - \bar{Q}_3 \end{pmatrix} = \begin{pmatrix} K_{11} & K_{12} & K_{13} \\ K_{21} & K_{22} & K_{23} \\ K_{31} & K_{32} & K_{33} \end{pmatrix} \begin{pmatrix} W_1 \\ W_2 \\ W_3 \end{pmatrix} \equiv \mathbf{K} \vec{W},$

(4)

where

$K_{ij} = \int E(\lambda) R_i(\lambda) s^{j}(\lambda) \, d\lambda$

(5)

and $\bar{Q}_i = \int_{400}^{700} E(\lambda) \bar{S}(\lambda) R_i(\lambda) \, d\lambda$ is the mean output of a cone of type *i* in the illuminant *E*(*λ*).

To determine how the cone responses depend on the tuning of the cones, we determined the values of **K** using the spectral sensitivity curves of Stockman and Sharpe (2000), and also for hypothetical sensitivity curves obtained by keeping the shapes of the curves fixed but moving the entire curve to higher or lower wavelengths, thus changing the wavelength where the cone has its peak sensitivity, which we denote $\lambda_{\max}^{L}$, $\lambda_{\max}^{M}$, or $\lambda_{\max}^{S}$. We therefore obtain a different matrix **K** for each set of cone sensitivity curves and for each set of reflectance principal components.

The output *O*_{i} of a cone is proportional to the quantum catch *Q*_{i} at any fixed light level, but with a gain which depends on the overall level of illumination. Adapting to the illumination in this way ensures that the cone uses the full range of output current in illuminations that vary in intensity over several orders of magnitude (see, e.g., Cohn, 2004); the quantum catch can vary from approximately 10^{2} to 10^{6} photo-isomerizations per integration time. The gain therefore has to be higher when the intensity of light is lower. It is reasonable to assume that the gain is set so as to ensure that the output has the maximum possible dynamic range. Then, the change in gain will be such as to keep the signal-to-noise ratio constant whenever possible. There are also various sources of noise, which are important at different levels of illumination (Cohn, 2004). Dark noise (*N*_{D}) is the background noise in the cone that exists even if there is no illumination, and it is independent of *Q*_{i}. Quantum noise [*N*_{P}(*Q*_{i})] arises from the quantum fluctuations in the number of photons detected and has a Poisson distribution, hence the variance is equal to *Q*_{i}. We group all other sources of noise, which occur at later stages in the generation of the cone's output current, together as “gain-control noise” (*N*_{G}), the relative importance of which is determined by the magnitude of the gain, and hence by the level of illumination. We can summarize this by representing the full cone output *O*_{i} as

$O_i = G_i \left[ Q_i + N_P(Q_i) + N_D \right] + N_G,$

(6)

where *G*_{i} is the gain. The three sources of noise have zero mean and standard deviations *σ*_{D}, *σ*_{P} = $\sqrt{Q_i}$, and *σ*_{G}, respectively, hence the probability distribution for the cone output *O*_{i}, given $\vec{W}$, can be written as

$p(O_i | \vec{W}) = \frac{1}{\sqrt{2\pi} \left[ G_i (\sigma_P + \sigma_D) + \sigma_G \right]} \times \exp\left[ -\frac{(O_i - G_i Q_i)^2}{2 \left[ G_i (\sigma_P + \sigma_D) + \sigma_G \right]^2} \right].$

(7)
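Equations 3 to 6 can be illustrated with a short numerical sketch. Everything specific in it is an assumption made for illustration: the cone sensitivities are Gaussian-shaped curves (not the Stockman and Sharpe curves) with a hypothetical width of 40 nm, the illuminant is flat, and the mean reflectance and components are toy functions. Only the peak wavelengths of 565, 540, and 445 nm are taken from the Introduction.

```python
import numpy as np

wavelengths = np.arange(400.0, 701.0, 5.0)
dlam = wavelengths[1] - wavelengths[0]

def cone_sensitivity(lambda_max, width=40.0):
    """Hypothetical Gaussian-shaped sensitivity; changing lambda_max shifts
    the whole curve along the wavelength axis without changing its shape."""
    return np.exp(-0.5 * ((wavelengths - lambda_max) / width) ** 2)

def integrate(f):
    """Riemann-sum approximation of the integral over 400-700 nm."""
    return np.sum(f, axis=-1) * dlam

def k_matrix(illuminant, sensitivities, components):
    """K_ij = integral of E(lambda) R_i(lambda) s^j(lambda) (Equation 5)."""
    return integrate(illuminant * sensitivities[:, None, :] * components[None, :, :])

def cone_output(q, gain, sigma_d, sigma_g, rng):
    """One noisy sample of Equation 6, with photon noise approximated as
    Gaussian with variance equal to the quantum catch."""
    n_p = rng.normal(scale=np.sqrt(q))
    n_d = rng.normal(scale=sigma_d, size=q.shape)
    n_g = rng.normal(scale=sigma_g, size=q.shape)
    return gain * (q + n_p + n_d) + n_g

# Toy inputs (all hypothetical).
x = (wavelengths - 400.0) / 300.0
illuminant = np.full(wavelengths.size, 10.0)
mean_refl = np.full(wavelengths.size, 0.4)
components = np.stack([np.ones_like(x), x - 0.5, np.sin(2 * np.pi * x)])
sensitivities = np.stack([cone_sensitivity(p) for p in (565.0, 540.0, 445.0)])

W = np.array([0.1, -0.05, 0.02])
S = mean_refl + W @ components                       # Equation 1
Q = integrate(illuminant * S * sensitivities)        # Equation 3
Q_mean = integrate(illuminant * mean_refl * sensitivities)
K = k_matrix(illuminant, sensitivities, components)  # Equation 5

rng = np.random.default_rng(2)
O = cone_output(Q, gain=1.0, sigma_d=1.0, sigma_g=0.5, rng=rng)
```

Because the quantum catch is linear in the reflectance, `Q - Q_mean` equals `K @ W` up to rounding error, which is the content of Equation 4. Recomputing `K` with a different `lambda_max` for one cone corresponds to the hypothetical shifted sensitivity curves described above.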

In Equation 7, we have used the fact that cone quantal catches are large enough that the Poisson distribution for *N*_{P} can be approximated as a Gaussian, but with a standard deviation that depends on *Q*_{i}, that is, *σ*_{P} = $\sqrt{Q_i}$. We will consider three conditions, in each of which one source of noise is assumed to be dominant, so that the other two may be neglected. In very dim light, the dark noise will be dominant:

*σ*_{D} >> *σ*_{P} >> *σ*_{G}/*G*_{i}. However, this situation applies only when the illumination is close to the threshold for cones to be active at all. At higher intensities, when more photons enter each cone, the quantum fluctuations in the number of photons detected will be much larger than *N*_{D}, hence *N*_{P} will be dominant: *σ*_{P} >> *σ*_{D}. At very high intensities, *G*_{i} becomes relatively small, ensuring that the cone output does not saturate, and quantum fluctuations become insignificant so that *σ*_{G} >> *G*_{i}*σ*_{P}, and *N*_{G} will be dominant. The exact ranges of intensities where each condition applies are not clear, but as we will see, both *N*_{P} and *N*_{G} lead to similar results, hence the distinction is not critical.

As we vary *λ*_{max}, the wavelength where the cone has its peak sensitivity, the signal-to-noise ratio for each type of noise will vary in a different way. As the cone sensitivity curve changes, the variance of the quantum catch, which we denote *V*_{Q}, will in turn vary: if the cone is sensitive to a part of the spectrum for which the principal components are large, then the variance *V*_{Q} will be correspondingly large. In bright light when *N*_{G} dominates, the signal-to-noise ratio is *G*_{i}$\sqrt{V_Q}$/*σ*_{G}. However, we expect the gain of the cone response to adapt to compensate for such a change in *V*_{Q}, so that the signal-to-noise ratio will be independent of the change in the cone sensitivity. In contrast, at lower light levels when quantum noise is most important, the variance of the noise is simply equal to *Q* and the signal-to-noise ratio is $\sqrt{V_Q}$/*σ*_{P} = $\sqrt{V_Q / Q}$. An increase in *V*_{Q} will therefore tend to lead to a higher signal-to-noise ratio. In particular, increasing the overall level of illumination by a fixed factor, so that the mean quantum catch increases from *Q* to *Q*′, will increase the variance *V*_{Q} by a factor (*Q*′/*Q*)^{2}, hence the signal-to-noise ratio will increase by a factor of $\sqrt{Q'/Q}$. Thus, when quantum noise dominates, an increase in intensity results in a higher signal-to-noise ratio. In contrast, *N*_{G} is dominant when the light is bright enough that further increasing the intensity of illumination brings no improvement in the signal-to-noise ratio. Finally, the dark noise is assumed to be independent of the cone sensitivity, hence the signal-to-noise ratio when dark noise dominates is $\sqrt{V_Q}$/*σ*_{D}, which will increase with *V*_{Q}.

Information in cone responses

If the cone sensitivities have been adapted to optimally identify surfaces by their reflectance, with no constraints on the possible values of *λ*_{max} of the cones except that they lie between 400 and 700 nm, we would expect the cone outputs to give the maximum possible information about the reflectance in natural scenes. That is, the cone sensitivities would maximize the mutual information between cone outputs $\vec{O}$ and color inputs $\vec{W}$. The mutual information is given by (e.g., Cover & Thomas, 1991):

$I = \int p(\vec{O}, \vec{W}) \log\left[ \frac{p(\vec{W}, \vec{O})}{p(\vec{W}) \, p(\vec{O})} \right] d\vec{W} \, d\vec{O},$

(8)

where *p*($\vec{W}$), *p*($\vec{O}$), and *p*($\vec{W}$, $\vec{O}$) are, respectively, the probability distributions of $\vec{W}$, of $\vec{O}$, and of the two jointly. *p*($\vec{W}$) is given by Equation 2, whereas *p*($\vec{O}$, $\vec{W}$) = *p*($\vec{W}$) *p*($\vec{O}$|$\vec{W}$), with *p*($\vec{O}$|$\vec{W}$) given by Equation 7, and *p*($\vec{O}$) = ∫ *p*($\vec{O}$, $\vec{W}$) d$\vec{W}$. The value of *I* obtained is independent of the value of the gain *G*_{i} when quantum noise or dark noise is dominant, hence for simplicity we set *G*_{i} = 1 in that case. When the gain-dependent noise dominates, *I* depends only on the ratio *σ*_{G}/*G*_{i}, hence we can still set *G*_{i} = 1 and scale *σ*_{G} so as to keep the signal-to-noise ratio constant.

We are mainly interested in the variation of *I* as the values of $\lambda_{\max}^{\{L,M,S\}}$ change, rather than in the absolute value of *I*. For each of our three estimates of the probability distribution of $\vec{W}$, we therefore computed the difference between the values of *I* given by the Stockman and Sharpe (2000) cone spectral sensitivities, and the values obtained when one of $\lambda_{\max}^{L}$, $\lambda_{\max}^{M}$, or $\lambda_{\max}^{S}$ is varied, without changing the other two cones. Ideally, we would instead allow all three cones to vary independently, so that we could find the set(s) of values of $\lambda_{\max}^{\{L,M,S\}}$ which give the maximum possible information. This can be done if we make some additional approximations (see 1).

To understand the significance of the changes in *I* shown in the following sections, it may be helpful to discuss how *I* is related to the discriminability of surface colors. The ecologically relevant task is to discriminate between two surfaces with different reflectances (rather than, for instance, between equiluminant stimuli, or between monochromatic light with different wavelengths). We therefore consider the threshold for discrimination between two surfaces, which is the distance between the two stimuli (measured in, e.g., MacLeod–Boynton color coordinates, or simply using the values of $\vec{W}$ as color coordinates), for which the difference in mean cone outputs is equal to the spread of the outputs.

The simplest way in which both *I* and discriminability can be changed is if the variance of the noise is increased or reduced (e.g., by changing the illuminant intensity in the quantum noise regime). In this case, it turns out (under certain assumptions) that the minimum discriminable differences between surface reflectances fall exponentially as *I* increases, as 2^{−*I*/3} (see 2). So, for example, if information is increased by 1 bit, the smallest discriminable color differences are reduced by 21%, whereas an increase of 0.1 bit only reduces the discriminable color differences by 2%. (Thresholds for discriminating equiluminant stimuli would scale as 2^{−*I*/2}, and for monochromatic light as 2^{−*I*}, provided that the visual system is able to make use of the knowledge that the stimuli are restricted in such a way.)

In the following we will show the changes in *I* when the noise is not changed but the spectral sensitivities of the cones are. In this case, rather than a uniform increase or decrease in discrimination thresholds, there will be some pairs of colors which become easier to discriminate and some which become more difficult. *I* can then be thought of as a statistical measure of how the thresholds will change for discrimination between typical pairs of colors from natural scenes.

Results

The variation of *I* with $\lambda_{\max}^{\{L,M,S\}}$ in D65 light is shown in Figures 5, 6, 7, and 8. Figure 9 shows that the results depend very little on the choice of illuminant. The absolute value of *I* is strongly dependent on the level of noise. The signal-to-noise ratio increases when the illumination becomes brighter, up to the point where quantum noise is unimportant. The signal-to-noise ratio can also be increased when the visual system can sum the output of nearby cones of the same type, at the expense of lowering the spatial resolution or spatial frequency cutoff of visual information (see Discussion). Even at a single level of illumination, there may therefore be different levels of signal to noise for different spatial frequencies. We therefore show only the variation of the information with the sensitivities. However, this variation is all that is needed to determine whether the information content is maximized. (In the following we set the signal-to-noise ratio to 10 in the gain-controlled noise-dominated regime. In the quantum noise-dominated regime, we set the maximum quantum catch to 10^{3} for the Stockman and Sharpe (2000) spectral sensitivities. The quantum catch is allowed to vary as the spectral sensitivities change, whereas in the gain-control regime the signal-to-noise ratio is kept constant. The total information is approximately 7 bits for the data of Vrhel et al. (1994), 6 bits for the Nascimento et al. (2002) images, or 5.5 bits for the Sumner and Mollon (2000a, 2000b) data for this level of quantum noise, but these numbers depend strongly on our arbitrary assumptions about noise. The changes in *I* shown in the figures do not depend on the noise unless the signal-to-noise ratio is much lower, as when cones are near threshold; see 1.)
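To give a feel for the magnitudes involved, the sketch below evaluates the mutual information in a simplified setting where it has a closed form. It assumes, purely for illustration, that both the weights and the noise are Gaussian and that the cone outputs are a linear function of the weights plus noise; the truncated distribution of Equation 2 and the signal-dependent noise of Equation 7 are not modeled. The function names and the toy matrices are hypothetical. The second function simply evaluates the 2^{−*I*/3} relation between information and discrimination thresholds quoted earlier.

```python
import numpy as np

def gaussian_information_bits(K, weight_variances, noise_sd):
    """Mutual information in bits between W and O = K W + noise when W and
    the noise are independent Gaussians:
    I = 0.5 * log2 det(Identity + K C_W K^T / noise_sd**2)."""
    signal_cov = K @ np.diag(weight_variances) @ K.T
    m = np.eye(K.shape[0]) + signal_cov / noise_sd ** 2
    _, logdet = np.linalg.slogdet(m)
    return 0.5 * logdet / np.log(2.0)

def threshold_factor(delta_bits):
    """Factor by which surface-color discrimination thresholds change when
    the information changes by delta_bits, using the 2**(-I/3) relation."""
    return 2.0 ** (-delta_bits / 3.0)

# One channel with an amplitude signal-to-noise ratio of 10.
single_channel = gaussian_information_bits(np.eye(1), [100.0], 1.0)

# Two channels with equal gain: independent versus strongly overlapping.
K_separated = np.eye(2)
K_overlapping = np.array([[1.0, 0.0], [0.9, np.sqrt(1.0 - 0.9 ** 2)]])
info_separated = gaussian_information_bits(K_separated, [100.0, 100.0], 1.0)
info_overlapping = gaussian_information_bits(K_overlapping, [100.0, 100.0], 1.0)

reduction_1_bit = 1.0 - threshold_factor(1.0)      # about 0.21
reduction_01_bit = 1.0 - threshold_factor(0.1)     # about 0.02
```

In this toy setting the overlapping pair of channels carries less information than the independent pair, which is the redundancy argument made in the Introduction. Whether the same holds for real cones depends on the statistics of natural reflectances, which is what the full calculation addresses.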

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

S-cone sensitivity

Figure 5 shows the dependence of *I* on $\lambda_{\max}^{S}$ when the spectral sensitivity curves of the M- and L-cones are fixed, for each of the three sets of reflectance data in D65 light. The information obtained with the actual S-cone sensitivity is within 0.1 bit of the maximum achievable by changing the S-cone sensitivity. Although the results suggest that, in the gain-control regime, a small increase in information would be obtained if the S-cones were sensitive to even shorter wavelengths, this does not take into account the reduction in the number of photons detected that would occur due to absorption by the lens and macular pigment, which is very high at shorter wavelengths (Wandell, 1995). Even without taking this into account, there is a loss of information in the quantum noise regime if the S-cone becomes sensitive to shorter wavelengths because the quantum catch is reduced (Figure 2).

M- and L-cone sensitivities

If the M-cone peak is varied while keeping it at shorter wavelengths than the L-cone peak, the maximum information is obtained when the separation between M- and L-cones is 10 or 20 nm larger than it is in primates, depending on which set of reflectance statistics we use (Figure 6). However, this local optimum position of the M-cone still gives a considerable overlap between M- and L-cones, rather than being midway between the L- and S-cones. The increase in information given by a shift of the M-cone to shorter wavelengths is only about 0.1–0.2 bits, much less than that given by moving the L-cone to longer wavelengths.

If we allow the M-cone to be sensitive to longer wavelengths than the L-cone, the information available increases dramatically, by 1 bit or more. Similarly, an increase in information is seen when the L-cone is sensitive to longer wavelengths (Figure 7). Thus, the maximum information is obtained when one of the cones has its peak sensitivity at a very long wavelength; in the gain-control regime, this is at as long a wavelength as possible. In the quantum noise regime, the maximum information is obtained when *λ*_{max} is increased by about 100 nm, but the decrease at even longer wavelengths may be due to our assumption that there is no illuminant power above 700 nm, which causes the quantum catch to fall when *λ*_{max} is close to 700 nm (Figure 2).

*I* is reduced if the M- and L-cones are closer together. We expect a minimum of the information to occur when $\lambda_{\max}^{M}$ = $\lambda_{\max}^{L}$, as the correlation between L- and M-cone outputs is then greatest. This is seen when $\lambda_{\max}^{M}$ is increased (Figure 6), but not always when $\lambda_{\max}^{L}$ is reduced (Figure 7). This may be because the maximum overlap between the two sensitivity curves would occur if $\lambda_{\max}^{M}$ or $\lambda_{\max}^{L}$ was changed by 25 nm, but we have only moved the curves by multiples of 10 nm, or because reducing $\lambda_{\max}^{L}$ increases the correlation between the L- and S-cones.

Effect of illuminant spectrum

The above results were all obtained under the assumption that the illuminant has the spectral power distribution of D65 daylight. However, Figure 9 shows that we obtain very similar results for other natural illuminants. The spectral sensitivities of the M- and L-cones needed to give maximum information are the same for all of the illuminants used. The optimal *λ*_{max} for the S-cone does depend on the illuminant in the quantum noise regime, but it is always within 20 nm of the *λ*_{max} for the Stockman and Sharpe (2000) sensitivity curve.

As can be seen from Figure 2, the quantum catch of a cone varies most strongly between illuminants when the cone is sensitive to short wavelengths. It is therefore unsurprising that we see the most variation with illuminant for the S-cone. We also do not expect to see the illuminant having an effect in the gain-control regime because the changes in gain compensate for variations in the average quantum catch. The information then depends only on the variation within a scene, which is primarily due to the variations in reflectance.

Optimal spectral sensitivities for dichromats

The above results suggest that to get maximum information about colors, one cone should be sensitive to the longest wavelengths. This leads us to ask whether the same is true for species which have only an S- and an L-cone, with no M-cone. This question is interesting for two reasons. Firstly, many New World monkeys are dichromatic, but their visual environment is, presumably, statistically similar to that of Old World monkeys, hence we might expect their cone sensitivities to be influenced by the same factors. Secondly, it is important to know the answer to this question to guide us in searching for explanations of our results for trichromats. The main feature of our results so far is that the L-cone needs to be sensitive to much longer wavelengths to give maximum information. We will therefore consider in the Discussion what costs sensitivity to longer wavelengths might have for vision, and whether these costs could be larger than the benefit of increased information about colors. Some of these costs, such as increased chromatic aberration, will apply only to trichromats (because dichromatic monkeys have only one type of cone in the fovea, where cone separations are comparable to aberration), whereas others, such as increased diffraction, will apply to dichromats as well as trichromats. If dichromats would benefit from sensitivity to longer wavelengths in the same way as trichromats, that would suggest that the same costs are likely to be relevant in the two cases. It would also suggest that the trichromat L-cone peak was not specifically adapted to finding colored food, but rather was subject to the same selective pressures as in dichromats.

As shown in Figure 10, the results depend on which set of statistics we use. For the statistics derived from the Vrhel et al. (1994) reflectances, increasing the peak wavelength for the L-cone leads to an increase in *I*, although much more slowly than for trichromats. If we use the statistics for the Nascimento et al. (2002) images of rural scenes, however, the human L-cone is found to be very close to a local optimum for a dichromat.

Figure 10

Discussion

The results in the previous section seem to show that the visual system could gain a considerable increase in information about colors in natural scenes if one of the cone types was sensitive to much longer wavelengths. This remains true whether we consider scenes containing a wide range of objects, scenes dominated by green leaves, or only the light reflected by fruit. What all three sets of reflectance statistics have in common is that most of the variance is in the first two principal components: the first principal components are all biased toward longer wavelengths, and the second principal component has a large amplitude at long wavelengths. Provided the illuminant has sufficient radiant power at long wavelengths, a cone sensitive to longer wavelengths will therefore see more variation in its input than one sensitive to shorter wavelengths, and its output will be less strongly correlated with that of a cone that is sensitive to shorter wavelengths. Although the optimal cone sensitivity curves could in principle depend on the illuminant (e.g., there would be no point in having a cone sensitive to a part of the spectrum where there were no photons at all), we have found that there is in fact very little variation in the optimal values of $\lambda_{\max}^{\{L,M,S\}}$ for natural illuminants. This reflects the fact that natural illuminants have a restricted range of spectra. Because there are strong reasons to believe that the cone sensitivities have been adapted to the environment by natural selection (see, e.g., Deeb et al., 1994; Jacobs & Deegan, 1999; Mollon, 1989), we have to consider possible explanations for the failure to take advantage of all the information available.

Constraints on cone sensitivities

Our results show that there are, broadly, two types of constraints that could favor the L- and M-cones having their peak wavelengths at about 565 and 540 nm. A constraint on the maximum wavelength achievable for the L-cone, or a constraint on the maximum separation of the L- and M-cones, would both lead to the optimal $\lambda_{\max}$ being close to the true values (see Figures 6 and 8). We also found that the L- and S-cone sensitivities do give maximum information about scenes dominated by green plants, for dichromatic species.

The peak sensitivity for the L-cone varies between 493 and 570 nm in mammals (Jacobs, 1993). The primate L-cones are close to the upper end of this range, and all the variation in the sensitivities of L- and M-cones may be due to mutations at just five sites (Yokoyama, 2000), hence we cannot rule out the possibility that it is simply impossible for a photopigment sensitive to longer wavelengths to evolve in primates. In that case, we could simply conclude that the cone sensitivities are similar to what we would expect if evolution had maximized the information about the environment available to the visual system, apart from the relatively small discrepancy in $\lambda_{\max}^{M}$. If we set aside this possibility, however, we have to consider that whereas sensitivity to longer wavelengths might increase information about colors, it could have a cost in terms of other aspects of vision. In particular, an increase in the peak wavelength for the L-cone could be deleterious for spatial vision at high spatial frequencies, and for vision in dim light, when the signal-to-noise ratio is low.

Cone sensitivities and spatial vision

The optical performance of the eye is limited by diffraction and chromatic aberration. Diffraction is significant only in bright light, when the pupil is at its smallest, but its effect increases with the wavelength of light, and it could therefore constrain the maximum peak wavelength of the L-cone. Chromatic aberration would be weaker if both L- and M-cones were sensitive to longer wavelengths, but its effects would increase if the separation of the two cone types was greater, so it would constrain the separation of the M- and L-cones. However, both constraints would have similar consequences for the optimal positions of the M- and L-cones. The limit of spatial resolution of the human visual system (about 1 min of arc) is approximately equal to the angular size of a cone in the fovea, and to the point spread function of the eye's optics in bright sunlight (e.g., Wandell, 1995). This limit is equal to the limit imposed by diffraction in bright light, for light at about 550 nm (other factors besides diffraction become more important in determining the point spread function as light intensity decreases; Wandell, 1995). Increasing the wavelength to 650 nm would entail an increase in the limit of resolution by a factor of 650/550 ≈ 1.2. To estimate the resulting loss of information, note that this is equivalent to reducing the maximum spatial frequency available to the visual system by a factor of 1.2. Because the power spectrum of natural scenes typically has a $1/f^{2}$ form, the total information available in all spatial frequencies up to a maximum $f_{\max}$ increases with $f_{\max}$ (as information $\sim \int_{0}^{f_{\max}} \log(1/f^{2})\, d^{2}f \sim f_{\max}^{2}$). Hence, increasing the peak wavelength of the L-cone from 560 to 660 nm, which would reduce the maximum spatial frequency by a factor of 560/660, could lead to a reduction in the total information encoded in the retina by a factor of $(560/660)^{2} \approx 0.7$. Of course, this is only a rough estimate of the effect of increased diffraction: a full calculation should take into account the fact that the M-cone is still sensitive to shorter wavelengths, and that the cone mosaic cannot uniformly sample all spatial frequencies up to the maximum set by diffraction, as well as including a more detailed analysis of spatial correlations at different wavelengths. Chromatic aberration would have a similar effect if the separation of M- and L-cones was increased and would affect vision in all lighting conditions. Chromatic noise in the luminance channel would also increase if the separation of M- and L-cones increased (Osorio & Vorobyev, 1996). It therefore seems that these effects could impose a cost that would be greater than the benefit of having a cone sensitive to long wavelengths.

To accurately quantify the effect of changing the L-cone spectral sensitivity on spatial vision, we should take into account the spatial correlations in natural scenes at all wavelengths. The spatial correlations of cone responses in natural scenes have been studied using hyperspectral images in Ruderman, Cronin, and Chiao (1998), where it was found that color and spatial correlations are independent, and in Parraga, Troscianko, and Tolhurst (2002), where it was found that scenes with red or yellow objects (such as fruit) in a background of leaves have different spatial statistics than most natural scenes. However, these studies do not tell us whether the same results would hold if the cone sensitivities were different.
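The rough diffraction estimate in this section can be reproduced with a few lines of arithmetic. The sketch below assumes only the two scaling rules stated in the text (the resolution limit grows in proportion to wavelength, and total information scales as the square of the maximum spatial frequency); the function names are hypothetical and introduced here for illustration.

```python
def resolution_factor(lambda_old_nm, lambda_new_nm):
    """Factor by which the diffraction-limited resolution angle grows.

    Assumes the limit is proportional to wavelength (fixed pupil size).
    """
    return lambda_new_nm / lambda_old_nm


def information_factor(lambda_old_nm, lambda_new_nm):
    """Factor by which total spatial information changes.

    Assumes f_max is inversely proportional to wavelength and that
    total information scales as f_max squared.
    """
    return (lambda_old_nm / lambda_new_nm) ** 2


blur = resolution_factor(550.0, 650.0)    # 550 nm -> 650 nm
info = information_factor(560.0, 660.0)   # L-cone peak 560 nm -> 660 nm

print(f"resolution limit grows by a factor of {blur:.2f}")
print(f"total information scales by a factor of {info:.2f}")
```

Both numbers are order-of-magnitude guides only, for the reasons given above: the M-cone and the sampling properties of the cone mosaic are ignored.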

Cone sensitivities and vision in dim light

While a large separation between L- and M-cones appears to be optimal when the signal-to-noise ratio is high, the signal-to-noise ratio is much lower when the light level is near the threshold for cones to be active, and to give maximum information the cone peaks then need to be very close together (see Figure A3 in Appendix A). This is because the difference in activities between L- and M-cones gives almost no information when the noise is high, as it is small compared to the noise; hence information is maximized by ensuring that each cone detects as many photons as possible. The optimal position for both L- and M-cones would then be the same and would also be the same as the optimal position for the L-cone in dichromats (see Appendix A). Because the true separation between M- and L-cones is larger than is optimal in dim lighting, but smaller than is optimal in bright light, it is possible that it has resulted from a compromise between the requirements of the two situations.

When light levels are low, the main source of noise in the cones is dark noise, which is independent of the light. One source of activity in cones which occurs even in the absence of illumination is thermal activation of the pigment molecules: the isomerizations that are usually triggered by light occur spontaneously at a low rate, with the energy required coming from heat instead of from an incident photon. The energy of a photon is proportional to 1/wavelength, hence longer wavelength photons have less energy. It has therefore been suggested that for a photopigment (and hence a cone) to have a larger $\lambda_{\max}$, it must also require less thermal energy to trigger an isomerization. In consequence, thermal isomerizations would occur at a higher rate, and there would be a higher level of dark noise (Barlow, 1957). Later studies of thermal noise in photoreceptors have largely confirmed this: the rate of thermal activation is expected to depend exponentially on $1/\lambda_{\max}$, rising as $\lambda_{\max}$ increases (Ala-Laurila, Donner, & Koskelainen, 2004). For instance, an increase in $\lambda_{\max}$ for the L-cone by 50 nm would lead to an increase in the rate of thermal activation by approximately a factor of 5 (see Figure 3 in Ala-Laurila et al., 2004).

A fivefold increase in dark noise would increase the threshold for detection of light by L-cones by a similar amount and therefore would reduce the information available to the visual system at low light levels. However, thermal activation of photopigments is not the only source of dark noise in cones. In Schneeweis and Schnapf (1999), it was found that other sources dominated the dark noise in five of seven macaque monkey L- and M-cones. (For the other two cones, the results were consistent with the noise being caused by isomerizations together with some briefer events.) Thus, the impact of an increase in thermal isomerizations would depend on how much larger the other sources of noise are in most cones. It is however possible that any increase in $\lambda_{\max}$ would give a significant increase in dark noise for some L-cones, and this would reduce the amount of information available to the visual system in dim light.

What information should be maximized?

Up to now we have considered various factors that could affect the total information in cone responses, to see whether the cones might be optimized to give as much information as possible. This is a reasonable expectation as it would imply that the cones were adapted to give the best possible performance across all kinds of visual tasks. However, it is also likely that some colors are intrinsically more likely to be of ecological significance, hence more weight should be given to information about those colors.

The obvious example is again that the red colors of edible fruit may be more important than other colors. We have partly addressed this by computing the information about fruit colors, and our results suggest that the cone sensitivities required to give information about fruit are the same as those required for general color information. However, the information about fruit in a natural scene is not necessarily the same as the information we computed using fruit reflectances. This is because our calculation of information about fruit does not take into account the nonfruit colors in the environment at all. One way to allow for some colors (whether of fruit or some other set of colors) to be more important than others, while still taking account of the less important colors, would be to compare the information in scenes that contain examples of those colors to other scenes that do not.

It is also possible that cones are adapted to make decoding of the information in their responses easier for the later stages of the visual system, rather than to encode the maximum information. Because the red-green chromatic channel evolved later than the luminance channel (Mollon, 1989), it is possible that the cone sensitivities are adapted to give maximum information about the environment in general in the luminance channel, while using the red-green channel only for a few specific purposes. This is consistent with the result that the L-cone has the optimal sensitivity for dichromatic vision (Figure 10). Another possibility is that the cones are adapted to minimize variation in the red-green channel for a background of green leaves, making that channel most useful for detecting colored objects (such as fruit) (Sumner & Mollon, 2000a).

Conclusions

Our results show that, as has long been suspected, the primate L- and M-cones give less information about colors than they could if the sensitivities were more widely separated. However, the results also show that if the M-cone was positioned half-way between the L- and S-cones, it would reduce the information about natural scenes. The only way to increase the information significantly is for the L-cone to be sensitive to much longer wavelengths. This should influence the choice of theory to explain why the cone sensitivities have remained stable in all Old World primates. While our results do not contradict the hypothesis that the L- and M-cones are optimized for the single task of detecting colors that stand out against a background of green leaves (Sumner & Mollon, 2000a), it is also possible that the cones give the best overall visual performance, either because it is not possible for an L-cone to be sensitive to longer wavelengths, or because of constraints on the L-cone peak due to the requirements of spatial vision and vision in dim light.

Appendix A: Gaussian approximation

In this appendix, we make use of approximations that allow us to examine analytically, to some extent, how the information varies with the cone sensitivities for various sets of image statistics, and that allow much faster numerical computation of the information. Specifically, we make the following two approximations: (i) that the weights $W_i$ that parameterize the reflectances have a Gaussian distribution; (ii) that the only noise in the cone responses is additive Gaussian noise.

Then the cone responses $\vec{O}$ are given by $\vec{O} = \vec{Q} + \vec{N}$, with $\vec{Q} = \mathbf{K}\vec{W}$. If the covariance matrix of the weights $\vec{W}$ is $\boldsymbol{\Sigma}$, and the variance of the noise is $\mathbf{N}$, the cone responses $\vec{O}$ also have a Gaussian distribution with covariance $\mathbf{V}$ given by

$ \mathbf{V} = \mathbf{V}_Q + \mathbf{N}, \qquad \mathbf{V}_Q = \mathbf{K} \boldsymbol{\Sigma} \mathbf{K}^{T}, $

(A1)

and the information in the cone responses is now simply

$ I = H(\vec{O}) - H_{\mathrm{noise}} = \frac{1}{2} \log_2 \left[ (2 \pi e)^{3} \det \mathbf{V} \right] - H_{\mathrm{noise}}, $

(A2)

where $H_{\mathrm{noise}} = H(\vec{O} \mid \vec{W})$ is the entropy of the noise, which, we are assuming, is independent of the cone sensitivities. The first term on the rhs of Equation A2 is simply the entropy of a Gaussian distribution with variance $\mathbf{V}$ (see, e.g., Cover & Thomas, 1991). The optimal cone sensitivities are therefore those for which $\det[\mathbf{V}]$ has its maximum value. That is, if $\Delta\lambda_{\max}^{i}$ is the amount by which the $i$th cone type's sensitivity curve is shifted relative to the Stockman and Sharpe (2000) sensitivity, we will have $\partial \det \mathbf{V} / \partial(\Delta\lambda_{\max}^{i}) = 0$ when the cones are at their optimum positions. The simplest case is when noise is negligible, that is, $\det \mathbf{N} \ll \det \mathbf{V}_Q$, in which case maximizing the information simply corresponds to maximizing the variance while minimizing the correlation in the quantal catches, $\det[\mathbf{V}_Q] = \det[\mathbf{K} \boldsymbol{\Sigma} \mathbf{K}^{T}]$. This is the situation that applies in bright light and at low spatial frequencies where the signal-to-noise ratio is large.

Using these approximations, it is feasible to vary numerically all three cone sensitivity curves simultaneously to find the optimum peak wavelengths; if we use the covariance matrix for the Vrhel et al. (1994) data as $\boldsymbol{\Sigma}$, the maximum value of $\det[\mathbf{V}]$ occurs when the changes in $\lambda_{\max}$ for the three cone types are $\Delta\lambda_{\max}^{L} \approx 84$ nm, $\Delta\lambda_{\max}^{M} \approx 7$ nm, and $\Delta\lambda_{\max}^{S} \approx 10$ nm. In other words, the Gaussian approximation gives the same qualitative results as the full calculation: the M- and S-cones are close to their optimum positions but moving the L-cone to longer wavelengths gives an increase in information. This can also be seen by varying the position of one cone at a time, as before (see Figure A1). The optimum change in $\lambda_{\max}^{M}$ and $\lambda_{\max}^{S}$ is negative if only one cone is varied with the other two fixed, as in our previous results, although both are positive when all three cones can vary simultaneously. We therefore expect that the results of the full computations (Figures 5–10) can to some extent be explained by the variation of the covariance matrix $\mathbf{V}_Q$ as the cone sensitivities are varied.

Figure A1
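As an illustration of Equations A1 and A2, the sketch below evaluates the Gaussian-approximation information for two made-up sensitivity matrices. It assumes that the noise entropy is that of a Gaussian with covariance $\mathbf{N}$, so that the constant factors cancel and $I = \frac{1}{2}\log_2(\det \mathbf{V} / \det \mathbf{N})$. All numerical values, and the names `gaussian_information_bits`, `K_overlap`, and `K_shifted`, are hypothetical; they are not taken from Table 1 or Equation 5.

```python
import numpy as np


def gaussian_information_bits(K, Sigma, N):
    """Information (bits) for O = K W + noise in the Gaussian approximation.

    V = K Sigma K^T + N (Equation A1); I = H(O) - H_noise (Equation A2),
    with H_noise assumed to be the entropy of a Gaussian with covariance N.
    """
    V = K @ Sigma @ K.T + N
    _, logdet_V = np.linalg.slogdet(V)
    _, logdet_N = np.linalg.slogdet(N)
    return 0.5 * (logdet_V - logdet_N) / np.log(2.0)


# Hypothetical numbers, for illustration only.
Sigma = np.diag([1.0, 0.1, 0.01])   # variances of three principal components
N = 1e-9 * np.eye(3)                # negligible noise (bright light)

K_overlap = np.array([[1.0,  0.3, 0.1],    # L-cone row
                      [1.0,  0.2, 0.1],    # M-cone row
                      [1.0, -0.5, 0.5]])   # S-cone row

# Flip the sign of the L-cone weight on the second component,
# which lowers the L-M covariance in V_Q.
K_shifted = K_overlap.copy()
K_shifted[0, 1] = -0.3

I_overlap = gaussian_information_bits(K_overlap, Sigma, N)
I_shifted = gaussian_information_bits(K_shifted, Sigma, N)
print(f"information gain from the change: {I_shifted - I_overlap:.2f} bits")
```

With negligible noise the gain depends only on the ratio of the two determinants of $\mathbf{V}_Q$, which is why the optimization in this regime reduces to maximizing $\det[\mathbf{K}\boldsymbol{\Sigma}\mathbf{K}^{T}]$.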

It is therefore relevant to examine how $\mathbf{V}_Q$ depends on the values of $\lambda_{\max}$ for the cones. From Equation A1, $\mathbf{V}_Q$ depends on $\boldsymbol{\Sigma}$ and $\mathbf{K}$. We note that

$ \boldsymbol{\Sigma} = \begin{pmatrix} \Sigma_1 & 0 & 0 \\ 0 & \Sigma_2 & 0 \\ 0 & 0 & \Sigma_3 \end{pmatrix} $

is a diagonal matrix with $\Sigma_i$ being the variance of the $i$th principal component (given in Table 1). The elements of $\mathbf{K}$, $K_{ij}$, are defined in Equation 5. The first index refers to a cone type and the second to a principal component, hence for clarity we define $K_j^{L} \equiv K_{1j}$, $K_j^{M} \equiv K_{2j}$, and $K_j^{S} \equiv K_{3j}$ for the L-, M-, and S-cones, respectively. Hence, for example, $K_j^{L} = \int E(\lambda) R_L(\lambda) s^{j}(\lambda)\, d\lambda$.

Because $K_j^{L}$ depends on the spectral sensitivity $R_L(\lambda)$, we can consider it to be a function of $\lambda_{\max}$ for the L-cone, and similarly with $K_j^{M}$ and $K_j^{S}$ for the M- and S-cones, respectively. The values of $K_j^{L}$ as a function of $\lambda_{\max}$ are shown in Figure A2.

Figure A2
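The dependence of $K_j^{L}$ on $\lambda_{\max}$ can be sketched numerically by evaluating the integral as a sum over a 10-nm wavelength grid from 400 to 700 nm. Everything specific in the block below is an assumption made for illustration: the Gaussian-shaped sensitivity is not the Stockman and Sharpe (2000) curve, the flat illuminant is not a measured spectrum, and the three polynomial basis functions merely stand in for the principal components $s^{j}(\lambda)$.

```python
import numpy as np

# 400, 410, ..., 700 nm
wavelengths = np.arange(400, 701, 10)


def cone_sensitivity(lambda_max_nm, width_nm=40.0):
    """Hypothetical Gaussian-shaped sensitivity R(lambda) peaking at lambda_max."""
    return np.exp(-0.5 * ((wavelengths - lambda_max_nm) / width_nm) ** 2)


# Flat illuminant E(lambda): an assumption, not a measured spectrum.
illuminant = np.ones_like(wavelengths, dtype=float)

# Hypothetical stand-ins for the principal components s^j(lambda).
x = (wavelengths - 550.0) / 150.0
basis = np.stack([
    np.ones(len(wavelengths)),   # j = 1: roughly constant
    -x,                          # j = 2: negative at long wavelengths
    x ** 2 - 1.0 / 3.0,          # j = 3: curved
])


def K_row(lambda_max_nm):
    """K_j = sum_n E(lambda_n) R(lambda_n) s^j(lambda_n) for j = 1, 2, 3."""
    R = cone_sensitivity(lambda_max_nm)
    return basis @ (illuminant * R)


for peak in (540.0, 565.0, 650.0):
    print(peak, np.round(K_row(peak), 3))
```

With these stand-in functions the second coefficient changes sign between a peak at 540 nm and a peak at 565 nm and becomes more negative as the peak moves to longer wavelengths; the actual curves depend on the measured principal components and are the ones shown in Figure A2.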

From Equation A1, the elements of $\mathbf{V}_Q$ are $V_{Qij} = K_{i1} K_{j1} \Sigma_1 + K_{i2} K_{j2} \Sigma_2 + K_{i3} K_{j3} \Sigma_3$; so for example, the variance of the L-cone's quantum catch is

$ \langle (Q_L - \overline{Q}_L)^{2} \rangle = V_{Q11} \equiv V_Q^{LL} = (K_1^{L})^{2} \Sigma_1 + (K_2^{L})^{2} \Sigma_2 + (K_3^{L})^{2} \Sigma_3, $

(A3)

and the variances of the M- and S-cones are given by replacing L by M or S, respectively. Similarly, the covariances between two cone types are

$ \begin{aligned} V_Q^{LM} &= K_1^{L} K_1^{M} \Sigma_1 + K_2^{L} K_2^{M} \Sigma_2 + K_3^{L} K_3^{M} \Sigma_3, \\ V_Q^{LS} &= K_1^{L} K_1^{S} \Sigma_1 + K_2^{L} K_2^{S} \Sigma_2 + K_3^{L} K_3^{S} \Sigma_3, \\ V_Q^{MS} &= K_1^{M} K_1^{S} \Sigma_1 + K_2^{M} K_2^{S} \Sigma_2 + K_3^{M} K_3^{S} \Sigma_3, \end{aligned} $

(A4)

where, for example, $V_Q^{LM} \equiv \langle (Q_L - \overline{Q}_L)(Q_M - \overline{Q}_M) \rangle$, etc.

Information in a single cone type

In the Gaussian approximation, the amount of information in the output of a cone is completely determined by the signal-to-noise ratio. In the gain-controlled noise regime, this is independent of $\lambda_{\max}$. However, in the quantum noise regime, the signal-to-noise ratio itself depends on $\lambda_{\max}$. This difference is most noticeable in the case of the Nascimento et al. (2002) image statistics, where the total information varies much more strongly with $\lambda_{\max}$ in the quantum noise regime than in the gain-control regime (Figures 6 and 7). The large amplitude of the principal components at long wavelengths (Figure 3) implies that the cone output has a larger variance if it is sensitive to long wavelengths.

Correlations between cone types

Because the information in the output of each cone type is approximately independent of $\lambda_{\max}$ in the high signal-to-noise regime, the information is maximized when the correlation between cone types is minimized, which ensures that as little of the information as possible is redundant. Thus, in the Gaussian approximation, the problem of maximizing information is equivalent to minimizing the covariances $V_Q^{LM}$, $V_Q^{LS}$, and $V_Q^{MS}$, which depend on $\lambda_{\max}$ through $K_j^{L}$, etc., as in Equation A4. To understand how minimizing these quantities can explain our results, we first note that $K_1^{L}$, $K_1^{M}$, and $K_1^{S}$, which parameterize the contribution of the first principal component to the cones' outputs, vary very little with $\lambda_{\max}$ (these are the solid lines in Figure A2). The terms proportional to $K_1^{L}$, $K_1^{M}$, and $K_1^{S}$ in Equation A4 cannot therefore contribute strongly to the changes in the total information. We can also notice from Table 1 that $\Sigma_2 > \Sigma_3$ (the 2nd PC has much greater variance than the 3rd), hence the terms proportional to $\Sigma_3$ in Equation A4 will be less important than the terms proportional to $\Sigma_2$. In other words, to minimize $V_Q^{LM}$, $K_2^{L} K_2^{M} \Sigma_2$ should be close to its lowest possible value (meaning it should be negative, with a large absolute value). Similarly, $K_2^{L} K_2^{S} \Sigma_2$ and $K_2^{M} K_2^{S} \Sigma_2$ should be as low as possible to minimize $V_Q^{LS}$ and $V_Q^{MS}$, respectively.

One consequence of the above reasoning is that, to obtain large changes in total information, $\Sigma_2$ must be relatively large (compared to $\Sigma_1$). In the case of the statistics for the Nascimento et al. (2002) images, $\Sigma_2$ is very small and indeed we found only small changes in total information in the gain-control regime. We can also explain our finding that the maximum total information is obtained when the $\lambda_{\max}$ for the L-cone is very large: as can be seen from Figure A2, $K_2^{L}$ is large and negative in that case (for the Vrhel et al., 1994, data set). Because both $K_2^{M}$ and $K_2^{S}$ are positive, the products $K_2^{L} K_2^{M} \Sigma_2$ and $K_2^{L} K_2^{S} \Sigma_2$ will both be negative, hence the correlation of the L-cone output with both the M- and S-cone outputs will be small. Similar reasoning can be applied to the other data sets. However, it is important to remember that the non-Gaussian nature of the input is also significant, as can be seen by comparing Figure A1 with Figures 5, 6, and 7; we should not expect to explain our results completely using the covariance matrix alone.

Effect of noise

In bright light, the signal-to-noise ratio is high and the contribution of noise to the covariance of the cone response is negligible, hence the noise contributes to the information only through the second term in Equation A2. The mutual information then depends on the variance of the noise $\mathbf{N}$:

$ I = H(\vec{O}) - H_{\mathrm{noise}} = H(\vec{O}) - \frac{1}{2} \log_2 \left[ (2 \pi e)^{3} \det \mathbf{N} \right], $

(A5)

where the last term is the entropy of a three-dimensional Gaussian distribution with covariance matrix $\mathbf{N}$. Because $H(\vec{O})$ is independent of $\mathbf{N}$ (provided $\det[\mathbf{N}] \ll \det[\mathbf{V}_Q]$), a change in the level of noise simply increases or reduces the information in the cone responses by an amount which is independent of the cone sensitivities. For this reason, the changes in information that were computed for a particular level of noise in this paper are not expected to change if the noise (only) changes.

When the noise is relatively large, the information depends on the noise in a more complicated way (see Figure A3). As the level of noise increases, the optimal separation of the L- and M-cones is reduced and can be very close to zero. This occurs when the signal-to-noise ratio is of order 1, or slightly higher. This is likely to be the case in mesopic conditions, when both rods and cones can provide useful information and dark noise in cones is significant. While Figure A3 shows that the optimal separation may be very small, the optimal position of both M- and L-cones (in the Gaussian approximation) is then at about 610 nm, about 50 nm longer than the true peak L-cone sensitivity.

Figure A3

To understand this result, note that at such high levels of noise, the total information provided by each cone will be much lower than in brighter light. In addition, as the output of each cone is to a significant extent determined by the noise, it becomes less important to reduce the redundancy, and more important to maximize the signal-to-noise ratio for each cone individually, leading to the L- and M-cones being sensitive to the same part of the spectrum.
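The two noise regimes described in this section can be illustrated with a small numerical sketch. It assumes isotropic noise, $\mathbf{N} = N\mathbf{1}$, so that the Gaussian-approximation information can be written as $I = \frac{1}{2}\sum_i \log_2(1 + v_i/N)$, where $v_i$ are the eigenvalues of $\mathbf{V}_Q$. The matrices and noise levels are hypothetical values chosen only to make the two limits visible; the function name `information_bits` is introduced here.

```python
import numpy as np


def information_bits(K, Sigma, noise_var):
    """I = 0.5 * sum_i log2(1 + v_i / noise_var), v_i = eigenvalues of K Sigma K^T.

    Assumes isotropic Gaussian noise with variance noise_var per cone type.
    """
    V_Q = K @ Sigma @ K.T
    eig = np.linalg.eigvalsh(V_Q)
    return 0.5 * np.sum(np.log1p(eig / noise_var)) / np.log(2.0)


# Hypothetical numbers, for illustration only.
Sigma = np.diag([1.0, 0.1, 0.01])
K_overlap = np.array([[1.0,  0.3, 0.1],
                      [1.0,  0.2, 0.1],
                      [1.0, -0.5, 0.5]])
K_shifted = K_overlap.copy()
K_shifted[0, 1] = -0.3   # same L-cone variance, lower L-M covariance

# High signal-to-noise: a tenfold change in noise shifts I by the same
# amount, 1.5 * log2(10) bits, whatever the sensitivities are.
shift_overlap = (information_bits(K_overlap, Sigma, 1e-9)
                 - information_bits(K_overlap, Sigma, 1e-8))
shift_shifted = (information_bits(K_shifted, Sigma, 1e-9)
                 - information_bits(K_shifted, Sigma, 1e-8))

# Advantage of the less correlated matrix in bright and in very noisy conditions.
gap_bright = (information_bits(K_shifted, Sigma, 1e-9)
              - information_bits(K_overlap, Sigma, 1e-9))
gap_noisy = (information_bits(K_shifted, Sigma, 1e3)
             - information_bits(K_overlap, Sigma, 1e3))

print(f"shift per tenfold noise change: {shift_overlap:.3f}, {shift_shifted:.3f} bits")
print(f"decorrelation advantage: bright {gap_bright:.3f} bits, noisy {gap_noisy:.2e} bits")
```

The two matrices give each cone the same variance and differ only in their covariances, so the advantage of the less correlated one all but disappears when the noise dominates. The noise variance of 1e3 is deliberately extreme; at intermediate signal-to-noise ratios the behaviour is the more complicated one shown in Figure A3.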

Appendix B: Mutual information and discrimination

Using the Gaussian approximation as in Appendix A, we can relate the mutual information between cone responses and surface reflectances to the discriminability of surface colors. As before, the cone responses are $\vec{O} = \mathbf{K}\vec{W} + \vec{N}$, and given independent Gaussian noise for each cone type, so that $\mathbf{N} = N\mathbf{1}$, with $\mathbf{1}$ the 3 × 3 identity matrix, we have from Equation A5 that

$ I = \tilde{I} - \frac{3}{2} \log_2 N, $

(B1)

with $\tilde{I} = \frac{1}{2} \log_2 \det \mathbf{V}$ being independent of $N$ when the signal-to-noise ratio is high enough that $\det \mathbf{V} \approx \det \mathbf{V}_Q$.

To see how discrimination of surfaces depends on $N$, we can consider a maximum likelihood estimate of the reflectance. Because the distributions $p(\vec{O} \mid \vec{W})$ and $p(\vec{W})$ are Gaussian with variance $\mathbf{N}$ and $\boldsymbol{\Sigma}$, respectively, the log-likelihood for $\vec{W}$ given an observed output $\vec{O}$ is

$ \begin{aligned} \log[p(\vec{W} \mid \vec{O})] &= \log\left[\frac{p(\vec{O} \mid \vec{W})\, p(\vec{W})}{p(\vec{O})}\right] = \log[p(\vec{O} \mid \vec{W})] + \log[p(\vec{W})] + \cdots \\ &= -\frac{1}{2}\left[(\vec{O} - \mathbf{K}\vec{W})^{T} \mathbf{N}^{-1} (\vec{O} - \mathbf{K}\vec{W}) + \vec{W}^{T} \boldsymbol{\Sigma}^{-1} \vec{W}\right] + \cdots \\ &= -\frac{1}{2}\left[(\vec{W}_{\mathrm{est}} - \vec{W})^{T} (\boldsymbol{\Sigma}^{-1} + \mathbf{K}^{T} \mathbf{N}^{-1} \mathbf{K}) (\vec{W}_{\mathrm{est}} - \vec{W})\right] + \cdots, \end{aligned} $

(B2)

where $(\cdots)$ indicates terms which are independent of $\vec{W}$, and the maximum likelihood estimate $\vec{W}_{\mathrm{est}}$ is

$ \vec{W}_{\mathrm{est}} = (\boldsymbol{\Sigma}^{-1} + \mathbf{K}^{T} \mathbf{N}^{-1} \mathbf{K})^{-1} \mathbf{K}^{T} \mathbf{N}^{-1} \vec{O} = (N (\mathbf{K}^{T})^{-1} \boldsymbol{\Sigma}^{-1} + \mathbf{K})^{-1} \vec{O}. $

(B3)

When the signal-to-noise ratio is very large, this reduces to an unbiased estimate $\vec{W}_{\mathrm{est}} = \mathbf{K}^{-1}\vec{O}$. Because $\vec{O} = \mathbf{K}\vec{W} + \vec{N}$, the covariance matrix of the uncertainty in the estimate is then

$ \mathbf{V}_{\mathrm{est}} = \mathbf{K}^{-1} \mathbf{N} (\mathbf{K}^{-1})^{T} = N (\mathbf{K}^{T} \mathbf{K})^{-1}. $

(B4)

Equivalently, we could say that $\mathbf{V}_{\mathrm{est}}$ is the inverse of the Fisher Information matrix for the cone outputs.

Knowing the variance of the estimator, we can now say that two surfaces with reflectances $\vec{W}$ and $\vec{W}'$ can be discriminated if the difference between them is larger than the spread of the estimator (in the direction of the difference). That is, $\vec{W}$ and $\vec{W}'$ are just discriminable when

$ [\vec{W}' - \vec{W}]^{T}\, \mathbf{V}_{\mathrm{est}}^{-1}\, [\vec{W}' - \vec{W}] = N^{-1} [\vec{W}' - \vec{W}]^{T}\, \mathbf{K}^{T} \mathbf{K}\, [\vec{W}' - \vec{W}] = 1. $

(B5)
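Equations B3 and B5 can be checked numerically. The sketch below assumes isotropic noise ($\mathbf{N} = N\mathbf{1}$) and uses hypothetical values for $\mathbf{K}$, $\boldsymbol{\Sigma}$, and the reflectance weights; the function names `ml_estimate` and `threshold_along` are introduced here for illustration. It confirms that the estimate approaches $\mathbf{K}^{-1}\vec{O}$ at high signal-to-noise ratio and shows how the just-discriminable step along a fixed direction scales with the noise variance.

```python
import numpy as np


def ml_estimate(O, K, Sigma, noise_var):
    """Equation B3 with N = noise_var * identity."""
    A = np.linalg.inv(Sigma) + K.T @ K / noise_var
    return np.linalg.solve(A, K.T @ O / noise_var)


def threshold_along(direction, K, noise_var):
    """Length t of the step t*d (d a unit vector) that satisfies Equation B5."""
    d = direction / np.linalg.norm(direction)
    return np.sqrt(noise_var / (d @ K.T @ K @ d))


# Hypothetical numbers, for illustration only.
Sigma = np.diag([1.0, 0.1, 0.01])
K = np.array([[1.0, -0.3, 0.1],
              [1.0,  0.2, 0.1],
              [1.0, -0.5, 0.5]])
W_true = np.array([0.5, -0.2, 0.05])

# Noise-free response with a very small assumed noise variance:
# the estimate should be close to K^{-1} O = W_true.
W_est = ml_estimate(K @ W_true, K, Sigma, 1e-12)

# Quadrupling the noise variance doubles the discrimination threshold.
direction = np.array([0.0, 1.0, 0.0])
t_low = threshold_along(direction, K, 1e-12)
t_high = threshold_along(direction, K, 4e-12)

print("estimate:", np.round(W_est, 6))
print(f"threshold ratio for 4x noise variance: {t_high / t_low:.3f}")
```

The threshold along any fixed direction grows as the square root of the noise variance, while the direction itself only changes the constant of proportionality through $\mathbf{K}^{T}\mathbf{K}$.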

Thus, the sizes of the smallest discriminable differences in surface colors are proportional to $\sqrt{N}$. Together with Equation B1, this implies that the smallest discriminable differences in surface colors are inversely proportional to $2^{I/3}$. The thresholds given by Equation B5 are for a distance measured by differences in the values of $\vec{W}$, but the same relation with $I$ holds for any color coordinates that are linearly related to the reflectances and the quantum catches $\vec{Q}$, such as the MacLeod–Boynton coordinates. Note that we are considering discrimination in the full three-dimensional color space. In a psychophysical setting where the task is to discriminate between equiluminant colors, there would be variation in only two dimensions and thresholds would instead be inversely proportional to $2^{I/2}$. Similarly, for discrimination in a one-dimensional stimulus space such as monochromatic light, thresholds for a maximum likelihood estimator would be inversely proportional to $2^{I}$.

When it is the cone sensitivities (or the illuminant spectrum) that are changing, rather than the noise only, this simple relationship between mutual information and discrimination no longer holds. Instead, because the discriminability of any two given colors depends on $\mathbf{K}$, which in turn depends on the cone sensitivity curves, any change in the cone sensitivity curves is bound to lead to some pairs of colors becoming easier to discriminate whereas others become more difficult. This could be beneficial if the colors that become harder to discriminate occur only rarely, whereas common colors become easier to discriminate. We can therefore regard $I$ as a statistical parameter that describes how good discrimination of surface colors is “on average” in some sense.

Acknowledgments

Commercial relationships: none.

Corresponding author: Alex Lewis.

Email: alex.lewis@ucl.ac.uk.

Address: Department of Psychology, University College London, United Kingdom.

Footnote

1 The surface reflectances and illuminant intensities we use, as well as the cone sensitivity curves, are given at 10-nm intervals, and so all integrals over $\lambda$ are computed as sums, for example, $\int C(\lambda) R_i(\lambda)\, d\lambda = \sum_n C(\lambda_n) R_i(\lambda_n)$, with $\lambda_n$ = 400, 410, …, 700 nm. The Sumner and Mollon (2000a, 2000b) reflectances for fruit are given at 4-nm intervals and were converted to 10-nm intervals by linear interpolation.

References

Ala-Laurila, P., Donner, K., & Koskelainen, A. (2004). Thermal activation and photoactivation of visual pigments. *Biophysical Journal*, 86, 3653–3662.

Barlow, H. B. (1957). Purkinje shift and retinal noise. *Nature*, 179, 255–256.

Brainard, D. H., & Freeman, W. T. (1997). Bayesian color constancy. *Journal of the Optical Society of America A, Optics, Image Science, and Vision*, 14, 1393–1411.

Cohn, E. H. (2004). Thresholds and noise. In L. M. Chalupa & J. S. Werner (Eds.), *The visual neurosciences* (Vol. 1). Cambridge, MA: MIT Press.

Cover, T. M., & Thomas, J. A. (1991). *Elements of information theory*. New York: Wiley.

Deeb, S. S., Jorgensen, A. L., Battisti, L., Iwasaki, L., & Motulsky, A. G. (1994). Sequence divergence of the red and green visual pigments in great apes and humans. *Proceedings of the National Academy of Sciences of the United States of America*, 91, 7262–7266.

Dominy, N. J., & Lucas, P. W. (2001). Ecological importance of trichromatic vision to primates. *Nature*, 410, 363–366.

Jacobs, G. H. (1993). The distribution and nature of colour vision among the mammals. *Biological Reviews of the Cambridge Philosophical Society*, 68, 413–471.

Jacobs, G. H., & Deegan, J. F., II (1999). Uniformity of colour vision in Old World monkeys. *Proceedings: Biological Sciences/The Royal Society*, 266, 2023–2028.
Lewis, A., Garcia, R., & Zhaoping, L. (2003). The distribution of visual objects on the retina: Connecting eye movements and cone distributions. *Journal of Vision*, 3(11), 893–905, http://www.journalofvision.org/3/11/21/, doi:10.1167/3.11.21.

Maloney, L. T. (1986). Evaluation of linear models of surface spectral reflectance with small numbers of parameters. *Journal of the Optical Society of America A, Optics and Image Science*, 3, 1673–1683.

Nascimento, S. M., Ferreira, F. P., & Foster, D. H. (2002). Statistics of spatial cone-excitation ratios in natural scenes. *Journal of the Optical Society of America A, Optics, Image Science, and Vision*, 19, 1484–1490. Retrieved from http://personalpages.umist.ac.uk/staff/david.foster/Hyperspectral_images_of_natural_scenes_02.html

Osorio, D., Ruderman, D. L., & Cronin, T. W. (1998). Estimation of errors in luminance signals encoded by primate retina resulting from sampling of natural images with red and green cones. *Journal of the Optical Society of America A, Optics, Image Science, and Vision*, 15, 16–22.

Osorio, D., & Vorobyev, M. (1996). Colour vision as an adaptation to frugivory in primates. *Proceedings: Biological Sciences/The Royal Society*, 263, 593–599.

Parraga, C. A., Troscianko, T., & Tolhurst, D. J. (2002). Spatiochromatic properties of natural images and human vision. *Current Biology*, 12, 483–487.

Ruderman, D. L., Cronin, T. W., & Chiao, C. (1998). Statistics of cone responses to natural images: Implications for visual coding. *Journal of the Optical Society of America A*, 15, 2036–2045.
Sterling, P. (2004). How retinal circuits optimize the transfer of visual information. In L. M. Chalupa & J. S. Werner (Eds.), *The visual neurosciences* (Vol. 1). Cambridge, MA: MIT Press.

Stockman, A., & Sharpe, L. T. (2000). The spectral sensitivities of the middle- and long-wavelength-sensitive cones derived from measurements in observers of known genotype. *Vision Research*, 40, 1711–1737. Retrieved from http://cvrl.ioo.ucl.ac.uk/

Vorobyev, M. (2004). Ecology and evolution of primate colour vision. *Clinical & Experimental Optometry: Journal of the Australian Optometrical Association*, 87(4–5), 230–238.

Vrhel, M. J., Gershon, R., & Iwan, L. S. (1994). Measurement and analysis of object reflectance spectra. *Color Research and Application*, 19, 4–9. Retrieved from ftp://ftp.eos.ncsu.edu/pub/eos/pub/spectra/

Wandell, B. A. (1995). *Foundations of vision*. Sunderland, MA: Sinauer Associates.

Yokoyama, S. (2000). Molecular evolution of vertebrate visual pigments. *Progress in Retinal and Eye Research*, 19, 385–419.

Zhaoping, L. (2002). Optimal sensory encoding. In M. Arbib (Ed.), *The handbook of brain theory and neural networks: The second edition* (pp. 815–819). MIT Press.