Previous research has suggested only weak statistical dependencies between local luminance and contrast in natural images. Here we study luminance and contrast in natural images using established measures and show that when multiple measurements of these two local quantities are taken in different spatial locations across the visual field, strong dependencies are revealed that were not apparent in previous pointwise (single-site) analyses. We present a few simple experiments demonstrating this spatial dependency of luminance and contrast and show that the luminance measurements can be used to approximate the contrast measurements. We also show that relying on higher-order statistics, independent component analysis learns paired spatial features for luminance and contrast. These features are shown to share orientation and localization, with the filters corresponding to the features dependent in their outputs. Finally, we demonstrate that the found dependencies also exist in artificial images generated from a dead leaves model, implying that simple image phenomena may suffice to account for the dependencies. Our results indicate that local luminance and contrast computations do not recover independent information sources from the visual signal. Subsequently, our results predict spatial processing of local luminance and contrast to be non-separable in visual systems.

^{2}) using the scale factors available on the data set Web page. Finally, we cropped 4 pixels from all sides in the images to avoid well-known border anomalies present in the data set images.

*channels,*although the word “channel” is used only for illustrative purposes. We do not propose that the corresponding channels explicitly existed as neural representations.

*w*to specify the localization,

*p*is the mask radius (in pixels), (

*i, j*) the location of the weighted pixel, and (

*x, y*) the current center of the weighting mask. Due to the cosine-based window being cyclic, we modified Equation 1 by zeroing out the

*w*

_{x,y}(

*i, j*) for which ∣∣(

*i, j*) − (

*x, y*)∣∣ >

*p*(i.e., the weights that lie outside the radius).

**I**is an input image,

*L*

_{0}≥ 0 a dark–light parameter, and

*C*= ∑

_{i,j}

*w*

_{x,y}(

*i, j*). The normalization coefficient

*C*is the same for all (

*x, y*). Note that the LUM, RMS, and RMSL channels are in spatial correspondence: for each location (

*x, y*), the measures are computed from exactly the same region of the input image

**I**using the same weighting. Sampling paired scalar variables from the LUM and RMSL channels of our image set would then equal the pointwise setting of (Frazor & Geisler, 2006; Mante et al., 2005), providing that no weighting masks

*w*between any two samples from the same image overlap.

*L*

_{0}was equal to 0, and in Frazor and Geisler (2006) it was set to either 0 or 1 cd/m

^{2}. In our preliminary experiments, we found that the value of

*L*

_{0}required to achieve pointwise decorrelation depends slightly on the used radius

*p,*as shown in Figure 1. The decorrelating value of

*L*

_{0}appears to be smaller for smaller radii

*p,*e.g., around 20 cd/m

^{2}for

*p*= 4, and 30 cd/m

^{2}for

*p*= 28. On the average over the tested radii,

*L*

_{0}for decorrelation was 28 cd/m

^{2}with a standard deviation of 3 cd/m

^{2}. As the obtained correlations between pointwise LUM and RMSL remain relatively small over the parameter ranges of

*L*

_{0}and

*p*in the plot of Figure 1, RMSL appears to be robust against the choices of

*L*

_{0}and

*p*over the whole data set. In the subsequent experiments, we used the value for

*L*

_{0}that achieved decorrelation for the used mask radius. Setting the value of

*L*

_{0}adversarily in this way allows us to show that the gain control solution applied by RMSL is not sufficient to make RMSL spatially independent from LUM.

**I**. Later, we describe control experiments where no afferents were shared, verifying that the spatial dependencies are not due to the convolutive processing.

*p*= 8 in Equation 1 and

*L*

_{0}= 22 cd/m

^{2}.

*L*

_{0}and the radius

*p*. This shows that these two parameters can slightly alter the statistical characteristics of the local luminance and contrast operator outputs.

*I*

_{1},

*I*

_{2}, and

*I*

_{3}be three horizontally adjacent 32 × 32 pixel image patches of the input image

*I,*with no overlap between the patches. We sampled 10,000 such triplets from the used data set and then computed the (single-variable) local LUM for

*I*

_{1}and

*I*

_{3}, and local RMSL for

*I*

_{2}, using

*p*= 16 and

*L*

_{0}= 27 cd/m

^{2}(for pointwise decorrelation). This gives a triplet of scalar variables, LUM(

*I*

_{1}), LUM(

*I*

_{3}), and RMSL(

*I*

_{2}), with slight abuse of notation. Note that each of the three variables has been computed from disjoint pixels and none of the postprocessings that we will use later for the ICA experiments have been performed. We computed a new variable

*D*= ∣LUM(

*I*

_{1}) − LUM(

*I*

_{3})∣ and its median

*M*(

*D*). If

*D*>

*M*(

*D*), we say the corresponding triplet has a high luminance difference (low, otherwise). Next we standardized the variable RMSL(

*I*

_{2}) to unit variance and then split it to two sets, depending if the luminance difference of the triplet was high or low. If RMSL is independent of LUM, the two sets should be statistically equivalent.

*I*

_{2}) in the high difference set is 1.26, and for the low difference set only 0.57. According to a Kruskal–Wallis test, the null hypothesis that the medians are equal can be rejected with

*p*< 0.001. We obtained similar results when the triplet was taken vertically or diagonally.

*p*in Equation 1 and then computed an estimate RMSL* based solely on the LUM channel (and not using the original image

**I**) but now using a very small radius

*p*= 2. The resulting approximate RMSL* image was then bilinearly interpolated to match the size of the true RMSL image. We performed this for 500 images sampled from our data set and computed the correlation coefficient between RMSL and RMSL* for each image and then averaged these obtained coefficients. Note that the choice of

*p*= 2 was heuristic. Here this choice suffices to show that a strong spatial relation exists between LUM and RMSL.

*p*. The obtained correlation is at the highest (mean

*c*= 0.88) with radius

*p*= 4 and decays with both larger radii and asymmetrically if the radii are different. With very disparate radii of the individual channel computations, the approximate RMSL* is a poor correlate of the real RMSL, suggesting that the dependency between the two quantities may be weaker if the channels have a large difference in their scale of computation. But with similar scales, the strong correlations indicate that far from being independent, the LUM channel actually contains enough structure to approximate the RMSL channel to a noticeable degree. The corresponding standard deviations are displayed in Figure 3B, showing that as the analysis moves towards a more global scale, the predictability of contrast from luminance gets more unstable with the used method. This may reflect the scale of the content in the images themselves.

**x**, which is a random vector of dimension

*n,*has been generated by a linear generative model

**is a constant mixing matrix (to be estimated) and**

*A***s**is a random vector of unknown independent and non-Gaussian components

*s*

_{ i}. The dimension of

**s**is assumed to be equal to the dimension of

**x**, possibly after the observed data

**x**have been reduced to a smaller dimension by principal component analysis (Hyvärinen et al., 2001). ICA tries to estimate both

**s**and the parameter matrix from the observed data

**x**; when the model holds, this can be done with very few assumptions (up to a scaling and ordering of the independent components and columns of matrix )

*.*When the data does not follow the generative model of Equation 5, many ICA algorithms—including the one we use—can still be interpreted as minimization of higher-order statistical dependencies between the components of the estimated random vector

**s**=

**Wx**≡

^{−1}

**x**(Hyvärinen et al., 2001). In addition, when statistical dependencies are reduced by searching for maximally non-Gaussian projections—as in our method—the method can be viewed as computing a sparse coding basis for the observed data.

*p*= 8 in Equation 1. With

*p*= 16, the results were qualitatively similar and are omitted. Next, we subsampled the resulting images by

*p*/2. This was done to address the increased redundancy on the luminance channel due to the low-pass filtering of Equation 2. We did not attempt to make the subsampling optimal. Then, we took a natural logarithm of the luminance channel data since otherwise the dependencies revealed by ICA were less pronounced (logarithm or other non-linearity corresponding to retinal non-linearity is typically applied when ICA is used to model luminance; see e.g., van Hateren & van der Schaaf, 1998). From the resulting channels of the 4,000+ images, we sampled two sets of corresponding 200,000 small 21 × 21 image patches at random. Each patch pair was obtained by sampling source image number and a point (

*x, y*) uniformly, so that a 21 × 21 region starting from the sampled point would remain inside the channels. Then, this region was cropped from each channel to form the luminance/contrast patch pair (one such patch pair is illustrated in Figure 2 as a rectangle drawn on the two channels). We removed the DC component (mean) from each contrast and luminance patch separately. The obtained patches were then vectorized, and the two data sets corresponding to the two channels were separately normalized to have unit variance and then separately submitted to principal components analysis to remove noise (dimensionality reduction is a standard procedure in ICA to reduce the effect of noise). We kept 95% of the variance of each set, corresponding to 66/441 and 203/441 dimensions for luminance and contrast, respectively.

**x**and

**y**are independent, so are

*f*(

**x**) and

*g*(

**y**)—postprocessing cannot make two independent random vectors dependent (Papoulis, 1991). Here, these postprocessing steps were done to enhance the clarity of the ICA results. For experiments where ICA was not used, the postprocessings described in this subsection were not applied.

**x**= [

**l**;

**r**] be a catenated data vector of luminance and contrast patches in correspondence. Now the weights of each column vector

**a**of the estimated matrix

**represent the learned visual features [**

*A***l***;

**r***] =

**a**of that component. The corresponding row vectors

**q**

^{ T}∈

**W**represent the linear filters (receptive field models) computing each component response, i.e., s =

**q**

^{ T}

**x**=

**q**

^{ T}[

**l**;

**r**] =

**q**

_{ l}

^{ T}

**l**+

**q**

_{ r}

^{ T}

**r**. How ICA learns to represent catenated luminance and contrast image patches using these definitions is shown in Figure 4. In the subsequent analyses, we study the emergent feature pairs [

**l***;

**r***], as well as dependencies between the filter responses

**q**

_{ l}

^{ T}

**l**and

**q**

_{ r}

^{ T}

**r**. This was done through reshaping both the features and the filters back to the spatial form of 21 × 21 pixels. Now, if according to ICA model, the two channels were independent, we would expect one of the filters

**q**

_{ l}or

**q**

_{ r}to be blank (with zero weights), with correspondingly blank feature as

**l*** or

**r***. Also, no dependencies should be observable between the filter outputs

**q**

_{ l}

^{ T}

**l**and

**q**

_{ r}

^{ T}

**r**in the condition of independence.

**l*** and

**r*** for the two channels. Most of the pairs turn out to have a similar pattern in both luminance and contrast channel. Some feature pairs and their corresponding filters have been magnified for convenience in Figures 5B and 5C. We will now turn to examine the learned dependencies more closely.

**l***;

**r***] for each component and the filter responses

**q**

_{ l}

^{ T}

**l**and

**q**

_{ r}

^{ T}

**r**on fresh data. We applied well-known Fourier techniques to the features

**l*** and

**r*** to demonstrate their regularities. This analysis would have been traditionally done to the filters, but in this case the filters turned out to be structurally less coherent than the features (compare Figures 5B and 5C). In Figure 6A, we display a histogram of correlation coefficients between the squared amplitudes of discrete 2D Fourier transforms of the feature pairs. The high correlation coefficients show that frequency and orientation are commonly shared by the pair. The other histogram in Figure 6A shows correlation coefficients of the spatial mask pairs, which is equal to computing the angle between the two vectors after they have been standardized to unit norm. A value of 1 indicates that the vectors are parallel (same phase), 0 that they are orthogonal, and −1 that they differ in the sign (opponent phase). It can be seen that most of the features lean toward phase agreement, but there is also a clear cluster of opponent phase features. Intuitively, since RMSL is invariant to stimulus phase, we would expect the filter pairs (

**q**

_{ l},

**q**

_{ r}) and (−

**q**

_{ l},

**q**

_{ r}) to be statistically identical in their responses, and hence that both phase-agreeing and phase-opponent features should occur in roughly equal proportions. However, as can be seen from Figure 6A, the phase-agreeing pairs dominate, and this tendency seemed to be stable across several bootstrapped experiments. We are uncertain whether this imbalance reflects some structure in the data or if it is a property of our modelling mechanism.

**q**

_{ l}and

**q**

_{ r}, we sampled 10,000 fresh patch pairs not used in the ICA estimation. In Figure 6B, we show a histogram of the correlation coefficients between

**q**

_{ l}

^{ T}

**l**and

**q**

_{ r}

^{ T}

**r**for all the filter pairs, computed over responses to the fresh patch pairs. We also show the correlation coefficients between the absolute values ∣

**q**

_{ l}

^{ T}

**l**∣ and ∣

**q**

_{ r}

^{ T}

**r**∣; these turned out to be 0.47 on the average. Together with the obtained shape pairs, this suggests that the same shapes tend to exist on both channels at the same time, but possibly with some phase difference. The lack of correlation in the signed outputs is likely because of linear filter output distributions on natural luminance images are typically symmetric, reflecting, e.g., that step edges are equally likely in both phases. However, regardless of luminance edge phase, the resulting contrast shape is the same. Hence, the sign differences of the two filter responses lead to smaller signed correlation (similar arguments for dependencies in filter–rectify–filter setting can be found in Johnson & Baker, 2004).

- We repeated our experiments so that all values inside a channel were computed from disjoint source image regions. This necessarily makes the obtained channels very low resolution (the image side length will be divided by the diameter 2
*p*of the localization mask). Also, the contrast channel shapes become less continuous. However, qualitatively similar oriented features and feature pairings were obtained in this control condition, with the difference that the obtained contrast shapes were generally shorter, reflecting the discontinuity introduced to the contrast channel. - To verify that the generation of shape pairings for luminance and contrast are not a by-product of ICA, we made a new catenated data set of random pairings of luminance and contrast patches. In this case the two channels were independent by design and this was reflected in the ICA results: each component represented an oriented feature only in one of the channels, with the other channel blank.
- We examined whether changing
*L*_{0}, the weighting mask or the radius*p*would remove the dependencies. Reasonable values of these parameters had visible effects on the learned features, but they still largely remained localized and oriented, with the corresponding filters bandpass. The qualitative dependencies between the channels appeared to be largely unaffected by these choices. - Are the presented results affected by performing the logarithmic non-linearity
*before*the local luminance computation instead of using it as a postprocessing step? The results were similar in both conditions. It seems that the existence of some compression of the luminance signal is more important than whether it is compressed before or after the local luminance channel computation. Without compression, the results produced by FastICA were much closer to suggesting independence, with the correlation coefficients of the absolute filter pair responses close to 0.2 on the average. - We examined if the degree of PCA dimensionality reduction affects the dependencies. Although it did change the learned features and the filters in a similar fashion as PCA affects ICA when ICA is used for luminance modelling, PCA dimensionality reduction did not change the tendency of the ICA features to appear as pairs.
- We repeated our ICA experiments with images of white (uncorrelated) noise that was sampled from the luminance distribution of the original image data. In this case, FastICA did not converge, which typically happens when the data is very far from the ICA model distribution or has more than one Gaussian direction (Hyvärinen et al., 2001).

*k*> 2 variables does not guarantee full factorization of the

*k*-dimensional joint distribution (Papoulis, 1991).

*s*= (

**q**

_{l}

^{T}

**l**) + (

**q**

_{r}

^{T}

**r**). Thus, our FastICA-based model estimation fits the weights for the two filters

**q**

_{l}and

**q**

_{r}, simultaneously processing both channels. Due to statistical dependencies present in the data, luminance and contrast signals become mixed by the model, provided that the luminance channel was compressed by a non-linearity. However, we do not propose that this emergent integration necessarily has any neural counterpart. Here the model was simply used as a tool to discover and characterize dependencies present in the data.

**I**of 2048 × 2048 pixels). Then, we painted disks of uniform intensity on the canvas so that each new disk was always painted behind the currently visible disks if there was overlap. This process was iterated until all untouched pixels were replaced by surface of some disk. The disks were placed randomly with random radius and random intensity. The disk placement was uniformly random and the disk radii sampled from 1/

*r*

^{3}distribution as in (Lee et al., 2001), with

*r*between 3 and 1024 pixels. The disk intensities were sampled from an exponential distribution with parameter

*μ*= 1. Finally, to avoid border effects, the central 1024 × 1024 pixel aperture of the canvas was taken as the final image.

*L*

_{0}. Second, contrast of a spatial location in the middle is dependent on the difference of luminances of two points that flank the middle location. Third, the RMSL channel can again be predicted from the LUM channel. Here the average correlation coefficient of the approximation to the actual channel was

*c*= 0.98, when mask radius

*p*= 8 was used. Finally, independent component analysis results in similar paired features on both channels where both features are share their spatial localization and orientation (not shown).

*p*= 8 in Equation 1. In Figure 8A, source images were from the dead leaves model, and in Figure 8B, they were from the two-tone model. Clear dependency between pointwise LUM and RMSL can be seen in the two-toned case. To some extent, this same behavior would also exist in natural images thresholded to two levels, suggesting the approximately continuous scale of possible intensities to be important in making pointwise luminance and contrast appear independent.

*p*= 8) could be more easily predicted from LUM in the dead leaves images (mean

*c*= 0.98) than in natural images (mean

*c*= 0.80, in Figure 3A) also hints at this direction. Yet, it is possible that some other correlate based on the LUM channel could match RMSL better in natural images than the simple application of RMSL we used. Our current results suffice to point out that simple image properties may play a statistically distinctive role in making local luminance and contrast dependent over natural images in general.