A pioneering study by J. M. Harris and A. J. Parker (1995) found that disparity judgments using random-dot stereograms were better for stimuli composed of mixed bright and dark dots than for stimuli in which the dots were all bright or all dark. They attributed this to an improvement in stereo correspondence. This result is hard to explain within current models of how stereo correspondence is achieved. However, their experiment varied task difficulty by adding disparity noise. We wondered if this might challenge mechanisms subsequent to the solution of the correspondence problem rather than the mechanisms that solve the correspondence problem itself. If so, there would be no need to modify current models of stereo correspondence. We therefore repeated Harris and Parker's experiment using interocular decorrelation to vary task difficulty. This technique is believed to probe stereo correspondence more specifically. We observed the efficiency increase reported by Harris and Parker for mixed-polarity dots both using their original technique of disparity noise and using interocular decorrelation. We show that this effect cannot be accounted for by the stereo energy model or by simple modifications of it. Our results confirm Harris and Parker's original conclusion that mixed-polarity dots specifically benefit stereo correspondence and highlight the challenge this poses to current models of that process.

^{2}, white was 49 cd/m^{2}, and black was 0.7 cd/m^{2}.

*N* = 396 except where stated. There were 3 contrast conditions: mixed polarity (equal numbers of black and white dots, Figure 2a), all black, and all white (Figure 2b). Observers completed runs consisting of 600 trials in total, made up of 200 all-black, 200 all-white, and 200 mixed-polarity trials, randomly interleaved. The task was always to decide which side of the stimulus, left or right, was closer to the observer. Subjects were allowed to view the stimulus for as long as they wanted; the stimulus only advanced once a response was reported via mouse button press.

*C*, a fraction (1 − *C*) of the dots were placed at independent locations in the two eyes (still within the 106 × 106 arcmin extent of the stimulus). The remaining *CN* dots were placed in identical vertical locations on the screen, with horizontal disparity randomly drawn from a Gaussian distribution about the mean, with standard deviation *σ*. We did not slope the stimulus disparity back to zero at the edges of the screen, as Harris and Parker did, meaning that our stimulus offered a monocular cue to depth. In practice, this cue was much less helpful than the disparity cue, and the fact that we obtained the same results as Harris and Parker indicates that this difference in the stimulus was not important.

*C*), we picked a random cyclopean position (*x*_{c}, *y*), with *x*_{c} and *y* drawn independently with uniform probability from the range [−*W*, *W*], where *W* was the stimulus half-width and (0, 0) was the center of the stimulus. We gave the dot disparity *δ*, drawn at random from a Gaussian distribution with mean ± sign(*x*_{c}) × Δ/2 and standard deviation *σ*. The ± controls whether the left side or right side is nearer; on each trial, either + or − was chosen at random. We use a sign convention in which near disparity is negative. After adding this disparity, the dot's position in the left and right eyes was (*x*_{L}, *y*_{L}) = (*x*_{c} − *δ*/2, *y*) and (*x*_{R}, *y*_{R}) = (*x*_{c} + *δ*/2, *y*), respectively. If the dot was not binocularly correlated (probability 1 − *C*), we drew (*x*_{L}, *y*_{L}) and (*x*_{R}, *y*_{R}) independently from [−*W*, *W*].

*x*_{L}, *y*_{L}) and the centers of all dots already in place in the left eye and similarly for the right eye. If any of these distances was less than the dot diameter, we rejected that position and generated a new pair (*x*_{L}, *y*_{L}) and (*x*_{R}, *y*_{R}). We repeated this process until a non-overlapping position had been found. For the mixed-polarity condition, we finally chose the dot's color, black or white with equal probability. This process was repeated until all *N* dots had been placed in the stimulus. Figure 2 shows two sample stimuli.
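The dot-placement procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `make_stimulus`, its parameter names, and the `polarity` argument are hypothetical, each dot is treated as correlated with probability *C*, and overlap is taken to mean a center-to-center distance below one dot diameter.

```python
import math
import random


def make_stimulus(n_dots, corr, half_width, delta, sigma, dot_diam,
                  polarity="mixed", rng=None):
    """Place n_dots non-overlapping dots; return (left, right, colors, side_sign).

    left/right are lists of (x, y) dot centers in each eye, colors holds
    +1 (white) or -1 (black), and side_sign is the trial's random +/- choice.
    All names here are illustrative assumptions.
    """
    rng = rng or random.Random()
    side_sign = rng.choice((-1, 1))  # the "+/-" picked once per trial
    left, right, colors = [], [], []

    def uniform_pos():
        return (rng.uniform(-half_width, half_width),
                rng.uniform(-half_width, half_width))

    def overlaps(p, placed):
        # Overlap: center-to-center distance smaller than one dot diameter.
        return any(math.dist(p, q) < dot_diam for q in placed)

    while len(left) < n_dots:
        if rng.random() < corr:
            # Correlated dot: cyclopean position plus a noisy disparity.
            xc, y = uniform_pos()
            sgn = (xc > 0) - (xc < 0)
            d = rng.gauss(side_sign * sgn * delta / 2, sigma)
            p_left, p_right = (xc - d / 2, y), (xc + d / 2, y)
        else:
            # Uncorrelated dot: independent positions in the two eyes.
            p_left, p_right = uniform_pos(), uniform_pos()
        if overlaps(p_left, left) or overlaps(p_right, right):
            continue  # reject and draw a new pair of positions
        left.append(p_left)
        right.append(p_right)
        if polarity == "mixed":
            colors.append(rng.choice((-1, 1)))
        else:
            colors.append(1 if polarity == "white" else -1)
    return left, right, colors, side_sign
```

Rejection sampling of this kind slows down as the display fills, so very dense stimuli or large dot diameters may need many attempts per dot.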

*s*, reflecting both the externally applied disparity noise *σ* and any internal noise. Finally, we assume that human observers manage to average only some fraction *E* of the available dots, where *E* is by definition the statistical efficiency of the observer. If the observer averages the disparity of *M* dots, each of which has disparity drawn from the normal distribution *N*(Δ/2, *s*), the resulting average disparity is drawn from *N*(Δ/2, *s*/√*M*), where Δ is the relative disparity between the two surfaces separated by the disparity step. The distributions of this average signal on either side of the depth boundary are shown in Figure 3. We assume that the observer judges which side is closer by comparing the two numbers drawn from these distributions and assigns the closer side to be that with the smaller number. The observer's probability of getting the correct answer is then

*s* is an unknown function of *σ* and erf is the Gaussian error function or probability integral. We can rearrange this to derive the number of dots that the observer is using from each side of the depth boundary, given their performance level *P*:

*N*, correlation *C*, relative disparity Δ, and disparity noise *σ* are all the same. If we average trials in the two same-contrast conditions, the efficiency ratio is then

*R*. For each condition (B, W, BW), *P*_{condition} is defined as *n*_{correct}/*n*_{trials}. We used the binornd function in Matlab to generate a new *n*_{resamp} from a binomial distribution with parameters *P*_{condition} and *n*_{trials}. This produced a new estimate of *P*_{condition}, *P*_{resamp} = *n*_{resamp}/*n*_{trials}. We did this for each condition so as to arrive at a new *R*_{resamp}. We repeated this 10,000 times and took the 95% confidence limits to be the 2.5% and 97.5% percentiles of the resulting set of *R*_{resamp}.
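The resampling procedure can be sketched in Python as follows. Several things here are assumptions rather than the authors' code. The averaging model is taken to imply *P* = ½[1 + erf(Δ√*M*/(2*s*))], so that *M* is proportional to [erfinv(2*P* − 1)]^{2} and the unknown *s* and Δ cancel in the ratio. The two same-contrast conditions are pooled with equal trial counts. The function names are hypothetical, and Python's standard library stands in for Matlab's binornd.

```python
import math
import random
from statistics import NormalDist


def erfinv(y):
    # Inverse error function via the standard normal quantile function.
    return NormalDist().inv_cdf((y + 1) / 2) / math.sqrt(2)


def efficiency_ratio(p_b, p_w, p_bw):
    """Ratio of dots used, mixed vs. same polarity (assumed model, see text)."""
    p_same = (p_b + p_w) / 2  # pooled, assuming equal trial counts
    # Assumes performance above chance; p_same == 0.5 would divide by zero.
    return (erfinv(2 * p_bw - 1) / erfinv(2 * p_same - 1)) ** 2


def bootstrap_ci(n_correct, n_trials, n_boot=10_000, seed=0):
    """95% limits on the efficiency ratio from binomial resampling.

    n_correct maps the condition labels "B", "W", "BW" to correct counts.
    """
    rng = random.Random(seed)

    def resample(k):
        p = k / n_trials
        n_resamp = sum(rng.random() < p for _ in range(n_trials))
        # Keep the proportion strictly inside (0, 1) so erfinv is defined.
        return min(max(n_resamp / n_trials, 0.5 / n_trials),
                   1 - 0.5 / n_trials)

    ratios = sorted(
        efficiency_ratio(resample(n_correct["B"]),
                         resample(n_correct["W"]),
                         resample(n_correct["BW"]))
        for _ in range(n_boot)
    )
    return ratios[int(0.025 * n_boot)], ratios[int(0.975 * n_boot) - 1]
```

For example, `bootstrap_ci({"B": 160, "W": 160, "BW": 176}, 200)` resamples a hypothetical run with 80% correct in each same-polarity condition and 88% correct in the mixed condition.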

*R* is constant independent of the stimulus parameters, we can ask how we should adjust task difficulty in order to maximize the difference in performance between same- and mixed-polarity conditions. Suppose that *P*_{same} is the proportion correct in both same-contrast conditions (*P*_{B} = *P*_{W} = *P*_{same}), and *P*_{BW} is the proportion correct in the mixed-contrast condition. Define *P*_{mean} = (*P*_{BW} + *P*_{same}) / 2 and Δ*P* = *P*_{BW} − *P*_{same}. Then

*P* satisfying Equation 5 varies as a function of *P*_{mean}, for 4 sample values of *R*. Obviously, larger performance differences are possible when the efficiency advantage of mixed-polarity dots is greater. However, over a very wide range of *R* including the value of ∼2 reported by Harris and Parker, the difference is maximized if mean performance is around 83–85%. In the experiments reported below, therefore, we tried to adjust the stimulus parameters for individual subjects so as to set mean performance at about this level. Thus, as well as the results below, each subject initially collected a small amount of pilot data at a few different difficulty levels, starting with the zero-noise, 100% correlation condition in order to familiarize them with the task.
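The claim that the performance difference peaks at a mean of roughly 83–85% correct can be checked numerically. The sketch below assumes, as an interpretation of the averaging model, that *P*_{same} = ½[1 + erf(*a*)] and *P*_{BW} = ½[1 + erf(*a*√*R*)] for some difficulty parameter *a*. The function names and the sample values of *R* are illustrative choices, not the four values used in the paper.

```python
import math


def operating_point(a, ratio):
    """Return (P_mean, delta_P) for difficulty parameter a and efficiency ratio."""
    p_same = 0.5 * (1 + math.erf(a))
    p_bw = 0.5 * (1 + math.erf(a * math.sqrt(ratio)))
    return (p_same + p_bw) / 2, p_bw - p_same


def best_operating_point(ratio, steps=4000, a_max=3.0):
    """Grid search over a for the point with the largest delta_P."""
    points = (operating_point(a_max * i / steps, ratio)
              for i in range(1, steps + 1))
    return max(points, key=lambda pt: pt[1])


for r in (1.5, 2.0, 3.0, 4.0):
    p_mean, dp = best_operating_point(r)
    print(f"R = {r}: largest difference {dp:.3f} at mean performance {p_mean:.3f}")
```

A grid search is used rather than a closed-form optimum so that the same code works for any assumed relationship between *P* and *a*.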

*ϕ* = *π*/2) and cosine phases (*ϕ* = 0), with a carrier frequency of *f* = 0.025 cycles per pixel and an envelope standard deviation of *σ* = 10 pixels:

*x*_{0}, in the left and right eyes.

*I*_{L}(*x*, *y*) and *I*_{R}(*x*, *y*), with odd and even receptive fields:

- A tuned-excitatory ODF complex cell: *C*_{ODF:TE}(Δ*X*) = (*v*_{LE} + *v*_{RE})^{2} + (*v*_{LO} + *v*_{RO})^{2}
- A near ODF simple cell: *C*_{ODF:NE}(Δ*X*) = (*v*_{LO} − *v*_{RE})^{2}
- A tuned-excitatory RPC complex cell: *C*_{RPC:TE}(Δ*X*) = ([*v*_{LE}] + [*v*_{RE}])^{2} + ([*v*_{LO}] + [*v*_{RO}])^{2}
- A near RPC simple cell: *C*_{RPC:NE}(Δ*X*) = ([*v*_{LO}] − [*v*_{RE}])^{2}.
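The four model units listed above translate directly into code. In the sketch below, the square brackets are assumed to denote half-wave rectification of each monocular response before binocular combination; that reading is not stated in the surrounding text and should be treated as an assumption. The function names are hypothetical, and the inputs are the monocular inner products *v*_{LE}, *v*_{RE}, *v*_{LO}, and *v*_{RO}.

```python
def halfwave(v):
    # Assumed meaning of [v]: half-wave rectification.
    return max(v, 0.0)


def odf_te(v_le, v_re, v_lo, v_ro):
    """Tuned-excitatory ODF complex cell."""
    return (v_le + v_re) ** 2 + (v_lo + v_ro) ** 2


def odf_ne(v_lo, v_re):
    """Near ODF simple cell."""
    return (v_lo - v_re) ** 2


def rpc_te(v_le, v_re, v_lo, v_ro):
    """Tuned-excitatory RPC complex cell (rectify, then combine)."""
    return ((halfwave(v_le) + halfwave(v_re)) ** 2
            + (halfwave(v_lo) + halfwave(v_ro)) ** 2)


def rpc_ne(v_lo, v_re):
    """Near RPC simple cell (rectify, then combine)."""
    return (halfwave(v_lo) - halfwave(v_re)) ** 2
```

Under this reading the two families differ only when a monocular response is negative: for `v_le = 1` and `v_re = -1` with zero odd responses, the ODF unit's even term cancels, whereas the RPC unit still responds to the left eye alone.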

*σ* constant at 3 arcmin as used by Harris and Parker and varied the size of the disparity step so as to obtain average performance at around 0.8 correct.

*σ* = 0) but by reducing the interocular correlation (so now *C* < 1). We also made the stimulus dynamic, changing every 150 ms. As outlined above, we argued that this stimulus might challenge stereo correspondence more specifically than the disparity-noise stimulus, which is still difficult even when correspondence is perfect. We therefore felt it important to verify that mixed-polarity stimuli continue to have an advantage in this configuration.

*N* = 100). The efficiency ratios are shown in Figure 10. The efficiency ratios are somewhat reduced for the smaller dot number but are still well above 1, especially in the decorrelated stimulus.

*I*_{L} and *I*_{R} (Equation 7). In Figure 11, the gray background was represented by 0, white dots by +1, and black dots by −1. This means that the mixed-polarity images have near-zero DC component and thus that their autocorrelation is zero for large offsets, as is usually assumed for random-dot stereograms (Prince, Pointon, Cumming, & Parker, 2002; Read & Cumming, 2003; Read et al., 2002). This in turn means that the amplitude of the modulation in firing rate as a function of disparity is equal to the baseline response, defined as the mean response to disparities far from the preferred disparity or equivalently to binocularly uncorrelated stimuli (strictly, it is equal to the baseline only for tuned-excitatory or very narrow-band cells; the ratio is slightly less than 1 for odd-symmetric finite-bandwidth cells). For the same-contrast stimuli in Figure 11, the image functions were positive or zero everywhere. Their DC component and, hence, autocorrelation were always positive. This means that the baseline response greatly increases relative to the amplitude of modulation. As we show in Appendix A, the ratio of amplitude to baseline is always maximized if the images have zero DC component. A non-zero DC component, either positive or negative, reduces the amplitude. The amplitude measured with all-white or all-black dots is therefore lower than when measured with mixed black-and-white dots.
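The effect of the DC component can be illustrated with a small Monte Carlo simulation. The sketch below is a deliberately simplified stand-in for the simulations behind Figure 11, not a reproduction of them: it uses one-dimensional pixel-noise images, a single cosine-phase Gabor receptive field shared by both eyes (with the carrier frequency and envelope width quoted earlier), and an energy unit of the form (*v*_{L} + *v*_{R})^{2}. The dot density of 0.25 and all function names are assumptions.

```python
import math
import random

F, SIGMA, HALF = 0.025, 10.0, 40  # cycles/pixel, envelope SD, RF half-size (pixels)
# Cosine-phase (even) 1-D Gabor receptive field, used for both eyes.
RF = [math.exp(-x * x / (2 * SIGMA ** 2)) * math.cos(2 * math.pi * F * x)
      for x in range(-HALF, HALF + 1)]


def random_image(rng, polarity, density=0.25):
    """1-D pixel row: 0 = gray background, +1 = white dot, -1 = black dot."""
    def pixel():
        if rng.random() >= density:
            return 0.0
        if polarity == "mixed":
            return rng.choice((-1.0, 1.0))
        return 1.0 if polarity == "white" else -1.0
    return [pixel() for _ in RF]


def amplitude_to_baseline(polarity, n_images=3000, seed=0):
    """Estimate (correlated response - baseline) / baseline for the energy unit."""
    rng = random.Random(seed)
    corr = uncorr = 0.0
    for _ in range(n_images):
        v1 = sum(w * p for w, p in zip(RF, random_image(rng, polarity)))
        v2 = sum(w * p for w, p in zip(RF, random_image(rng, polarity)))
        corr += (v1 + v1) ** 2    # same image in both eyes (preferred disparity)
        uncorr += (v1 + v2) ** 2  # independent images (baseline response)
    return (corr - uncorr) / uncorr
```

With these assumptions, `amplitude_to_baseline("mixed")` should come out close to 1, while the `"white"` and `"black"` conditions should give a clearly smaller ratio, because their non-zero mean raises the baseline without adding to the disparity-dependent term.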

*v*_{L} and *v*_{R} are the inner product of each image with that eye's receptive field, as in Equation 7. To compute the average response of this unit over many random images, first consider the average of the last term:

*ρ*_{L} and *ρ*_{R} represent the receptive fields (for this proof, they need not be Gabor functions). For uniform disparity stimuli in which the left and right images are identical apart from a horizontal offset of Δ*x*, this becomes

*μ*. That is, we can write

*ɛ* is a random variable, picked independently for each *x* and *y* from a distribution with zero mean. For example, for a random-dot pattern with black and white dots, *μ* represents the luminance of the background and *ɛ* has three peaks: a peak at 0 for background pixels and symmetric peaks on either side of zero for the black and white dots. For an all-white dot pattern with more background pixels than dots, *μ* is slightly higher than the luminance of the background, and *ɛ* has two peaks arranged asymmetrically about 0: a small peak at a positive value, representing the white dots, and a larger peak at a negative value closer to zero, representing the gray background.

*x* ≠ *x*′ − Δ*x* or *y* ≠ *y*′), the values of *ɛ* are uncorrelated, and so the product of the images averages to *μ*^{2}. For corresponding points (i.e., where *x* = *x*′ − Δ*x* and *y* = *y*′), the values of *ɛ* are identical, and so there we pick up an additional term that depends on the variance of *ɛ*:

*v*_{R}^{2}〉. Using these results, we can write the mean energy-model response as

*ɛ*^{2}〉*B*(Δ*x*). In the uninteresting case where the images are blank, 〈*ɛ*^{2}〉 = 0 and so there is no disparity modulation. Otherwise, the amplitude of the disparity tuning curve relative to the baseline is

*x*_{pref} is defined as the disparity that maximizes the magnitude of the disparity-modulated term.

*L*, *R*, *M*, and Δ*x*_{pref} all depend only on the particular receptive field functions, i.e., the properties of the neuronal population encoding disparity. The only term that depends on the image statistics is *μ*^{2}/〈*ɛ*^{2}〉. This term is multiplied by (*L* + *R*), the integral of the receptive field functions. For the special case of odd-symmetric or very narrow-band cells, this integral is zero. In this case, the amplitude ceases to depend on the image statistics and is simply *A* = *B*(Δ*x*_{pref})/*M*. Where the integral (*L* + *R*) is non-zero, it is clear by inspecting Equation A11 that *A* is maximized when the image has no DC component, i.e., *μ* = 0. Then, *A* = *B*(Δ*x*_{pref})/*M*. Any non-zero value of *μ* reduces *A*, the amplitude of the disparity-modulated response. This is the reason for the difference between the mixed- and same-polarity stimuli in Figure 11.
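The chain of reasoning in this appendix can be summarized in a compact worked form. The block below is a sketch under stated assumptions, not the appendix's numbered equations: it treats the images as discrete pixels with independent noise, takes the right image to be the left image shifted by Δ*x*, and defines *M* and *B* (including the factor of 2 in *B*) so that the result reduces to *A* = *B*(Δ*x*_{pref})/*M* when *μ* = 0. In this normalization the image-dependent term is multiplied by the square of (*L* + *R*).

```latex
\begin{align}
L &= \sum_{x,y} \rho_L(x,y), \quad
R = \sum_{x,y} \rho_R(x,y), \quad
M = \sum_{x,y} \left[ \rho_L^2(x,y) + \rho_R^2(x,y) \right], \\
B(\Delta x) &= 2 \sum_{x,y} \rho_L(x,y)\, \rho_R(x + \Delta x,\, y), \\
I_L(x,y) &= \mu + \varepsilon(x,y), \qquad I_R(x,y) = I_L(x - \Delta x,\, y), \\
\langle I_L(x,y)\, I_R(x',y') \rangle &=
  \begin{cases}
    \mu^2 + \langle \varepsilon^2 \rangle, & x = x' - \Delta x \text{ and } y = y', \\
    \mu^2, & \text{otherwise},
  \end{cases} \\
\langle v_L v_R \rangle &= \mu^2 L R + \tfrac{1}{2} \langle \varepsilon^2 \rangle B(\Delta x), \\
\langle v_L^2 \rangle + \langle v_R^2 \rangle &= \mu^2 \left( L^2 + R^2 \right) + \langle \varepsilon^2 \rangle M, \\
\langle (v_L + v_R)^2 \rangle &= \mu^2 (L + R)^2 + \langle \varepsilon^2 \rangle \left[ M + B(\Delta x) \right], \\
A &= \frac{\langle \varepsilon^2 \rangle B(\Delta x_{\mathrm{pref}})}
          {\mu^2 (L + R)^2 + \langle \varepsilon^2 \rangle M}
   = \frac{B(\Delta x_{\mathrm{pref}})}{M + (L + R)^2\, \mu^2 / \langle \varepsilon^2 \rangle}.
\end{align}
```

Because the term containing *μ* appears only in the denominator and is never negative, any non-zero *μ* can only lower *A*, and it has no effect when *L* + *R* = 0.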