We manually selected 48 grayscale stereoscopic outdoor scenes (Hoyer & Hyvärinen, 2000) that contained mountains, trees, water, rocks, bushes, etc., but avoided manmade objects (examples of these images are shown in Figure 1). We did not have ground truth disparity data for these scenes. Instead, we relied on a simple yet biologically inspired method to approximate the ground truth disparities. Models of binocular complex neurons (Anzai, Ohzawa, & Freeman, 1999; Fleet, Wagner, & Heeger, 1996; Ohzawa, DeAngelis, & Freeman, 1990; Qian, 1994) commonly contain a cross-correlation term. For example, a binocular complex cell's response can be expressed as the sum of the squares of two quadrature simple cell responses, while each simple cell response sums the dot products of the receptive field and the image from both eyes:
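The display equations did not survive in this copy of the text. A form consistent with the surrounding definitions, assuming the standard binocular energy model, might read as follows. The symbol $C$ for the complex cell response and the numbering of the first two equations are assumptions; only Equation 3 is referenced by number in the text.

```latex
\begin{align}
r_1 &= S_{LE} + S_{RE} \tag{1} \\
r_2 &= S_{LO} + S_{RO} \tag{2} \\
C   &= r_1^2 + r_2^2
     = S_{LE}^2 + S_{RE}^2 + S_{LO}^2 + S_{RO}^2
       + 2 S_{LE} S_{RE} + 2 S_{LO} S_{RO} \tag{3}
\end{align}
```

Expanding the squares shows where the interocular products come from: the first four terms depend on one eye each, and only the last two terms combine the left and right images.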
Here $S_{AB}$ denotes the inner products between the left- or right-eye images and the even- or odd-symmetric simple cell receptive fields, where $A \in \{L, R\}$ indicates the left or right eye and $B \in \{E, O\}$ indicates even or odd symmetry. The responses of the even-symmetric and odd-symmetric binocular simple cells (a quadrature pair) are denoted by $r_1$ and $r_2$, respectively. In Equation 3, the last two terms, $S_{LE} S_{RE}$ and $S_{LO} S_{RO}$, are the cross-correlations between the band-pass-filtered left and right images. In the model, the output of a binocular complex neuron depends largely on these terms. Aside from neurophysiological evidence, psychophysical studies (Cormack, Stevenson, & Schor, 1991) have also shown that interocular correlation is a decisive factor in stereopsis. Furthermore, local correlations have been used extensively to resolve correspondences in numerous computational stereo algorithms (for a good review, see Brown, Burschka, & Hager, 2003; Scharstein & Szeliski, 2002). Inspired by this neurobiological basis of stereopsis, Filippini and Banks (2009) built a local correlation stereopsis model. They conducted psychophysical experiments with human observers and compared the human results with the results of their model under the same experimental setup. Using this model, they explained two well-known constraints of human stereopsis: the disparity-gradient limit, which is the inability to perceive depth when the change in disparity within a region is too large, and the limit of stereoresolution, which is the inability to perceive spatial variations in disparity that occur at too fine a spatial scale.
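The idea of resolving correspondence by local correlation can be illustrated with a minimal sketch. This is not the Filippini and Banks (2009) implementation: the function name `local_correlation_disparity`, the square window, the integer search range, and the sign convention for disparity are all assumptions made for illustration, and the sketch correlates raw pixel intensities rather than band-pass-filtered images.

```python
import numpy as np


def local_correlation_disparity(left, right, row, col,
                                half_window=5, max_disparity=8):
    """Return (disparity, correlation) at pixel (row, col).

    Hypothetical helper: compares a square patch of the left image with
    horizontally shifted patches of the right image and keeps the shift
    with the highest normalized cross-correlation. A positive disparity
    means the matching right-image patch lies to the right (assumed
    convention).
    """
    r0, r1 = row - half_window, row + half_window + 1
    c0, c1 = col - half_window, col + half_window + 1
    if (r0 < 0 or r1 > left.shape[0]
            or c0 - max_disparity < 0
            or c1 + max_disparity > left.shape[1]):
        raise ValueError("window or search range falls outside the image")

    # Reference patch from the left image, with its mean removed.
    ref = left[r0:r1, c0:c1].astype(float)
    ref = ref - ref.mean()

    best_d, best_corr = 0, -np.inf
    for d in range(-max_disparity, max_disparity + 1):
        # Candidate patch from the right image, shifted horizontally by d.
        cand = right[r0:r1, c0 + d:c1 + d].astype(float)
        cand = cand - cand.mean()
        denom = np.sqrt((ref ** 2).sum() * (cand ** 2).sum())
        corr = (ref * cand).sum() / denom if denom > 0 else 0.0
        if corr > best_corr:
            best_d, best_corr = d, corr
    return best_d, best_corr


if __name__ == "__main__":
    # Synthetic pair: the right image is the left image shifted 3 pixels.
    rng = np.random.default_rng(0)
    left = rng.standard_normal((64, 64))
    right = np.roll(left, 3, axis=1)
    print(local_correlation_disparity(left, right, 32, 32))
```

On a synthetic pair such as the one in the demo, the correlation peaks at the imposed shift. A larger window gives a more reliable correlation estimate but averages over any disparity variation inside it, which is the kind of trade-off that bears on the two limits described above.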