The horizontal direction is important not only because disparity components in that direction are largest, most common, sufficient, and usually necessary for stereo depth perception, but also for strictly geometric reasons. Ideally, given a pair of fixating eyes, one retinal image of a point in the world will have a counterpart on the other retina that lies within a highly constrained set of positions. Superimposing the retinas would show that the corresponding points lie somewhere along a particular line. Where along this line they lie depends on the point's location in depth; the distance and direction between the image locations are a continuous function of the point's depth relative to the plane of fixation. Thus, this epipolar line, usually identified as “horizontal,” specifies possible true-match retinal locations. In theory, this is important in making retinal correspondence a one-dimensional problem, greatly simplifying the search process aimed at finding a true match. Still, epipolar lines vary with eye position. In practice, therefore, imperfect knowledge of eye position limits the benefits of the epipolar constraint (
Backus, Banks, van Ee, & Crowell, 1999;
Erkelens & Collewijn, 1985;
Regan, Erkelens, & Collewijn, 1986;
Schreiber, Crawford, Fetters, & Tweed, 2001;
van Ee & van Dam, 2003) and is one source of vertical disparities. Other sources include ocular misalignment and differential perspective caused by the greater image size on the retina nearest the image source (
Howard & Rogers, 2002). Vertical disparities have been interpreted both as noise and as subserving ancillary viewing-related functions. These functions include not only ocular alignment but also those that constructively mold our sense of depth. Theoretically, this is a matter of compensating horizontal disparities for viewing geometry, most notably in the perception of surface slant and the extraction of distance information (
Backus et al., 1999;
Banks & Backus, 1998;
Banks, Hooge, & Backus, 2001;
Bishop, 1989;
Brenner, Smeets, & Landy, 2001;
Gårding, Porrill, Mayhew, & Frisby, 1995;
Gillam & Lawergren, 1983;
Longuet-Higgins, 1982;
Mayhew, 1982;
Mayhew & Longuet-Higgins, 1982). Vertical disparity by itself can produce the perception of stereoscopic depth (
Matthews, Meng, Xu, & Qian, 2003;
Ogle, 1938;
Westheimer & Pettet, 1992), but only under restricted conditions; in general, it does not (
Ogle, 1964). The exceptions are mediated by stimuli or detectors that are functionally one-dimensional and able to convey oblique disparities.
The epipolar constraint is a standard theoretical underpinning of biological vision models and artificial vision algorithms. Its application to biological vision, however, might be an over-idealization. The epipolar constraint is built on the geometry of points. A true match along an epipolar line is a match between corresponding geometric points, but geometric points have limited utility for understanding visual processes. They neglect object structure and image redundancies, whose recognition has had a major influence on the study of stereo vision (
Julesz, 1971;
Marr & Poggio, 1976). The geometry of lines and edges is more pertinent (e.g.,
McKee, 1983), although in some ways it complicates the picture. With lines and edges comes orientation, and with orientation, either of the stimulus or the receptive field, comes the stereo aperture problem (
Farell, 1998;
Morgan & Castet, 1997). One point on a line or edge is much the same as another. Changes in disparity in a direction parallel to the orientation of the line or edge are therefore detectable at the endpoints but not in between. This makes alternative pointwise binocular correspondences possible, blurring the distinction between true and false matches and introducing uncertainty in the direction and amplitude of disparity: the aperture problem. Which correspondence is the effective one in a particular case of biological vision becomes an important question, for which the horizontal match is not the only answer.
Figures 1 and
2 illustrate the issues. When cross-fused, the stereogram in
Figure 1A shows what is seen of an oblique line segment positioned behind a segmented occluder. The disparity of the line segment as a whole is horizontal. However, the portion visible through any of the occluder's apertures has a disparity direction that is determined by the aperture's orientation, as seen in the overlaid stereo images in
Figure 1B.
Figure 1C shows the overlaid images of another stereogram, one in which the line has a different disparity direction, vertical in this case. Within the apertures,
Figures 1B and
1C are identical. A mechanism performing local binocular matching within an aperture would detect no difference between the line with horizontal disparity and the line with vertical disparity.
The amplitude of these aperture disparities is a function of the line's orientation, the aperture orientation, and the disparities of the target stimulus and the occluder (
Farell, 1998;
Farell & Li, 2004). The overall unoccluded horizontal disparity of the line segment is carried by the line's endpoints, but elsewhere the line's disparity is locally ambiguous, being compatible with disparity directions spanning 180°. This ambiguity arises whether the disparity of the stimulus is sampled by an occluder or a neuron's receptive field. The line's disparity functions as a constraint line, which defines the set of consistent local disparities.
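The constraint-line geometry just described can be made concrete with a short numerical sketch. The contour orientation and disparity values below are illustrative choices, not measurements from the figures; the only assumption is the standard one that, away from its endpoints, a 1-D contour yields only the disparity component perpendicular to its orientation.

```python
import numpy as np

def measurable_component(disparity, line_orientation_deg):
    """Disparity component perpendicular to a 1-D contour.

    Away from the contour's endpoints, local binocular matching can
    recover only this component; the component parallel to the contour
    is ambiguous (the stereo aperture problem).
    """
    theta = np.deg2rad(line_orientation_deg)
    normal = np.array([np.sin(theta), -np.cos(theta)])  # unit normal to the contour
    return float(np.dot(np.asarray(disparity, dtype=float), normal))

# An oblique contour at 135 deg: a purely horizontal disparity and a
# purely vertical disparity of the same amplitude project onto the same
# perpendicular component, so they are locally indistinguishable.
c_horizontal = measurable_component([1.0, 0.0], 135)
c_vertical = measurable_component([0.0, 1.0], 135)

# The constraint line: sliding the disparity vector along the contour's
# direction leaves the measurable component unchanged.
u = np.array([np.cos(np.deg2rad(135)), np.sin(np.deg2rad(135))])
c_slid = measurable_component(np.array([1.0, 0.0]) + 2.5 * u, 135)
```

Every disparity vector on the constraint line produces the same local measurement, which is one way of seeing why a horizontal-disparity and a vertical-disparity version of an oblique line can be identical within an aperture.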
In
Figure 1, only the match made within the horizontal aperture would be counted as a true match in a conventional inventory of image statistics. The others would be counted as false matches—matches between similar image features that arise from different environmental sources. The properties of aperture disparities differ from those of conventional true-match disparities. They are unconstrained in amplitude and direction, have a flat distribution across retinal locations, and can have a horizontal disparity component of polarity opposite to that of the global stimulus. Aperture disparities are generally non-veridical depth cues. Stimuli that are effectively one-dimensional at a local level are the most favorable for the occurrence of fusible aperture disparities.
In cluttered environments, such as the arboreal habitats of many primate species, locally 1-D stimuli—from rod-like shapes such as branches, from the edges of objects, from shadows and their edges—are common and, presumably, so too are aperture disparities of the sort shown in
Figure 1 (see
Mitsudo, Sakai, & Kaneko, 2013).
Two-dimensional patterns are a related source of aperture-like disparities. This can be demonstrated by optically summing a pair of sinusoidal gratings with not-too-different spatial frequencies. A vertical (90°) grating with horizontal disparity might be added to a grating oriented off-vertical by 30°, say. The second grating might have zero disparity, so vertical disparity is found in neither grating. Despite their disparity difference, the gratings will not be seen in separate depth planes. What will be seen instead is a depth-coherent plaid (
Adelson & Movshon, 1984;
Delicato & Qian, 2005;
Farell, 1998;
Quaia, Sheliga, Optican, & Cumming, 2013) having a disparity direction of +60° or –120°, depending on the polarity of the disparity of the vertical grating (
Farell & Li, 2004).
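The plaid's disparity direction follows from an intersection-of-constraints computation, which the sketch below makes explicit. Each grating's orientation and perpendicular disparity define one linear constraint on the 2-D pattern disparity; the function and the numerical values are illustrative assumptions, not code from the cited studies.

```python
import numpy as np

def plaid_disparity(orientations_deg, perp_disparities):
    """Intersection-of-constraints disparity for two summed gratings.

    A grating at orientation theta whose perpendicular disparity is p
    constrains the pattern disparity v to satisfy v . n = p, where n is
    the grating's unit normal. Two gratings give a 2 x 2 linear system.
    """
    normals = [[np.sin(np.deg2rad(t)), -np.cos(np.deg2rad(t))]
               for t in orientations_deg]
    v = np.linalg.solve(np.array(normals), np.array(perp_disparities, dtype=float))
    return v, float(np.rad2deg(np.arctan2(v[1], v[0])))

# A vertical grating (90 deg) carrying a small horizontal disparity,
# summed with a zero-disparity grating oriented 30 deg off vertical:
_, direction_pos = plaid_disparity([90, 60], [0.1, 0.0])   # +60 deg
_, direction_neg = plaid_disparity([90, 60], [-0.1, 0.0])  # -120 deg
```

With these orientations the solution's direction is +60° for one polarity of the vertical grating's disparity and –120° for the other, matching the directions stated above.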
Figure 2 illustrates another influence of 1-D components on 2-D pattern disparity. It sketches a pair of 2-D stimuli (plaids, each composed of two schematic gratings, with one pair oriented at 75° and 105° and the other at 30° and 150°). Each appears doubled, with left- and right-eye views superimposed. This shows that both plaids, like the line segment in
Figure 1A, have horizontal disparity,
D and 1.93
D in these cases. We could measure threshold disparity for these plaids, the smallest amplitude that can be reliably distinguished from zero. If we scaled the disparities of the two stimuli proportionally, we might expect to find a point where the plaid on the right was seen in depth relative to the background, whereas the plaid on the left would not differ perceptibly from the background depth. Threshold might be 1.5 times
D, say; one plaid's disparity would be above threshold and the other's below. As will be discussed later, this point would not be found. Both plaids would be at threshold when their horizontal disparities differ by nearly a factor of two, as illustrated in the figure (
Farell, 2003). What is constant at threshold is not horizontal disparity, but rather ϕ, the phase disparity of the 1-D components (and not because the disparity of only one component had been detected). These component threshold disparities are equivalent to the thresholds of the individual gratings that make up the plaids. Horizontal disparity does matter for stereo depth of 2-D stimuli at suprathreshold levels. At threshold, though, the question is, in light of
Figure 1, what is the disparity of 2-D stimuli?
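The near-factor-of-two difference between the plaids' threshold horizontal disparities follows directly from component geometry. The sketch below uses the schematic orientations given above (75°/105° and 30°/150°) and an arbitrary unit component disparity; it is a geometric illustration of the ratio, not the threshold model itself.

```python
import numpy as np

def pattern_disparity(orientations_deg, perp_disparity):
    """2-D disparity of a plaid whose two components each carry the
    same perpendicular disparity (intersection of constraints)."""
    normals = [[np.sin(np.deg2rad(t)), -np.cos(np.deg2rad(t))]
               for t in orientations_deg]
    p = np.array([perp_disparity, perp_disparity])
    return np.linalg.solve(np.array(normals), p)

# Equal component disparities in the two schematic plaids:
v_steep = pattern_disparity([75, 105], 1.0)    # components near vertical
v_shallow = pattern_disparity([30, 150], 1.0)  # components near horizontal

# Both pattern disparities are horizontal (zero vertical component),
# but their amplitudes differ by cos(15 deg) / cos(60 deg) ~ 1.93.
ratio = v_shallow[0] / v_steep[0]
```

Equal component disparities thus correspond to horizontal pattern disparities in the ratio of about 1.93, which is why equating the plaids' horizontal disparities does not equate their distances from threshold.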
Figure 1 uses 1-D contours to illustrate the aperture problem in stereo vision.
Figure 2 raises the question of the influence of 1-D components and their aperture problem on the processing of 2-D stimulus disparity. The aperture problem highlights the potential ambiguity of disparity direction and its dependence on orientation, notwithstanding the epipolar constraint. Psychophysical data considered below suggest that both perpendicular and horizontal disparities are used in computations of relative disparity, and that their use differs between 1-D and 2-D stimuli. In what follows, we explore these issues and how they bear on the horizontal-centric view of disparity processing, drawing on both psychophysical and physiological evidence. The close connection with motion processing—in particular, the component versus pattern motion distinction (
Adelson & Movshon, 1982;
Rust, Mante, Simoncelli, & Movshon, 2006)—will be evident. Motion is largely isomorphic with stereo (e.g.,
Marr, 1982;
Qian & Andersen, 1997), the crucial difference being that frontoparallel motion is basically isotropic and disparity-derived depth is decidedly anisotropic. This difference shapes how disparity is processed.
Together with data to be discussed later, the illustrations in
Figures 1 and
2 point to the potential of stimulus and receptive-field orientation to influence the effective disparity direction. A considerable number of studies, which we first consider in overview, address the contribution of receptive-field orientation. To preview: despite this number, the physiological data converge on limited trends rather than a consensus.