We have demonstrated quantitative depth arising from monocular regions attached to binocular surfaces. A monocular region on the temporal side of the binocular surface was perceived to be located behind the binocular surface. This is consistent with occlusion geometry, in which part of the background is hidden from one eye's view by a nearer surface (Figure 1a). The depth of temporal monocular regions closely followed the minimum depth constraint. However, in our experiments a nasal monocular region was not perceived in near depth; this would only be expected if the conditions for monocular camouflage were satisfied (Figure 1c), which was not the case for our stimuli. Instead, nasal monocular regions were perceived at the same depth as the binocular surface accompanied by a phantom occluder at a near depth, geometrically accounting for the absence of the monocular region in the other eye (see Figure 1d). The depths of the phantom surfaces were quantitative, and closely followed the depth predicted by the geometry illustrated in Figure 1d. This is the first time that quantitative depth has been measured for a phantom occluder in this form of da Vinci arrangement. Importantly, the results of Experiment 2 demonstrate a qualitative dissociation between nasal and temporal monocular regions. Temporal monocular regions appear at a far depth, whereas nasal monocular regions appear at the same depth as the binocular surface, with a phantom surface perceived in near depth. We do not believe that existing models of the processing of monocular regions in binocular depth perception can account for this combination of results. We elaborate on the issues below.
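As an illustration only, the depth predictions referred to above can be sketched numerically. The sketch assumes that the angular width of the monocular region acts as a horizontal disparity relative to the edge of the binocular surface: uncrossed for a temporal region (the minimum depth constraint) and crossed for the phantom occluder implied by a nasal region. The function name, the default interocular distance of 6.5 cm, and the small-angle approximation are assumptions introduced here, not values or formulas taken from the experiments.

```python
import math


def depth_from_monocular_width(width_arcmin, viewing_distance_cm,
                               interocular_cm=6.5, near=False):
    """Depth (cm) relative to the binocular surface, treating the angular
    width of the monocular region as a horizontal disparity.

    Assumed for illustration: small-angle geometry, symmetric viewing,
    and a 6.5 cm interocular distance.
    near=False: far depth (temporal region, minimum depth constraint).
    near=True:  near depth (phantom occluder for a nasal region).
    """
    eta = math.radians(width_arcmin / 60.0)  # disparity in radians
    d = viewing_distance_cm
    if near:
        # Crossed disparity: eta = I*z / (d*(d - z))  ->  solve for z
        return eta * d ** 2 / (interocular_cm + eta * d)
    denom = interocular_cm - eta * d
    if denom <= 0:
        raise ValueError("width too large: no finite far depth exists")
    # Uncrossed disparity: eta = I*z / (d*(d + z))  ->  solve for z
    return eta * d ** 2 / denom


if __name__ == "__main__":
    far = depth_from_monocular_width(10, 100)
    near = depth_from_monocular_width(10, 100, near=True)
    print(f"far depth: {far:.2f} cm, near depth: {near:.2f} cm")
```

For the same monocular width, the far depth is larger than the near depth, because a fixed disparity corresponds to a larger depth interval behind the reference surface than in front of it.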
Several stereo models incorporate the depth of monocular regions (e.g., Assee & Qian, 2007; Grossberg & Howe, 2003; Hayashi, Maeda, Shimojo, & Tachi, 2004; Watanabe & Fukushima, 1999). They model the situations shown in Figures 1a (occlusion) and 1b (aperture), but not those in Figures 1c (camouflage) and 1d (phantom). Although Assee and Qian (2007) illustrate the phantom resolution to refute the idea that nasal monocular regions are necessarily invalid, and it is mentioned as a possibility by Watanabe and Fukushima (1999), neither set of authors includes it in their model. In general, the models match regions of binocular texture, locate unpaired points that correspond to monocular regions, and then assign the monocular regions to the depth of the background surface, following the observation of Julesz (
1971). It is assumed that monocular regions represent part of a continuous background surface. In our experiments, temporal monocular regions are perceived behind the occluding binocular surface, but there is no background surface. Assee and Qian (
2007) develop the most physiologically plausible model of da Vinci stereopsis, but like other models theirs requires the presence of a background surface for locating monocular regions in depth. These authors recognize that this requirement creates a difficulty in explaining the depth of monocular regions when the background surface is featureless, as is the case in our stimuli. They deal with this “atypical” situation by attributing the depth to Panum's limiting case or double fusion of a monocular line with two binocular lines. Given this claim and also given that no current models attempt to explain the phantom occluder, it is worth considering matching in our stimuli, and the degree to which disparity based on matching processes can account for our results.
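The general strategy shared by these models can be caricatured in a few lines. The following is a toy sketch under stated assumptions, not an implementation of any of the cited models: a single scanline is represented as a list of matched disparities, unpaired (monocular) points are marked `None`, and larger values are taken to mean farther from the viewer. The function name and data layout are hypothetical.

```python
from typing import List, Optional


def assign_monocular_to_background(
    disparities: List[Optional[float]],
) -> List[Optional[float]]:
    """Give each run of unpaired points the disparity of the farther
    of its two matched neighbours (the assumed background surface).

    Convention assumed here: larger disparity value = farther away.
    A run with fewer than two matched neighbours has no background to
    borrow a depth from and is left unresolved (None).
    """
    filled = list(disparities)
    n = len(disparities)
    i = 0
    while i < n:
        if disparities[i] is not None:
            i += 1
            continue
        # Find the end of this run of unpaired points.
        j = i
        while j < n and disparities[j] is None:
            j += 1
        left = disparities[i - 1] if i > 0 else None
        right = disparities[j] if j < n else None
        if left is not None and right is not None:
            background = max(left, right)  # farther neighbour
            for k in range(i, j):
                filled[k] = background
        i = j
    return filled


if __name__ == "__main__":
    # Occluder (0.0), monocular run, textured background (2.0).
    print(assign_monocular_to_background([0.0, 0.0, None, None, 2.0, 2.0]))
    # Same occluder and monocular run, but no matched background.
    print(assign_monocular_to_background([0.0, 0.0, None, None]))
```

In the second call the monocular run stays unresolved, which mirrors the difficulty noted above: with a featureless background there is no matched surface from which such a scheme could take the depth of the monocular region.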
It can be seen from Figure 1d that the location of the phantom occluder in depth could be specified by two matches: (a) between the outer edge of the monocular texture (a luminance edge) and the outer edge of the binocular texture in the other eye (also a luminance edge) and (b) between the edges of the binocular texture in both eyes, only one of which is a luminance edge. This would imply a form of double matching, since the edge of the binocular texture in the eye without the monocular region is involved in both matches. However, given that one of the matches is not between luminance edges it would be an unusual form of Panum's limiting case. It should also be noted that matching of the entire vertical border is not necessary as shown in the example stereogram of Figure 10. Here a phantom occluder is still perceived for a nasal monocular region that is half the height of the binocular surface.
Although the above form of Panum's limiting case could account for the depth of the temporal monocular region, it does not account for the perception of the nasal region. In the nasal case the phantom edge is perceived as near (a possible form of double matching as described above), but the monocular texture is separated from it and seen at the depth of the binocular surface. It is clear that Panum-type matches are interpreted differently on the nasal and temporal sides of a binocular surface. This dissociation can only occur because the geometry of occlusion is incorporated into the interpretation via mechanisms that are not yet understood.
Overall we think it likely that our results are attributable to an interaction between binocular matching and the implementation of occlusion geometry. These two forms of constraint are involved to different extents in the depth perceived in experimental stimuli with monocular regions. At one extreme is the phantom rectangle (Gillam & Nakayama, 1999; Grove, Gillam, & Ono, 2002; Kuroki & Nakamizo, 2006; Mitsudo, Nakamizo, & Ono, 2005), which generates quantitative depth based on monocular regions despite a complete absence of disparities in the stimulus. At the other extreme are stimuli initially thought to demonstrate depth from monocular regions where the depth has since been attributed to matched disparate features (Liu, Stevenson, & Schor, 1994, 1997). There is a final category in which the depth is clearly due to an interaction. For example, in monocular gap stereopsis depth is seen at the monocular gap even if there is no disparity at the stimulus edges, but the magnitude and precision of the depth are influenced by the disparity there (Pianta & Gillam, 2003). Although our stimuli are very different from this, we believe they also involve an interaction.
There is another example in the literature that may involve an interaction between matching and occlusion geometry, in which the presence of monocular regions influences the perception of a binocular surface (Tsirlin, Wilcox, & Allison,
2010). In this case the surface was a white rectangle surrounded by a binocular frame of random dot texture. Inside the frame and adjacent to the white rectangle was a square of random dot texture with near disparity. The white rectangle was visible in the monocular half-images and had luminance-defined edges, thus it was not a phantom surface. However, the addition of monocular regions of texture to the display influenced the percept of the white rectangle in depth. When monocular strips of texture were added to the nasal side, the white rectangle appeared in near depth, in front of the binocular frame. When monocular regions were added to the temporal side, the white rectangle appeared behind the binocular frame. The depth of the white rectangle in both cases was quantitative, as measured by a depth-matching task. However, because this depth was also predicted by disparity in the stimulus (either by edge-matching or the size disparity of the white rectangle), one side of the binocular frame was removed in an additional experiment to control for binocular matching. In this experiment the depth was no longer quantitative. This suggests that edge-matching was involved in the quantitative depth perceived in the original display. In a final experiment using only occlusion conditions, quantitative depth was restored in the stimuli missing one edge of the binocular frame when the random-dot square was placed at the same disparity as the binocular frame. In this case the white rectangle appeared slanted, which suggests the involvement of size disparity that was also present. The interaction between matching and occlusion geometry is complex in these stimuli, because in all of the experiments the introduction of monocular regions also created a size disparity in the white rectangle, and a position disparity of the white rectangle relative to the binocular frame. Thus it is difficult to separate the contribution of matching and occlusion geometry to the depth perceived.
Our results also suggest an interaction between occlusion geometry and edge matching. However, our stimuli are very different because the phantom surface is not defined in the monocular half-images, and is only present with binocular viewing and when the monocular region of texture is added to the nasal side. Although the white rectangle is referred to as an “illusory occluder” (Tsirlin, Wilcox, & Allison, 2010), it has luminance-defined edges that are visible in each monocular half-image. Unlike our phantom occluder, the white rectangle is visible in both temporal and nasal conditions in their experiments. The phantom surface in our stimuli is only visible with camouflage geometry. Thus we have demonstrated that a phantom occluding surface with quantitative depth is employed by the visual system to resolve nasal monocular regions when camouflage is prevented, as proposed by Assee and Qian (2007).
There is substantial evidence that edge-matching can influence the depth of intervening texture (McKee & Mitchison, 1988; McKee, Verghese, & Farell, 2004; Mitchison & McKee, 1987a, 1987b). For patterns with ambiguous matches such as sinusoidal gratings, the surface of the texture is perceived at the depth specified by the edge disparities. This is not the case for the stimuli we have investigated, in which the depth is perceived in a phantom contour, instead of in the texture that defines the edges. The presence of the monocular region in this case indicates a depth discontinuity, evidence that the binocular and monocular regions are not part of the same surface. The independence of depth processes for edges and texture could be ecologically useful for dealing with occlusion in natural viewing. Unmatched local regions of texture occur next to occluding surface edges; thus it is adaptive to have an edge-matching process that is independent of the process that assigns depth to the texture.
The contribution of occlusion geometry to the global resolution of monocular regions has been demonstrated in another context involving edge matching. As described earlier, Gillam and Grove (2004) demonstrated that phantom contours can be elicited by ambiguous horizontal disparity. Figure 11 shows a stereogram taken from their paper; in one eye's view the ends of the lines are vertically aligned on both sides, whereas in the other eye the line lengths are truncated along a diagonal on the nasal side. When the images are fused, a slanted phantom occluder is perceived in depth and the lines appear flat. If the phantom occluder were produced purely by edge matching of the vertical edge in one eye and the diagonal edge in the other eye, it should persist when the eyes are reversed. However, when the eyes' views are reversed, no phantom occluder is perceived, and the individual lines appear at different slants as predicted by local horizontal disparity. This dissociation is important; it shows that the phantom occluder involves global resolution of the scene, which must be consistent with occlusion geometry, and thus it is not simply the result of edge matching. Seeing phantom occluders to account for certain patterns of binocular information raises the same issues as for other forms of subjective contours. It is unclear whether they are always present in particular contexts and only revealed when luminance contours are removed, or whether they are only “created” to account for the situation when luminance contours are absent.
A possible neural mechanism for da Vinci stereopsis has been identified in a class of V2 cells, which respond to disparity-defined edges (von der Heydt, Zhou, & Friedman, 2000). These cells are used as the basis of Assee and Qian's (2007) model of da Vinci stereopsis. Some of these cells are also orientation-tuned, or selective for the direction of the depth step. These results are of interest because this class of cells responds to the depth step in random-dot stereograms, in which a monocular region of texture is necessarily adjacent to the depth step. Thus it is unclear whether the cells are responding to the depth discontinuity given by the disparity change, the presence of the monocular region, or both. Gillam and Borsting (1988) suggested that monocular regions help locate disparity discontinuities, whereas most models view monocular regions as a consequence of disparity discontinuities. It would be interesting to know the extent to which monocular regions influence the cells' responses; hopefully this question will be addressed by future physiological research.