Each of our eyes views a scene from a slightly different position. The resulting binocular disparities enable us to reconstruct the three-dimensional (3D) layout. The processing of disparities is, however, not essential for the 3D reconstruction because we are often able to perceive depth solely on the basis of monocular vision. For example, monocular perspective (including texture, outline, and linear perspective) is a powerful cue for surface slant (
Clark, Smith, & Rabe, 1955;
Cutting & Millard, 1984;
Freeman, 1966;
Stevens, 1981). How much depth do we perceive when viewing a depiction of a slanted plane in which binocular disparity and monocular perspective provide opposite slant information? Recently, we examined this question in a metrical (quantitative) way and found that, for a range of disparity-perspective cue conflicts, observers experience bi-stability when viewing such depictions (
van Ee, Hol, & Erkelens, 2001). Although quite interesting phenomenological aspects of bi-stability in stereoscopically perceived slant were reported in the early days of stereoscopic research, little progress seems to have been made since then, and the metrical aspects have never been investigated systematically.
The literature on perceptual bi-stability is vast. However, almost all demonstrations of bi-stability are essentially monocular, even when they are viewed binocularly.
Figure 1 shows the well-known Necker cube, which is an example of a stimulus that evokes perceptual bi-stability. The literature on bi-stability that requires stereopsis is surprisingly sparse, even though quite a few studies have addressed conflicts between monocularly and binocularly specified depth (see “Discussion”). A survey of the literature reveals interesting findings. First, as far as we know, only two studies have reported that bi-stability occurs in slant perception for extreme disparity-perspective cue conflict situations (
Wheatstone, 1852;
Schriever, 1925). These studies were phenomenological in nature and did not address metrical aspects of perceived slant. Wheatstone, in particular, reported bi-stability for a variety of different 3D stimuli in which perspective and disparity provided opposite depths.
¹ Second, a few studies did examine estimated slant when disparity and perspective provide opposite slant information, but they did not report bi-stability (
Allison & Howard, 2000a;
Allison & Howard, 2000b;
Gillam & Cook, 2001).
In sum, there seem to be no studies in the literature that investigated how much depth is perceived (i.e., the metrical aspects) in stimuli that engender bi-stability. On the phenomenological aspects, however, Wheatstone (
1838,
1852; i.e., over 150 years ago) reported a wealth of information about and insights into bi-stability. Because many of his findings are relevant for our study, we will use them as a central thread through this introduction.
Wheatstone, using the stereoscope that he constructed, was one of the first to study stimuli in which binocular disparities and monocular perspective provided opposite slant information (
Wheatstone, 1838, p. 377): “A very singular effect is produced when the drawing originally intended to be seen by the right eye is placed at the left hand side of the stereoscope, and that designed to be seen by the left eye is placed on its right hand side. A figure of three dimensions, as bold in relief as before, is perceived, but it has a different form.” He called this the “converse figure” (1838) or “conversion of relief” (1852); nowadays, we call it “reverse perspective” (reviewed in
Howard & Rogers, 2002). “Those points which are nearest the observer in the proper figure are the most remote from him in the converse figure” and he continues, “but it is not an exact inversion, for the near parts appear smaller, and the remote parts larger than the same parts before inversion” (
Wheatstone, 1838, p. 377).
² He then explains that in the case of simple line drawings, the reverse perspective figure is “as readily apprehended as the original one, because it is generally a figure of a frequent occurrence.” He also states that the reversals “seem entirely to depend on our mental contemplation of the figure intended to be represented, or of its converse.” In the Bakerian Lecture (
Wheatstone, 1852, p. 14), he is extraordinarily explicit about the occurrence of bi-stability (which he calls “the two ideas in the mind”) in binocular vision
³: “I know of nothing more wonderful, among the phenomena of perception, than the spontaneous successive occurrence of these two different ideas in the mind, while all external circumstances remain precisely the same,” and he goes on to state that an object “becomes converted into another totally dissimilar object uncouth in appearance, and which gives rise to no agreeable emotions in the mind; yet in both cases all the sensations that intervene between object reality and ideal conception continue unchanged.”
Figures 2 and
3 demonstrate the two 3D percepts that observers are able to distinguish when (monocular) perspective and (binocular) disparity specify strongly conflicting slants: one percept in which the grid’s slant is positive (
Figure 3b) and the other in which the slant is negative (
Figure 3c). The two percepts are never present simultaneously.
Most observers with normal stereovision have no difficulty in focusing their attention on either of the two 3D percepts. However, during pilot studies and during presentations at conferences, we have asked at least 60 observers to report their perceptions while viewing ambiguous slant stimuli; as in many other studies in binocular depth perception, we found considerable variability between observers (reviewed in
Howard & Rogers, 2002). Some of the observers were able to perceive both the perspective and the disparity-dominated percept (bi-stability), some observers perceived solely the perspective-dominated percept, and some solely the disparity-dominated percept (see also
Stevens, Lees, & Brookes, 1991, for the same finding in a comparable study for surface curvature).
Roughly 30% of the 60 pilot observers tested were able to perceive both the perspective-dominated and the disparity-dominated percept directly. The other 70% of the observers initially perceived solely the perspective-dominated percept (even if they knew that bi-stability would be possible). Only after they had been told they were looking at a stimulus that they could see in reversed perspective were most of them able to perceive bi-stability. About 10% to 20% of the 60 observers kept seeing solely the perspective-dominated percept even after they had been coached to perceive the disparity-dominated percept. Two observers (very experienced colleagues in stereo vision research, but not the authors) perceived solely the disparity-dominated percept, and they were unable to alternate between the disparity- and the perspective-dominated percept.
Bi-stability in stereoscopic vision is an interesting phenomenon because it creates the rare opportunity of having two states in neural processing that are related to the percepts rather than to the stimulus. To enable future theoretical analyses of how perspective- and disparity-specified slant contribute to bi-stable 3D perception, we collected systematic data on metrical aspects for a broad spectrum of possible combinations of disparity- and perspective-specified slants. We asked observers to view ambiguous stereoscopic images in which disparity and perspective specified different orientations of a grid in 3D space. Grid rotation was about the vertical axis, and we manipulated perspective and disparity independently.
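To make the idea of independent manipulation concrete, the following is a minimal geometric sketch of one common way to build such a cue-conflict stereogram; it is not necessarily the procedure used in this study. A grid is first laid out on a plane with the perspective-specified slant. Each grid point is then slid along its line of sight from a point midway between the eyes onto a plane with the disparity-specified slant, and the result is projected separately for the left and right eye. All function names, the viewing distance, the eye separation, and the grid size are illustrative assumptions.

```python
import math

def grid_on_plane(slant_deg, distance, half_size, n):
    """Grid points on a plane through (0, 0, distance), rotated about the vertical axis.
    Axes: x to the right, y up, z away from the observer (assumed convention)."""
    s = math.radians(slant_deg)
    steps = [-half_size + 2.0 * half_size * i / (n - 1) for i in range(n)]
    return [(u * math.cos(s), v, distance + u * math.sin(s))
            for v in steps for u in steps]

def slide_onto_plane(point, slant_deg, distance):
    """Move a point along its line of sight from the midpoint between the eyes
    (the origin) onto a plane with the given slant. The image seen from that
    midpoint, and hence the perspective, is left unchanged."""
    s = math.radians(slant_deg)
    x, y, z = point
    t = distance * math.cos(s) / (-math.sin(s) * x + math.cos(s) * z)
    return (t * x, t * y, t * z)

def project(point, eye_x, screen_distance):
    """Perspective projection onto a frontoparallel screen for an eye at (eye_x, 0, 0)."""
    x, y, z = point
    scale = screen_distance / z
    return (eye_x + (x - eye_x) * scale, y * scale)

def conflict_stimulus(perspective_slant_deg, disparity_slant_deg,
                      distance=0.57, eye_separation=0.065, half_size=0.10, n=5):
    """Left and right half-images (screen coordinates in metres).
    Default values are hypothetical, chosen only for illustration."""
    outline = grid_on_plane(perspective_slant_deg, distance, half_size, n)
    in_depth = [slide_onto_plane(p, disparity_slant_deg, distance) for p in outline]
    left = [project(p, -eye_separation / 2.0, distance) for p in in_depth]
    right = [project(p, +eye_separation / 2.0, distance) for p in in_depth]
    return left, right

# Opposite slants: perspective specifies -30 deg, disparity specifies +30 deg.
left, right = conflict_stimulus(-30.0, 30.0)
disparities = [xr - xl for (xl, _), (xr, _) in zip(left, right)]
```

In this sketch the horizontal disparity (right-eye minus left-eye screen position) is positive for points that disparity places behind the screen, so with a disparity-specified slant of +30 deg the right half of the grid carries uncrossed disparity even though its perspective is that of a plane slanted the other way. Setting both slants equal reduces the construction to an ordinary stereoscopic projection of a single slanted grid.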