Abstract
There are two sources of depth information in a stereo pair. One is the correlation signal from smooth surface regions that are visible to both eyes, which provides depth information via triangulation. The other is the decorrelation signal near occluding contours, which provides information about the locations and amplitudes of depth discontinuities through the half-occlusions they induce. A variety of perceptual stimuli have convincingly demonstrated the active roles of both signals in stereopsis. In computational vision, there has been less progress in using the decorrelation signal. For example, one common approach is a "left-right consistency check," which uses correlation to estimate two separate depth maps from the left and right viewpoints, and then reasons about half-occlusions based on where these two depth maps disagree. This strategy can succeed in practice, but it breaks down, producing results entirely inconsistent with human perception, when applied to stimuli with limited correlation cues (e.g., those of Nakayama and Shimojo [1990]). We have developed a computational approach that uses decorrelation more effectively. The key ideas are to incorporate local detectors for half-occlusion boundaries within the visual field, and to combine the responses from these detectors with correlation information using a piecewise-smooth representation of disparity. Our half-occlusion boundary detectors are based on the spatial gradient of the correlation signal, and they are inspired by the binocular-monocular receptive fields proposed by Anderson and Nakayama [1994]. Our approach is formulated as energy minimization along 1D epipolar scanlines, using an objective function that can be globally optimized by dynamic programming. We tested the algorithm on a collection of twelve perceptual stimuli that have weak correlation cues. We found that the disparity profiles that minimize our energy match human perception.
We also found that our use of decorrelation cues improves disparity accuracy in half-occluded regions of natural images.
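To make the optimization step concrete, the following is a minimal sketch of scanline dynamic programming for a piecewise-smooth disparity profile. It is not the authors' implementation: the energy here uses a generic per-pixel data cost plus a truncated-linear pairwise term (the truncation allows depth discontinuities at bounded cost), and all names and parameters (`data_cost`, `smooth_weight`, `jump_penalty`) are illustrative assumptions. The abstract's actual objective additionally incorporates half-occlusion boundary detector responses, which are omitted here.

```python
import numpy as np

def scanline_dp(data_cost, smooth_weight=1.0, jump_penalty=3.0):
    """Globally minimize a 1D energy over disparities along one scanline.

    data_cost : (W, D) array; data_cost[x, d] is the cost of assigning
        disparity d at pixel x (e.g., a correlation-based matching cost).
    The pairwise term charges smooth_weight per unit disparity change,
    capped at jump_penalty so depth discontinuities remain affordable
    (a truncated-linear model -- an assumption, not the paper's energy).
    """
    W, D = data_cost.shape
    # Pairwise cost matrix: pair[dp, d] for a transition dp -> d.
    dd = np.abs(np.arange(D)[:, None] - np.arange(D)[None, :])
    pair = np.minimum(smooth_weight * dd, jump_penalty)

    E = np.empty((W, D))          # E[x, d]: best energy ending at (x, d)
    back = np.zeros((W, D), int)  # backpointers for path recovery
    E[0] = data_cost[0]
    for x in range(1, W):
        total = E[x - 1][:, None] + pair   # shape (D_prev, D_cur)
        back[x] = np.argmin(total, axis=0)
        E[x] = data_cost[x] + np.min(total, axis=0)

    # Backtrack the globally optimal disparity profile.
    d = np.empty(W, int)
    d[-1] = np.argmin(E[-1])
    for x in range(W - 2, -1, -1):
        d[x] = back[x + 1, d[x + 1]]
    return d
```

For example, a data cost that favors disparity 0 on the left half of a scanline and disparity 2 on the right half yields a step-shaped profile, since one capped jump is cheaper than violating the data term everywhere.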
Meeting abstract presented at VSS 2018