Abstract
In natural scenes, most points are binocularly visible, but many are half-occluded. Half-occlusions occur because the left eye can image regions of the scene around the left side of objects that the right eye cannot, and vice versa. Half-occluded points contribute to Da Vinci stereopsis and are useful for segmenting images into different depth planes. However, there is no consensus on how the visual system detects half-occluded points. We examine half-occlusion detection performance for three local cues in natural images. First, we obtained a large database of calibrated stereo images with precisely co-registered laser range data. Next, we developed procedures to i) identify half-occluded points and ii) sample binocularly visible (corresponding) points with arcsec precision. We randomly sampled 100,000 points from the dataset and computed the interocular difference of three image properties in the immediate neighborhood (0.25 deg) of each sampled point: local luminance, contrast, and contrast-contrast (change in contrast). Then, we conditioned these interocular differences on whether they corresponded to half-occluded or binocularly visible points; the likelihood of a half-occlusion increases with interocular difference. To quantify performance, we computed the log-likelihood ratio for each cue, used it as the decision variable, and computed d'. As expected, interocular contrast difference is a strong predictor of half-occlusions (76% correct; d' ≈ 1.4). Perhaps more surprisingly, interocular luminance difference is an equally good predictor (76% correct; d' ≈ 1.4), and interocular contrast-contrast difference is not much worse (71% correct; d' ≈ 1.2). Further, all three statistics are largely independent. We estimated the joint conditional probability distributions of the cues and repeated the procedure above. When all three cues are used, performance climbs to ~84% correct (d' ≈ 2.0).
Thus, three very simple local cues can provide surprisingly good information about whether a given point is half-occluded or binocularly visible. Future work will examine how to optimally combine local cues across space.
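The decision procedure described above (condition a cue on point type, form the log-likelihood ratio, use it as the decision variable, compute d') can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the cue values are synthetic half-normal samples rather than measurements from the stereo-image database, the likelihoods are estimated with smoothed histograms, the two classes are given equal priors, and the function names are hypothetical. The two-cue step sums log-likelihood ratios, which is only valid if the cues are independent; the abstract instead estimates joint conditional distributions.

```python
import numpy as np

def log_likelihood_ratio(cue, occluded, n_bins=50):
    """Histogram estimate of log[p(cue | half-occluded) / p(cue | visible)]."""
    edges = np.linspace(cue.min(), cue.max(), n_bins + 1)
    h_occ, _ = np.histogram(cue[occluded], bins=edges)
    h_vis, _ = np.histogram(cue[~occluded], bins=edges)
    # Add-one smoothing so that empty bins never produce log(0).
    p_occ = (h_occ + 1) / (h_occ.sum() + n_bins)
    p_vis = (h_vis + 1) / (h_vis.sum() + n_bins)
    # Map each sample to its bin; the maximum value falls in the last bin.
    idx = np.clip(np.digitize(cue, edges) - 1, 0, n_bins - 1)
    return np.log(p_occ[idx]) - np.log(p_vis[idx])

def d_prime(dv, occluded):
    """Separation of the decision variable between classes, in pooled-SD units."""
    mean_diff = dv[occluded].mean() - dv[~occluded].mean()
    pooled_sd = np.sqrt(0.5 * (dv[occluded].var() + dv[~occluded].var()))
    return mean_diff / pooled_sd

def percent_correct(dv, occluded):
    """Accuracy when responding 'half-occluded' whenever the LLR exceeds 0."""
    return np.mean((dv > 0) == occluded)

# Synthetic data (assumption): absolute interocular differences that tend to
# be larger at half-occluded points than at binocularly visible points.
rng = np.random.default_rng(0)
n = 20_000
occluded = np.concatenate([np.ones(n, dtype=bool), np.zeros(n, dtype=bool)])
scale = np.where(occluded, 3.0, 1.0)
cue_a = np.abs(rng.normal(0.0, scale))  # stand-in for, e.g., luminance difference
cue_b = np.abs(rng.normal(0.0, scale))  # stand-in for, e.g., contrast difference

llr_a = log_likelihood_ratio(cue_a, occluded)
llr_b = log_likelihood_ratio(cue_b, occluded)
llr_both = llr_a + llr_b  # valid only under the independence assumption

dprime_a = d_prime(llr_a, occluded)
pc_a = percent_correct(llr_a, occluded)
pc_both = percent_correct(llr_both, occluded)

print(f"single cue: d' = {dprime_a:.2f}, proportion correct = {pc_a:.3f}")
print(f"two cues:   proportion correct = {pc_both:.3f}")
```

For simplicity the sketch estimates and evaluates the likelihoods on the same samples; a more careful analysis would hold out separate data for evaluation.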
Meeting abstract presented at VSS 2016