Abstract
The traditional approach to studying depth perception is to measure and model performance with simple stimuli containing just one or two maximally isolated depth cues. While this approach has proven informative, it is incomplete. Visual systems are presumably optimized for recovering depth from complex natural stimuli by simultaneously combining many depth cues. Thus, observers are likely sensitive to the rich and sometimes counter-intuitive correlations between features in natural images. This sensitivity could both facilitate and hinder performance when discriminating simple stimuli that lack this natural structure. These considerations motivated us to pursue a complementary approach in which stimuli begin rich with naturalistic variation and experimental control is achieved through systematically applied constraints. Here we present critical baseline measurements of human depth discrimination in "cue-complete" scenes. Using a robotically positioned laser range finder coupled with a calibrated DSLR camera, we collected stereoscopic natural images precisely co-registered with pixel-wise range data. These images were cropped to be geometrically consistent with viewing from the camera's two positions through a simulated window: an active stereoscopic projection approximately 1.5 m wide and 0.5 m high, located 3 m away, with a 60 mm camera separation. At such large distances, conflicts between vergence and accommodation are unlikely to be substantial, nor are the patterns of defocus likely to differ substantially from natural viewing. Observers judged which of two indicated locations, each falling on an object in the scene, was nearer. Both fixation and response time were unconstrained. Conditions were parametrically and fully sampled along three dimensions: (i) the distance from the observer to the nearer point, (ii) the visual angle separating the two points, and (iii) the difference in distance between the two points. Extensive data collected from one author, who judged thousands of locations in a preliminary set of 10 stereo images, suggest that accuracy across this space can be well described with only three parameters.
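To make the viewing geometry concrete, the following is an illustrative calculation using the display values above (the specific depth difference of 10 cm is chosen here only as an example, not a measured condition). For interocular separation $I$ and a fixated point at distance $d$, a second point at distance $d + \Delta d$ produces a relative binocular disparity of approximately

\[
\delta \;=\; \frac{I}{d} \;-\; \frac{I}{d + \Delta d} \;\approx\; \frac{I\,\Delta d}{d^{2}}.
\]

With $I = 60$ mm and $d = 3$ m, a depth difference of $\Delta d = 10$ cm gives $\delta \approx 0.06 \times 0.1 / 3^{2} \approx 6.7 \times 10^{-4}$ rad, or roughly 2.3 arcmin. The accommodative demand at 3 m is only $1/3 \approx 0.33$ D, which is why conflicts between vergence and accommodation remain small at these viewing distances.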
Meeting abstract presented at VSS 2013