Abstract
How do we infer 3D scene layout from a retinal image without using stereo disparity? We created a scene of two bodies with limbs at different angles on a ground plane, along with simpler spatial configurations of rectangular bricks. Images of the scenes were acquired from different camera angles, and each was viewed on a large monitor from 5 viewpoints. Observers rotated vectors on a horizontal touch screen to match 3D limb/brick orientations. The perspective projections of 3D orientations form a trigonometric function, which we inverted to derive the back-projection for perfectly inferring 3D orientations from retinal images. Viewing a real scene through a window was simulated by placing the screen fronto-parallel to the observer's eyes. Observers' 3D orientation judgments corresponded to a shallower version of the perfect back-projection, suggesting heavy reliance on 2D retinal orientations, but with a fronto-parallel bias for oblique 3D orientations. Adding one multiplicative parameter to the mathematical back-projection successfully fit the 3D percepts as a function of 2D retinal orientation (R² = 0.94–0.99 across 10 observers). Analyzing 3D orientation inferences from oblique views of the 2D images, we found that observers used the same rules, but applied the back-projection function in observer-centered coordinates, causing the 3D scene to be perceived as rotated towards the observer. A fixed rotation equal to the viewing angle, applied to the best-fitting function from the fronto-parallel view, successfully fit the average results for oblique viewpoints (R² = 0.98–0.99). The invariance of vertical retinal orientations across viewpoints explained why the same limbs/bricks were perceived as pointing towards the observer in all views. Since observers seem to use the same inferential rules regardless of viewpoint and perceptual veridicality, we suggest that these rules are used for 3D perception of real scenes.
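For concreteness, a minimal sketch of the geometry the abstract describes, under assumptions not stated there: a ground plane viewed from camera elevation φ, with a limb's 3D orientation Ω and its image orientation ψ both measured from the fronto-parallel axis, and the ground plane's depth axis foreshortened by sin φ in the image. The symbols λ (the fitted multiplicative parameter) and δ (the oblique viewing angle) are illustrative notation, not the abstract's own:

\[
\psi = \arctan\!\bigl(\sin\varphi\,\tan\Omega\bigr)
\qquad\Longrightarrow\qquad
\Omega = \arctan\!\left(\frac{\tan\psi}{\sin\varphi}\right)
\]

\[
\hat{\Omega}(\psi) = \arctan\!\left(\frac{\lambda\,\tan\psi}{\sin\varphi}\right),
\qquad
\hat{\Omega}_{\text{oblique}}(\psi) = \hat{\Omega}(\psi) + \delta
\]

On this reading, λ < 1 pulls oblique estimates toward the fronto-parallel axis, giving the shallower-than-veridical function reported, while ψ = 90° maps to Ω̂ = 90° for any λ > 0, consistent with vertical retinal orientations being seen as pointing at the observer from every viewpoint; the added δ expresses the fixed rotation of the percept, in scene coordinates, when the same rule is applied in observer-centered coordinates.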
Meeting abstract presented at VSS 2018