Abstract
A 3D object seen from different views forms quite different retinal images. There exists a single viewpoint from which a photograph forms the same retinal image as the physical 3D scene, but other viewpoints generate distorted retinal images of the scene. Koch et al. (PNAS 2018) showed that humans are almost perfect at estimating 3D poses in real scenes by using the optimal geometrical back-transform from retinal orientation to 3D pose, albeit with a systematic fronto-parallel bias. In oblique views of pictures, the scene is perceived as rigidly rotated to the observer’s viewpoint, consistent with observers using the same observer-centered back-transform as for real scenes. However, oblique views do lead to changes in perceived shapes and sizes. We showed previously (VSS 2019) that size inconstancy is perceived in 3D scenes despite observers using the correct geometric back-transform, if the retinal image evokes a misestimate of viewing elevation. Here we examine 3D size estimation in oblique views of pictures. We presented 4 different oblique views of pictures of 3 sizes of rectangular parallelepipeds lying on the ground in 16 poses each. Six observers adjusted the height of a view-invariant, orthogonally attached narrow cylinder to equate the physical lengths of the two limbs. 3D sizes at fronto-parallel poses were seriously underestimated in oblique views compared to the frontal view. By contrast, there was almost no change for objects perceived as pointing toward or away from the viewer. Observers’ corrections for size, as a function of pose, were modeled with the optimal geometric back-transform, subject to a systematic underestimation of the tilt of the display, which was confirmed by measurements of perceived display tilt. The excellent fit of the model shows that observers use the back-transform from projective geometry but underestimate the tilt of the display, similar to the fronto-parallel bias in object pose perception.
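The pose dependence described above can be illustrated with a minimal sketch. This is not the fitted model from the study: it assumes an orthographic approximation, a display rotated about its vertical axis, and a hypothetical camera elevation of 15 degrees, and it ignores the perceived rigid rotation of the scene. The function name and all parameter values are illustrative assumptions.

```python
import math

def estimated_length_ratio(pose_deg, true_tilt_deg, assumed_tilt_deg,
                           elevation_deg=15.0):
    """Ratio of estimated to true 3D length for a unit segment on the ground.

    pose_deg: 0 = pointing toward/away from the viewer, 90 = fronto-parallel.
    true_tilt_deg: actual rotation of the display about its vertical axis.
    assumed_tilt_deg: the tilt the observer corrects for (hypothetical).
    elevation_deg: camera elevation above the ground (hypothetical value).
    """
    pose = math.radians(pose_deg)
    phi = math.radians(elevation_deg)

    # Image-plane components of a unit ground segment (orthographic assumption).
    horizontal = math.sin(pose)
    vertical = math.cos(pose) * math.sin(phi)

    # Oblique viewing compresses the horizontal component by cos(true tilt).
    retinal_h = horizontal * math.cos(math.radians(true_tilt_deg))

    # The observer undoes the compression using the tilt they assume.
    corrected_h = retinal_h / math.cos(math.radians(assumed_tilt_deg))

    # Compare the corrected image length with the frontal-view image length.
    return math.hypot(corrected_h, vertical) / math.hypot(horizontal, vertical)


# Fronto-parallel pose, 60 degree display tilt, observer assumes only 30 degrees.
print(estimated_length_ratio(90, 60, 30))
# Pose pointing at the viewer: the tilt does not affect the estimate.
print(estimated_length_ratio(0, 60, 30))
```

Under these assumptions, an underestimated tilt leaves fronto-parallel lengths short by the factor cos(true tilt) / cos(assumed tilt), while lengths along the line of sight are unaffected, which matches the qualitative pattern reported above.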