Abstract
In computer vision, applications that previously required the generation of a 3D model can now be achieved using view-based representations. In the movie industry this makes sense, since both the inputs and outputs of the algorithms are images, but the same argument can also be made for human 3D vision. Here, I explore the implications of view-based models in a series of experiments.
In an immersive virtual environment, observers fail to notice the expansion of a room around them and consequently make gross errors when comparing the size of objects. This result is difficult to explain if the visual system continuously generates a 3D model of the scene using known baseline information from interocular separation or proprioception. If, on the other hand, observers use a view-based representation to guide their actions, they may have an expectation of the images they will receive but be insensitive to the rate at which images arrive as they walk.
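As a minimal sketch of why a metric 3D reconstruction should detect such a manipulation (standard small-angle viewing geometry; the symbols are introduced here for illustration and are not taken from the experiments): for a point fixated at distance $Z$ with interocular separation $I$, the vergence angle and the relative disparity of a point lying a further $\Delta Z$ away are approximately
\[
\theta \approx \frac{I}{Z}, \qquad \delta \approx \frac{I\,\Delta Z}{Z^{2}} .
\]
If the room is scaled by a factor $k$ while $I$ stays fixed, every vergence angle falls by a factor of $k$ and every relative disparity by the same factor, so a system combining these signals with a known baseline $I$ has, in principle, the information needed to register the expansion; the failure to notice it is what motivates the view-based account.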
In the same context, I will discuss psychophysical evidence on sensitivity to depth relief relative to surfaces. The data are compatible with a hierarchical encoding of position and disparity similar to the affine model of Koenderink and van Doorn (1991). Finally, I will describe two experiments showing that changing the observer's task changes performance in a way that is incompatible with the visual system storing a 3D model of the shape or location of objects. Such task-dependence indicates that the visual system maintains information in a more ‘raw’ form than a 3D model.
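For concreteness, one way of stating the ambiguity that such a surface-referenced encoding tolerates (the relief form usually associated with the affine model; the notation is introduced here for illustration rather than taken from the original paper) is that visual directions $(x, y)$ are preserved while depth is recovered only up to
\[
z \;\mapsto\; \alpha + \beta x + \gamma y + \lambda z, \qquad \lambda > 0 ,
\]
that is, up to the addition of an arbitrary plane and an overall scaling of relief. Judgements of depth order and of deviation from a surface survive such a transformation, whereas metric depth does not, which is the sense in which the psychophysical data are compatible with the model.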