Abstract
A human’s two eyes are spatially separated, so their retinal images of a 3D scene differ slightly from one another. This difference is referred to as binocular disparity, and humans can use it to perceive depth within a scene. Many believe that perceiving depth from binocular disparity requires oculomotor information about the relative orientation of the two eyes. The visual system could obtain such oculomotor information from the efference copy of the oculomotor signal or from the spatial distribution of vertical disparity, that is, the vertical component of binocular disparity. The oculomotor information provided by these two sources is, however, too restricted or too unreliable to explain the reliable depth perception humans have under natural viewing conditions. In this paper, I describe a computational model that can recover depth from a stereo pair of retinal images without making any use of oculomotor information or a priori constraints. This depth recovery model is based entirely on the geometry of the optics of the eyes, which means that I am treating depth recovery as a Direct rather than an Inverse problem, in the sense of Inverse Problem Theory. The input to the model is the stereo pair of retinal images, represented as two sets of visual angles between pairs of points in the individual retinal images. The recovered depth is represented in a head-centered coordinate system. Neither the representation of the retinal images nor that of the recovered depth changes when the eyes rotate. If oculomotor information is needed, it can be recovered after depth has been recovered. The proposed model can explain many psychophysical results better than the conventional formulation, which assumes that the visual system requires oculomotor information to use binocular disparity to perceive depth.
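The claim that a visual-angle representation does not change when the eyes rotate can be checked numerically. The following is a minimal sketch, not the model described in this paper. It assumes each eye is an ideal pinhole that rotates about its own projection center, and every name and value in it (`pairwise_visual_angles`, the scene points, the 64 mm interocular distance) is hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def sub(a, b):
    """Component-wise difference a - b of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rotate_y(v, theta):
    """Rotate v about the vertical (y) axis by theta radians."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * z, y, -s * x + c * z)


def rotate_x(v, phi):
    """Rotate v about the horizontal (x) axis by phi radians."""
    x, y, z = v
    c, s = math.cos(phi), math.sin(phi)
    return (x, c * y - s * z, s * y + c * z)


def visual_angle(u, v):
    """Angle (radians) between two viewing directions u and v."""
    cos_angle = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.acos(max(-1.0, min(1.0, cos_angle)))  # clamp rounding error


def pairwise_visual_angles(points, eye, yaw=0.0, pitch=0.0):
    """Visual angles between all pairs of points as seen by one eye.

    Assumption: the eye rotates about its projection center, so an eye
    rotation only re-expresses each viewing ray in rotated coordinates.
    """
    rays = [rotate_x(rotate_y(sub(p, eye), -yaw), -pitch) for p in points]
    return [visual_angle(a, b) for a, b in combinations(rays, 2)]


# Hypothetical scene points in head-centered coordinates (meters).
points = [(0.1, 0.0, 1.0), (-0.2, 0.1, 1.5), (0.05, -0.15, 0.8)]
left_eye = (-0.032, 0.0, 0.0)
right_eye = (0.032, 0.0, 0.0)

left_straight = pairwise_visual_angles(points, left_eye)
left_rotated = pairwise_visual_angles(
    points, left_eye, yaw=math.radians(10), pitch=math.radians(-5)
)
right_straight = pairwise_visual_angles(points, right_eye)
```

Under these assumptions, `left_straight` and `left_rotated` agree up to floating-point error, while `left_straight` and `right_straight` differ. The difference between the two eyes' sets of visual angles is therefore available as a binocular signal without knowing how either eye is oriented.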