Research Article  |   February 2009
Image-size differences worsen stereopsis independent of eye position
Journal of Vision February 2009, Vol.9, 17. doi:10.1167/9.2.17

      Björn N. S. Vlaskamp, Heather R. Filippini, Martin S. Banks; Image-size differences worsen stereopsis independent of eye position. Journal of Vision 2009;9(2):17. doi: 10.1167/9.2.17.

      © 2015 Association for Research in Vision and Ophthalmology.

      ×
  • Supplements

With the eyes in forward gaze, stereo performance worsens when one eye's image is larger than the other's. Near, eccentric objects naturally create retinal images of different sizes. Does this mean that stereopsis exhibits deficits for such stimuli? Or does the visual system compensate for the predictable image-size differences? To answer this, we measured discrimination of a disparity-defined shape for different relative image sizes. We did so for different gaze directions, some compatible with the image-size difference and some not. Magnifications of 10–15% caused a clear worsening of stereo performance. The worsening was determined only by relative image size and not by eye position. This shows that no neural compensation for image-size differences accompanies eye-position changes, at least prior to disparity estimation. We also found that a local cross-correlation model for disparity estimation performs like humans in the same task, suggesting that the decrease in stereo performance due to image-size differences is a byproduct of the disparity-estimation method. Finally, we looked for compensation in an observer who has constantly different image sizes due to differing eye lengths. She performed best when the presented images were roughly the same size, indicating that she has compensated for the persistent image-size difference.

Introduction
To estimate binocular disparity, the visual system must determine which parts of the two retinal images correspond. There is now substantial evidence that the primary computation in disparity estimation is cross-correlation of the two images. The evidence comes from successful modeling of human vision (Banks, Gepshtein, & Landy, 2004; Cormack, Stevenson, & Schor, 1991; Filippini & Banks, 2009; Fleet, Wagner, & Heeger, 1996; Harris, McKee, & Smallman, 1997) and visual cortex (Cumming & DeAngelis, 2001; Ohzawa, 1998; Ohzawa, DeAngelis, & Freeman, 1990), and from successful applications in computer vision (Kanade & Okutomi, 1994). High correlations, and hence reliable estimates of disparity, are obtained when the two eyes' images are similar, and lower correlations and less reliable disparity estimates when the images are dissimilar. 
The problem of interest in the current paper is how the position of an object relative to the head affects the ability to estimate disparity. For objects in the visual plane (the plane containing the eye centers and the fixation point), the ratio of image sizes for a gaze-normal patch is:  
$$
S = \frac{\sqrt{(X - i/2)^2 + Z^2}}{\sqrt{(X + i/2)^2 + Z^2}} \qquad (1)
$$
where i is inter-pupillary distance, object position is (X, Z), and left- and right-eye positions are (−i/2, 0) and (i/2, 0). Figure 1 plots the relative sizes of the two eyes' images as a function of the head-centric position of an object. The object is a small surface patch that is perpendicular to the line of sight (slant = 0°). The circles are iso-magnification contours representing object positions for which one eye's image is a particular percentage larger than the other eye's image. This figure shows that large relative magnifications occur in natural viewing of near eccentric objects. This creates an interesting problem for disparity estimation via correlation: when the two images have different sizes, the correlation between images is necessarily reduced. Does this mean that stereopsis exhibits deficits for near eccentric stimuli? Or does the visual system have a mechanism that compensates for the predictable image-size differences associated with such viewing?
Figure 1
 
Iso-magnification contours. This is a plan view of the plane containing the eyes and the fixation point. The simulated stimuli are small gaze-normal patches. The contours represent the regions in space for which one eye's image of a patch is a given percentage larger than the other eye's image.
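The size ratio in Equation 1 can be evaluated numerically for any head-centric position. The following is a minimal sketch, assuming (as an interpretation, not something stated explicitly in the text) that the ratio is the right-eye viewing distance divided by the left-eye viewing distance, which equals left-image size over right-image size if image size scales inversely with distance. The function name `size_ratio` and the default inter-pupillary distance of 6 cm are illustrative choices.

```python
import math


def size_ratio(x_cm, z_cm, ipd_cm=6.0):
    """Ratio of eye-to-object distances for a small gaze-normal patch.

    Assumption: the eyes sit at (-ipd/2, 0) and (+ipd/2, 0) and the patch
    at (x_cm, z_cm), all in the visual plane. The value returned is
    d_right / d_left.
    """
    half = ipd_cm / 2.0
    d_left = math.hypot(x_cm + half, z_cm)   # left eye to patch
    d_right = math.hypot(x_cm - half, z_cm)  # right eye to patch
    return d_right / d_left


if __name__ == "__main__":
    # Straight ahead: both eyes are equally far from the patch.
    print(size_ratio(0.0, 39.0))
    # Near and to the right: the right eye is closer, so the ratio drops below 1.
    print(size_ratio(20.0, 30.0))
```

Sweeping `x_cm` and `z_cm` over a grid and contouring the result would give a plan-view map of the kind described in the Figure 1 caption.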
 
Several studies have examined the effects of image-size differences when the eyes are in forward gaze (horizontal version = 0°). As the size difference increases, stereoacuity becomes systematically poorer and the largest disparity supporting a depth percept decreases (Highman, 1977; Jiménez, Ponce, del Barco, Díaz, & Pérez-Ocón, 2002; Lovasik & Szymkiw, 1985). If image size differs too much, stereopsis breaks down altogether (Highman, 1977; Lovasik & Szymkiw, 1985). Large meridional differences in image size (i.e., horizontal or vertical) also disrupt stereopsis (Blakemore, 1970; Burt & Julesz, 1980; Fiorentini & Maffei, 1971; Ogle, 1950). 
Ogle (1950) realized that large size differences occur naturally with objects that are close to the head and positioned at large azimuths. For fixated and gaze-normal surface patches, Equation 1 can be rewritten as: 
$$
S = \frac{\cos(R_l)}{\cos(R_r)} \qquad (2)
$$
where R_l and R_r are the horizontal rotation angles of the left and right eye, respectively. Ogle proposed that when the eyes are fixating eccentrically, the retinal image in the nasally turned eye (the eye receiving the smaller retinal image) is magnified “psychologically” relative to the other eye's image; such magnification would be the reciprocal of the magnification in Equation 2. Ogle made no claim about where in visual processing the neural magnification occurs. We reasoned that if the hypothesized neural magnification occurred before the stage of disparity estimation (i.e., before correlating the two eyes' images), it would reduce the difference in the sizes of the represented images and thereby increase the reliability of disparity estimates. As a consequence, one would predict that the deterioration of stereopsis that accompanies large differences in image size would not occur when the size differences are compatible with eye position. Ogle presented experimental evidence that the hypothesized neural magnification occurs and that the trigger for the magnification is an extra-retinal, eye-position signal. Specifically, dichoptic images of different shapes but the same retinal size appeared to differ in size when the eyes were in eccentric gaze (Ames, Ogle, & Gliddon, 1932; Herzau & Ogle, 1937; Ogle, 1939).1,2 
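The step from Equation 1 to Equation 2 can be sketched as follows. This is a reading under two assumptions that the text does not spell out: the rotation angles of the left and right eye are measured from straight ahead, and the fixated patch lies at (X, Z) with Z > 0. Each eye's line of sight then forms a right triangle with the Z axis, and the distance terms of Equation 1 appear as the hypotenuses.

```latex
\cos R_l = \frac{Z}{\sqrt{(X + i/2)^2 + Z^2}}, \qquad
\cos R_r = \frac{Z}{\sqrt{(X - i/2)^2 + Z^2}}

\frac{\cos R_l}{\cos R_r}
  = \frac{\sqrt{(X - i/2)^2 + Z^2}}{\sqrt{(X + i/2)^2 + Z^2}}
  = S
```

The common factor Z cancels, so under these assumptions the ratio depends only on the two eye-to-object distances.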
Figure 1 shows the size ratios that occur with gaze-normal surfaces at different positions relative to the head. From the figure, we cannot determine how commonplace various size ratios are because the probability of a given ratio depends on the probability of surfaces being presented at different positions relative to the head and on the probability of different surface orientations. We are not aware of measurements of the probability distributions for different head-centric positions, but there is good evidence concerning the most probable surface orientation. A gaze-normal surface (slant = 0°) is the most likely to stimulate a patch of retina (Hillis, Watt, Landy, & Banks, 2004). The reason is the following. The distribution of surface slants in the world is presumably uniform, at least for tilts near 0° (the distribution is non-uniform for tilts near 90° because the ground plane is often visible; Potetz & Lee, 2003). If the distribution in the world is uniform, the probability of observing a particular slant at the retina will be proportional to the cosine of that slant because of the perspective projection of the eye. Said another way, gaze-normal surfaces project to larger retinal images than steeply slanted surfaces. Thus, the most common surface slant at all azimuths is 0°. Accordingly, for surfaces positioned eccentrically relative to the head, the most likely slant yields size ratios different from 1. Therefore, Ogle's hypothesized compensation would be useful in natural eccentric viewing if it minimized the average image-size difference at the two eyes. And this in turn would guarantee that disparity estimation was most precise for the most likely surfaces. 
We asked if the visual system has adopted a compensation mechanism like the one proposed by Ogle to deal with image magnification due to eccentric gaze. Specifically, we examined whether the reliability of disparity estimation is immune to image-size changes that occur naturally with changes in eye position. 
Experiment 1
Methods
Observers
Four observers, ages 22–32 years, participated. All had normal visual acuity and stereoacuity. Three were unaware of the experimental hypotheses; the fourth was author BNV. All were experienced psychophysical observers. 
Apparatus
Observers viewed the stimuli in a haploscope (Backus, Banks, van Ee, & Crowell, 1999); the room was otherwise dark. Images were projected to the eyes from two CRTs, one for each eye (Viewsonic g225f, 2048 × 1536 pixels, pixel size ≈ 1.6 arcmin, refresh rate = 75 Hz). Each CRT was mounted on an arm that rotated about a vertical axis that was co-linear with the vertical rotation axis of the appropriate eye. The distance from the cornea to the mid-point of the CRT was 39 cm, and a line from the eye to the mid-point of each display was perpendicular to the surface of the CRT. The eyes were correctly positioned with respect to the apparatus by adjusting a custom bite bar using a sighting device (Hillis & Banks, 2001). Once correctly positioned, the retinal images were not altered when the haploscope arms were rotated to create the stimulus for different eye positions. 
Stimuli were generated and presented using the Psychtoolbox (Brainard, 1997; Pelli, 1997). To produce small disparities accurately, the stimuli were anti-aliased and the CRTs were spatially calibrated (Backus et al., 1999). Stimuli similar to the ones in the experiment are shown in Figure 2.
Figure 2
 
Stereograms illustrating the effects of magnification and disparity noise. The disparities in each stereogram specify a sinusoidal corrugation in depth (cross fuse or divergently fuse the stimuli to see the corrugation). The stereograms on the left contain only signal dots (coherence = 1) and the ones on the right contain 40% signal dots (coherence = 0.4). The half images in the top row are the same size; those in the middle row differ by 10%; those in the bottom row differ by 20%.
 
Stimuli and procedure
The stimuli were sparse random-dot stereograms. The dots in each eye's stimulus were randomly displaced from a hexagonal lattice (6.4 dots/deg²). Horizontal disparities were then created by shifting the dots horizontally in opposite directions in each eye's stimulus. The disparities specified a sinusoidal corrugation in depth with a spatial frequency at screen center of 0.4 cycles/deg and peak-to-trough amplitude of 20.4 arcmin. The relative phase of the corrugation waveform was randomized. The corrugation was −10 or +10° from horizontal and the observer's task was to identify which of the two orientations appeared after each stimulus presentation. The stimulus area was circular with a diameter of 10°. A fixation cross was always present to help the observer maintain appropriate binocular eye alignment. Dot size was random from 1.6 to 3.3 arcmin to minimize monocular cues to stimulus orientation. 
We fixed the spatial frequency and amplitude of the corrugation waveform so that the waveform's images in the two eyes did not change. To measure the effect of image-size differences, we added disparity noise and determined the maximum amount of noise for which the observer could reliably make the orientation discrimination. The noise dots replaced signal dots and were assigned disparities from a random uniform distribution with the same disparity range as the dots in the corrugation waveform. The total number of dots (signal plus noise dots) was constant. Thus, we measured coherence thresholds where coherence is the number of signal dots divided by the total number of dots. 
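The signal-plus-noise construction described above can be sketched in code. This is a simplified illustration, not the stimulus code used in the experiment: dot positions are drawn uniformly within the circular aperture instead of being jittered from a hexagonal lattice, and the function name `make_dot_disparities` and its parameters are hypothetical. A count of about 500 dots follows from 6.4 dots/deg² over a 10° disk.

```python
import math
import random


def make_dot_disparities(n_dots, coherence, orientation_deg, rng,
                         radius_deg=5.0, freq_cpd=0.4,
                         peak_to_trough_arcmin=20.4):
    """Return a list of (x_deg, y_deg, disparity_arcmin, is_signal) tuples.

    Signal dots carry the disparity of a sinusoidal corrugation whose ridges
    are rotated by orientation_deg from horizontal. Noise dots replace signal
    dots and draw their disparity uniformly from the same disparity range.
    """
    half_amp = peak_to_trough_arcmin / 2.0
    theta = math.radians(orientation_deg)
    phase = rng.uniform(0.0, 2.0 * math.pi)   # randomized relative phase
    n_signal = round(coherence * n_dots)      # total dot count stays fixed
    dots = []
    for k in range(n_dots):
        # Rejection sampling: uniform position inside the circular aperture.
        while True:
            x = rng.uniform(-radius_deg, radius_deg)
            y = rng.uniform(-radius_deg, radius_deg)
            if x * x + y * y <= radius_deg ** 2:
                break
        is_signal = k < n_signal
        if is_signal:
            # Coordinate measured perpendicular to the corrugation ridges.
            u = -x * math.sin(theta) + y * math.cos(theta)
            disparity = half_amp * math.sin(2.0 * math.pi * freq_cpd * u + phase)
        else:
            disparity = rng.uniform(-half_amp, half_amp)
        dots.append((x, y, disparity, is_signal))
    return dots


if __name__ == "__main__":
    stimulus = make_dot_disparities(500, 0.4, +10.0, random.Random(1))
    print(sum(1 for d in stimulus if d[3]), "signal dots of", len(stimulus))
```

Each dot's disparity would then be split between the two half-images by shifting the dot horizontally in opposite directions, as described in the text.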
We created differences in image size by uniformly magnifying one eye's image and minifying the other's. The magnification and minification closely mimicked those that occur with natural gaze-normal surfaces at a cyclopean distance of 39 cm and at various azimuths. The relative magnifications were 0, 5, 10, 12.5, and 15%; for an observer with an inter-pupillary distance of 6 cm, those correspond respectively to azimuths of 0, 14.2, 28.7, 36.3, and 44.5°. Altogether we presented nine relative image magnifications (four simulating leftward azimuths, four simulating rightward, and one simulating straight ahead). Observer KXY was presented additional magnifications of 17.5 and 20%. 
All of the relative magnifications were presented with the eyes in three different positions: horizontal versions of −16°, 0°, and +16° (where positive values are rightward). For each version, the haploscope arms were rotated to the appropriate position so that accurate fixation of the fixation cross put the eyes in the desired position. Two observers (AAA, DMH) could not do the task at a version of +16° because in that situation the haploscope mirrors hit their noses; instead they did the task at +10°. 
After each 600-ms stimulus presentation, observers indicated the orientation of the corrugation with a button press. Because the relative phase of the sinusoidal corrugation was randomized, they had to perceive at least part of the waveform to do the task. Each response led to the presentation of the next stimulus. No feedback was provided. 
Interleaved QUEST procedures (Watson & Pelli, 1983) sought the coherence threshold: the proportion of signal dots for which observers could identify the corrugation orientation correctly 75% of the time. The data were collected with multiple staircases for each combination of image magnification and version. A cumulative Gaussian was fit to the data for each condition (at least 160 trials) using a maximum-likelihood criterion (Wichmann & Hill, 2001). Threshold was defined as the mean of the fitted function. 
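The threshold definition can be illustrated with a small maximum-likelihood fit. The sketch below makes two assumptions that the text does not state: the cumulative Gaussian is scaled to run from chance (0.5) to 1, so that its mean falls at 75% correct, and the fit is done by a coarse grid search. It implements neither QUEST nor the Wichmann and Hill procedure, and all function names are illustrative.

```python
import math


def p_correct(coherence, mu, sigma):
    """Two-alternative psychometric function: 0.75 when coherence == mu."""
    phi = 0.5 * (1.0 + math.erf((coherence - mu) / (sigma * math.sqrt(2.0))))
    return 0.5 + 0.5 * phi


def neg_log_likelihood(data, mu, sigma):
    """data: iterable of (coherence, n_correct, n_trials) tuples."""
    total = 0.0
    for coherence, n_correct, n_trials in data:
        # Clip to keep the logarithms finite.
        p = min(max(p_correct(coherence, mu, sigma), 1e-9), 1.0 - 1e-9)
        total -= (n_correct * math.log(p)
                  + (n_trials - n_correct) * math.log(1.0 - p))
    return total


def fit_threshold(data):
    """Grid-search the (mu, sigma) pair with the highest likelihood."""
    candidates = ((i / 100.0, j / 50.0)
                  for i in range(1, 101)    # mu: 0.01 .. 1.00
                  for j in range(1, 26))    # sigma: 0.02 .. 0.50
    return min(candidates,
               key=lambda ms: neg_log_likelihood(data, ms[0], ms[1]))


if __name__ == "__main__":
    # Hypothetical counts for one condition, not data from the experiment.
    example = [(0.1, 21, 40), (0.2, 22, 40), (0.3, 27, 40),
               (0.4, 33, 40), (0.5, 38, 40), (0.6, 40, 40)]
    mu, sigma = fit_threshold(example)
    print("coherence threshold (mean of fitted function):", mu)
```

A continuous optimizer would normally replace the grid search; the grid is used here only to keep the sketch dependency-free.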
Results
Figure 3 plots stereo coherence threshold (the number of signal dots divided by total dots) as a function of relative image magnification with the eyes in forward gaze (version = 0°). Each panel shows the data from a different observer. Coherence thresholds were lowest when image size was equal in the two eyes and grew as the size difference increased. Thus, as we expected, stereo performance worsened as the difference in the sizes of the two eyes' images increased. For most observers, magnifications greater than 10% caused a noticeable decline.
Figure 3
 
Results and predictions for Experiment 1. Coherence threshold (the proportion of signal dots in the stimulus at discrimination threshold) is plotted as a function of the relative magnification of the two eyes' retinal images. Each panel shows data for one observer with the eyes in forward gaze (version = 0°). The data are represented by the black symbols and lines. Error bars are standard errors of the means calculated by bootstrapping. The predictions of the neural compensation hypothesis are represented by the dashed red (leftward gaze; version = −16°) and blue (rightward gaze; version = +16°) curves. The predicted shifts differ across observers because the shifts depend on the inter-pupillary distance. For observers AAA and DMH, rightward gaze position was set to +10°.
 
If Ogle's neural compensation mechanism exists early in visual processing, changing eye position ought to change the magnification for which stereo performance is best: leftward gaze, for example, should cause a relative “psychological” magnification of the representation of the right-eye's image, so performance should be better for a left-eye magnification greater than 0%. The dashed lines in Figure 3 represent the predictions of this hypothesis (red for leftward, blue for rightward gaze). 
Figure 4 plots coherence threshold as a function of relative image magnification when the eyes were in leftward, forward, and rightward gaze. Clearly, the data were unaffected by changes in eye position. Said another way, threshold was determined by the properties of the retinal images and not by gaze direction. These data are compelling evidence against the presence of a neural compensation mechanism triggered by changes in eye position. It remains possible, of course, that Ogle's hypothesized neural magnification occurs after the stage of disparity estimation and thereby is involved in disparity interpretation.
Figure 4
 
Results of Experiment 1. Coherence threshold is plotted against the relative magnification of the two eyes' images. Each panel shows the data for one observer. The red, black, and blue symbols represent the data with the eyes in leftward (version = −16°), forward (0°), and rightward (+16°) gaze, respectively. Error bars are standard errors. The forward-gaze data are replotted from Figure 3.
 
There were some small but systematic asymmetries in the data. Observers BNV and AAA had lower thresholds when the right-eye's image was larger than the left-eye's and observer DMH exhibited the opposite effect. These asymmetries may have been caused by uncorrected aniseikonia. 
Experiment 2
The results of Experiment 1 are clearly inconsistent with Ogle's hypothesis that changes in eye position (specifically, changes in horizontal version) trigger alterations in the represented sizes of the two-eyes' images before the stage of disparity estimation. However, the technique we used to generate disparities in Experiment 1 creates a combination of horizontal and vertical disparities that cannot occur in natural viewing (Held & Banks, 2008). We were concerned that the compensation mechanism might not have been triggered because the disparities in Experiment 1 were unnatural. To rule out this possibility, we re-ran the experiment with geometrically correct horizontal and vertical disparities. 
Methods
The apparatus, stimuli, and procedure were the same as those in Experiment 1 with a few exceptions. Four observers, ages 27–31 years, participated. They had normal or corrected-to-normal (HRF wore contact lenses) visual acuity and stereoacuity. Two were unaware of the experimental hypotheses and two were authors (HRF, BNV). BNV also participated in Experiment 1; the other three did not. The apparatus was the same as in Experiment 1 except for the displays. The new CRTs were Clinton monochrome displays (1280 × 1024 pixels, pixel size ≈ 2.5 arcmin, refresh rate = 75 Hz). 
The most significant difference was that the stereograms were created by using perspective projection. We constructed a virtual sinusoidal corrugation in space, textured it with random dots, and projected the dots to the two eyes. To minimize monocular artifacts, the dot density in virtual space was adjusted for the changing depth of the sinusoid; thus the monocular dot density was nearly constant. This created a stereogram with vertical disparities that occur in natural viewing for the various combinations of simulated distance and azimuth. Dot size was random from 2.5 to 5.0 arcmin (calculated at screen center). Image-size differences were created by changing the azimuth of the virtual surface relative to the head before the perspective projection. The surface remained gaze-normal at 39 cm from the cyclopean eye. At non-zero azimuths, the virtual surface was closer to one eye than the other, so the sizes of the images on the two CRTs differed. The image-size differences and version angles were the same as those in Experiment 1. With observer HGM we could not rotate the haploscope arms to −16 and +16° because his nose hit the mirrors; instead we presented versions of −10 and +10°. 
The method of generating disparities in Experiment 2 caused small variations in dot density in the half-images that might have allowed observers to identify stimulus orientation monocularly. We conducted a control experiment to check this possibility. We presented half-images to the left eyes of observers BNV, MCV, and HGM. The images were those associated with 0% relative magnification and the smaller image for 15% relative magnification from Experiment 2. The latter was chosen because dot densities were highest in that image and therefore most likely to lead to monocular artifacts. The images contained 100% signal dots. Binomial tests showed that none of the observers could reliably discriminate corrugation orientation in any of these monocular conditions (100 trials per condition, p > 0.37 in all conditions and all subjects), which indicates that stereopsis was required to perform the task. 
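A binomial test of this kind can be sketched as follows. The text does not say whether the tests were one- or two-sided, so the sketch assumes a one-sided test against chance (50% correct in a two-alternative task). The function name and the example count are hypothetical.

```python
import math


def binomial_p_at_least(n_correct, n_trials, p_chance=0.5):
    """Exact probability of n_correct or more successes under chance."""
    return sum(
        math.comb(n_trials, k)
        * p_chance ** k
        * (1.0 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical observer with 52 correct responses out of 100 trials.
    print(binomial_p_at_least(52, 100))
```

A large p-value means that the observed proportion correct is compatible with guessing, which is the outcome reported for the monocular control conditions.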
Results and discussion
Figure 5 shows the results of Experiment 2. Red, black, and blue represent the data for leftward, forward, and rightward gaze, respectively. The results are essentially identical to those from Experiment 1: there was again no measurable effect of eye position; rather the worsening of coherence threshold was determined solely by the differences in image size at the two eyes. Thus, we again observed no behavior consistent with eye-position changes triggering neural magnification prior to the stage of disparity estimation.
Figure 5
 
Results of Experiment 2. Coherence threshold is plotted against the relative magnification of the two eyes' images. Each panel shows the data for one observer. The red, black, and blue symbols represent the data with the eyes in leftward (−16°), forward (0°), and rightward (+16°) gaze, respectively. Error bars are standard errors.
 
Implications for everyday vision
As the ratio of image sizes at the two eyes deviated from 1, we found that stereo performance worsened, particularly for size differences of 15% or more (Figures 4 and 5). This effect was determined only by the ratio of retinal-image sizes and not by eye position. What are the implications for everyday vision? 
In Experiments 1 and 2, we measured stereo coherence thresholds as a function of the image-size ratio. To our knowledge, the quantitative relationship between coherence threshold and standard measures of stereo sensitivity, such as stereo acuity, has not been determined. However, Highman (1977) showed that the logarithm of the stereo acuity threshold (i.e., the smallest discriminable disparity) is roughly proportional to the image-size ratio. We can infer, therefore, that there would be a similar relationship between coherence threshold and stereo acuity: higher coherence thresholds being associated with poorer stereo acuities. 
Figure 6 shows the coherence thresholds measured in Experiment 1 for each naturally occurring image-size ratio plotted as a function of the head-centric position of gaze-normal surface patches. The stimuli used to measure these coherence thresholds were presented foveally, so the figure shows how threshold would vary as a function of head-centric position for fixated stimuli. As you can see, stereo coherence threshold hovers between 0.3 and 0.4 for positions that are straight ahead or more than 40 cm from the observer. In these cases, one would expect little degradation in stereo performance in natural viewing situations. Threshold is noticeably higher for nearer and more eccentric object positions, so one would expect some performance degradation in those cases. We hasten to point out, however, that observers rarely maintain fixation on very eccentric targets; rather, they shift fixation to such targets by a combination of eye and head movements (Gresty, 1974), and gaze eccentricity relative to the head is hence usually within 20° of straight ahead (Stahl, 1999). It is thus fairly unlikely that normal observers would adopt and hold eye positions that are associated with noticeable diminutions in stereo performance. Indeed the potential degradation in stereo vision could be one of the reasons that gaze is rarely maintained in eccentric positions relative to the head. We conclude that the limits imposed by estimating disparity via local cross-correlation are minimal for normal observers.
Figure 6
 
Plan view of coherence thresholds for naturally occurring image-size ratios. As in Figure 1, the simulated stimuli are small gaze-normal patches. From Figure 1, we know the image-size ratio for each head-centric position. From those ratios, we used the stereo coherence threshold data in Figure 4, averaged across observers, to plot coherence threshold for each object position. The thresholds are indicated by the colors.
 
Modeling
In Experiments 1 and 2, we found that eye position does not affect the relationship between image-size differences and stereo performance. We next investigated the cause of the drop-off in performance as size differences increased. Our hypothesis is that the drop-off is a consequence of using local cross-correlation to estimate disparity, so we presented the same stimuli and task to such a model and then compared the model's performance to human performance. 
Methods
Stimuli and task
The stimuli were the same random-dot stereograms specifying sinusoidal corrugations as in Experiment 1, and the task was again to determine whether the corrugation orientation was −10 or +10°. To make the inputs to the model similar to those in humans, we blurred the half-images according to the optics of the well-focused human eye (Geisler & Davila, 1985) before the images were presented to the cross-correlator. See Filippini and Banks (2009) for further details. The model does not incorporate eye-position signals: its only inputs are the post-optics retinal images. 
Cross-correlator
The cross-correlation was computed for windowed patches of the two half-images:  
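$$
c(\delta_x, \delta_y) = \frac{\sum_{(x,y) \in W_L} \left[ \left( L(x,y) - \mu_L \right) \left( R(x - \delta_x, y - \delta_y) - \mu_R \right) \right]}{\sqrt{\sum_{(x,y) \in W_L} \left( L(x,y) - \mu_L \right)^2} \, \sqrt{\sum_{(x,y) \in W_R} \left( R(x - \delta_x, y - \delta_y) - \mu_R \right)^2}} \qquad (3)
$$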
where L(x, y) and R(x, y) are the image intensities in the left and right half-images, W_L and W_R are the windows applied to the half-images (2D isotropic Gaussians), and μ_L and μ_R are the mean intensities within the two windowed images. Because uniform magnification alters both horizontal and vertical disparities, we needed to estimate disparities in two dimensions. Thus, δ_x is the horizontal displacement of W_R relative to W_L, and δ_y is the vertical displacement (where displacement is disparity). 
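A normalized local cross-correlation of this general form can be sketched in a few lines. The sketch is a simplification and not the model's implementation: it uses a square, unweighted window in place of the 2D Gaussian windows, it reads the displacement as pairing L(x, y) with R(x − δ_x, y − δ_y), and the function name `local_ncc` and its arguments are hypothetical. Images are plain nested lists indexed as `image[y][x]`.

```python
import math
import random


def local_ncc(left, right, cx, cy, half, dx, dy):
    """Normalized correlation between a left-image patch centred at (cx, cy)
    and the right-image patch displaced by (dx, dy).

    Assumption: a square window of side 2*half + 1 with equal weights.
    """
    l_vals, r_vals = [], []
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            l_vals.append(left[y][x])
            r_vals.append(right[y - dy][x - dx])
    mu_l = sum(l_vals) / len(l_vals)
    mu_r = sum(r_vals) / len(r_vals)
    num = sum((l - mu_l) * (r - mu_r) for l, r in zip(l_vals, r_vals))
    den = math.sqrt(sum((l - mu_l) ** 2 for l in l_vals)
                    * sum((r - mu_r) ** 2 for r in r_vals))
    return num / den if den > 0 else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    size = 20
    left = [[rng.random() for _ in range(size)] for _ in range(size)]
    # Right image is the left image displaced horizontally by 2 pixels.
    right = [[left[y][(x + 2) % size] for x in range(size)] for y in range(size)]
    scores = {dx: local_ncc(left, right, 10, 10, 3, dx, 0) for dx in range(-4, 5)}
    print(max(scores, key=scores.get))  # displacement with the highest correlation
```

Searching over `dy` as well as `dx` gives the two-dimensional disparity estimate that the text says is needed when one image is uniformly magnified.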
To estimate disparities, we shifted W_L along two straight trajectories, −10 and +10° from horizontal: trajectories that are parallel to one of the two possible stimulus orientations. For each position of W_L, we computed the correlation between the left- and right-eye samples for different positions of W_R. To minimize computation time, the trajectory of W_L was restricted to one line for −10° and another line for +10°; this simplification did not affect the pattern of results, but it did increase coherence threshold uniformly. 
W_R was shifted both horizontally and vertically with respect to W_L. Figure 7 shows example half-images and the associated cross-correlation output. The x-axis represents the position of W_L along its trajectory. The y- and z-axes represent, respectively, the horizontal and vertical displacement of W_R relative to W_L (corresponding to horizontal and vertical disparities).
Figure 7
 
Movie depicting output of local cross-correlator. The x-axis of each frame is the position of the sample of the left-eye's image, windowed by W_L, along the trajectory through that image. The y-axis of each frame is the horizontal offset between W_L and W_R, which corresponds to the horizontal disparity. The z-axis, which is represented over time, is the vertical offset between W_L and W_R. Color represents the correlation, red indicating a correlation of 1. The stimulus presented to the cross-correlator is a random-dot stereogram depicting a sinusoidal corrugation; the left-eye's half image is magnified by 5%. One can see the disparity estimates corresponding to the corrugation as the correlator moves from one vertical offset to the next. The correlations are highest when the vertical offset is near 0.
 
Decision rule
A decision rule was needed for the model to perform the same task as the humans. As in Filippini and Banks (2009), we used template matching. We constructed templates of the post-optics stimuli for both corrugation orientations; the templates had the appropriate spatial frequency and amplitude for the corrugation waveform. We then cross-correlated the templates with the output of the cross-correlator; on each trial, the model chose the template yielding the higher correlation. From these responses, we generated psychometric functions: percent correct vs. coherence. Coherence thresholds were then calculated from maximum-likelihood fits of cumulative Gaussians to the psychometric data (Wichmann & Hill, 2001).
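The template-matching step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the cross-correlator output has already been reduced to a two-dimensional disparity map, and the function names, map size, corrugation frequency, orientations (±20°), noise level, and use of a Pearson correlation are all hypothetical choices made for illustration.

```python
import numpy as np

def corrugation_template(shape, freq_cpd, amp, orient_deg, pix_per_deg):
    """Sinusoidal disparity corrugation; all parameter values are illustrative."""
    rows, cols = shape
    y, x = np.mgrid[0:rows, 0:cols] / pix_per_deg  # pixel positions in deg
    theta = np.deg2rad(orient_deg)
    # Disparity is modulated along the axis perpendicular to the corrugation.
    u = -x * np.sin(theta) + y * np.cos(theta)
    return amp * np.sin(2 * np.pi * freq_cpd * u)

def pearson(a, b):
    """Pearson correlation between two arrays of equal shape."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def choose_orientation(disparity_map, templates):
    """Return the key of the template that correlates best with the map."""
    scores = {key: pearson(disparity_map, t) for key, t in templates.items()}
    return max(scores, key=scores.get)

# Hypothetical trial: a +20 deg corrugation buried in Gaussian disparity noise.
rng = np.random.default_rng(0)
shape, ppd = (64, 64), 16.0
templates = {o: corrugation_template(shape, 1.0, 1.0, o, ppd)
             for o in (-20.0, 20.0)}
noisy_map = templates[20.0] + rng.normal(0.0, 1.0, shape)
choice = choose_orientation(noisy_map, templates)
print("model response:", choice)
```

Repeating such trials at several coherence levels and tallying the proportion of correct choices would yield the kind of psychometric data to which a cumulative Gaussian can be fit.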
Results
Figure 8 plots coherence threshold against relative image magnification. As the magnitude of relative magnification increased in either direction, threshold rose, creating a bowl-shaped function centered at 0%. Thus, the cross-correlation model exhibits the same general behavior as humans did in Experiment 1.
Figure 8
 
Coherence threshold of the cross-correlation model as a function of relative image magnification in the two eyes' images. Coherence threshold is the proportion of signal dots in the stimulus at discrimination threshold. Magnification is the size difference between the two eyes' images expressed as a percentage. To the left of 0%, the left-eye's image is larger. The extent of the vertical search was constant at 3.6 arcmin. Red, blue, and green represent window sizes (the standard deviations of W L and W R) of 6, 18, and 30 arcmin, respectively. Error bars represent standard errors.
 
We investigated how the choice of window size affected the modeling results because window size is an important parameter of local cross-correlation models (Banks et al., 2004; Filippini & Banks, 2009; Harris et al., 1997; Kanade & Okutomi, 1994). Larger windows contain more luminance information, so they allow better disparity estimation when disparity varies smoothly across position; larger windows, however, reduce the ability to detect spatially fine variations (Banks et al., 2004; Filippini & Banks, 2009). Thus, there is frequently an optimal window size for the disparity estimation task at hand. We varied the standard deviation of W L and W R from 6 to 30 arcmin to investigate how window size affects behavior in the task of Experiment 1. We chose 6 arcmin because there is evidence that this size corresponds to the smallest used by the human visual system (Filippini & Banks, 2009; Harris et al., 1997). We chose 30 arcmin because that size is still small enough not to encroach on the Nyquist sampling limit given the spatial frequency of the corrugation waveform. Figure 8 shows the results. Window size had little, if any, effect on how relative magnification affected disparity estimation in this task.
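To make the role of the window concrete, the following Python sketch computes a Gaussian-weighted normalized correlation on a single row of random dots and searches over horizontal offsets. It is a simplified one-dimensional illustration, not the model used here: the function names are hypothetical, one pixel is assumed to equal one arcmin, and the shift is applied with wrap-around for convenience.

```python
import numpy as np

def windowed_correlation(left, right, center, sigma):
    """Gaussian-weighted normalized correlation of two 1-D luminance samples."""
    x = np.arange(left.size)
    w = np.exp(-0.5 * ((x - center) / sigma) ** 2)
    w /= w.sum()
    l = left - np.sum(w * left)    # remove the weighted mean of each sample
    r = right - np.sum(w * right)
    cov = np.sum(w * l * r)
    return float(cov / np.sqrt(np.sum(w * l * l) * np.sum(w * r * r)))

def estimate_shift(left, right, center, sigma, max_shift=8):
    """Return the horizontal offset (pixels) giving the highest correlation."""
    shifts = list(range(-max_shift, max_shift + 1))
    scores = [windowed_correlation(left, np.roll(right, -s), center, sigma)
              for s in shifts]
    return shifts[int(np.argmax(scores))]

# Hypothetical stimulus: one row of random dots, right eye shifted by 3 pixels.
rng = np.random.default_rng(1)
left = rng.choice([0.0, 1.0], size=400)
true_shift = 3
right = np.roll(left, true_shift)

for sigma in (6.0, 18.0, 30.0):  # window standard deviations, in pixels
    print(sigma, estimate_shift(left, right, center=200, sigma=sigma))
```

With a pure horizontal shift and no noise, every window size recovers the shift; differences among window sizes emerge only when disparity varies within the window or when the signal is degraded.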
Uniform magnification alters both horizontal and vertical disparities, so to find the highest correlation between the two images, we had to vary the horizontal and vertical position of the right-eye's sample relative to the left-eye's sample. The maximum disparity for binocular fusion is much smaller for vertical than for horizontal disparities: for horizontal disparity, the fusion limit is 10–20 arcmin at the fovea; for vertical disparity, the limit is only 2–4 arcmin (Duwaer & van den Brink, 1981). Thus, to constrain the model in the way human vision is constrained, we restricted the extent of vertical search to 3.6 arcmin. 
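A short worked calculation indicates why the limited vertical search matters. It assumes, for illustration only, that magnification is uniform about the fixation point, so that a point at vertical eccentricity e in one eye's image is displaced vertically by e times the fractional magnification in the other eye's image, and that the 3.6-arcmin search extent is the largest vertical offset the model can test. The function name is hypothetical.

```python
def eccentricity_limit_arcmin(search_extent_arcmin, magnification_pct):
    """Vertical eccentricity beyond which the magnification-induced
    vertical disparity exceeds the vertical search extent."""
    return search_extent_arcmin / (magnification_pct / 100.0)

# Illustrative magnifications; 3.6 arcmin is the search extent used in the model.
for mag in (5.0, 10.0, 15.0):
    limit = eccentricity_limit_arcmin(3.6, mag)
    print(f"{mag:.0f}% magnification: matches lost beyond about {limit:.0f} arcmin")
```

Under these assumptions, the region over which correct matches fall inside the search range shrinks in inverse proportion to the magnification.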
Figure 9 shows the average human thresholds from Experiment 1 along with the model's thresholds when its parameters were set to the most reasonable values (a window standard deviation of 18 arcmin and a vertical search extent of 3.6 arcmin). As the difference in image size grew, coherence threshold rose in similar fashion for humans and the model. The model's thresholds were generally higher than human thresholds because the model was restricted to searching along one trajectory through the left-eye's image; when we relaxed this restriction, the model's thresholds became more similar to human thresholds, but computation time increased significantly. Because the local cross-correlator and humans exhibit similar effects of relative image magnification, we conclude that the degradation of human performance when the two eyes' images differ in size is a byproduct of using correlation to estimate disparity.
Figure 9
 
Comparison of model and human performance. Coherence threshold is plotted as a function of relative magnification. Filled circles and solid lines represent the average human data for forward gaze (version = 0°) in Experiment 1. Unfilled squares and dashed lines represent model data; the standard deviation of the window was 18 arcmin and the extent of the vertical search was 3.6 arcmin. Error bars represent standard errors.
 
Experiment 3
Roughly one in five people in the industrialized world have different refractive errors in the two eyes, a condition called anisometropia (Borish, 1970). Most of these people have differences in the sizes of the images at their two retinas even when the distances from the left and right eyes to the object are the same (Bradley, Rabin, & Freeman, 1983; Highman, 1977). We showed in Experiments 1 and 2 that normal observers do not adjust for changes in image size caused by eccentric and near gaze. Figure 10 shows, in the same format as Figure 6, how stereo coherence threshold would vary as a function of stimulus position for people with 10% and 20% optical aniseikonia if they were affected by differences in retinal-image size in the same fashion as our normal observers. Red indicates the positions for which one would expect a degradation in stereopsis. As the figure shows, a 10% or 20% difference in image size (which is commonplace in people with more than 2 diopters of anisometropia) would be expected to cause noticeable degradations in stereo performance for most object positions in front of the head. Do such people indeed suffer losses in stereo vision for natural viewing positions? Or have they adapted to their optical aniseikonia in a way that minimizes losses in stereo vision?
Figure 10
 
Plan view of coherence thresholds for naturally occurring size ratios in people with aniseikonia. The upper and lower panels show the calculations for 10% and 20% aniseikonia (larger image in right eye), respectively. The colors indicate the stereo coherence thresholds that would be expected if no adaptation to aniseikonia occurred.
 
We attempted to answer those questions in Experiment 3. We tested a person with significant optical aniseikonia to determine whether she had adapted. We did so by measuring coherence thresholds as a function of the relative sizes of the images presented to the eyes. The measurements were made with the eyes in forward gaze. If the observer had compensated for her long-standing optical aniseikonia, we would expect stereo performance to be best when the retinal images were the same size. If she had not compensated, we would expect the best performance when the image-size difference was appropriate to nullify the optical aniseikonia.
Methods
A 24-year-old observer participated. She had spherical equivalent refractive errors of −3.50 D and −6.75 D in the left and right eyes, respectively. She was first diagnosed with anisometropia at age 9 years; the magnitude of anisometropia increased steadily up to age 20 years and has since remained constant. Axial eye lengths (anterior surface of the cornea to the retina) as well as corneal curvatures were measured by optical biometry (Zeiss IOLMaster). Her right eye was significantly longer than her left eye: Axial lengths were 25.77 and 26.85 mm in the left and right eyes, respectively. Corneal curvatures were similar in the two eyes and consistent with a small astigmatism in the right eye: 7.80 (K1) and 7.74 mm (K2) in the right eye and 7.91 (K1) and 7.60 mm (K2) in the left eye. The similarity of the corneal curvatures and the difference in axial lengths are consistent with the magnitude and sign of the anisometropia, so we conclude that the observer is an axial anisometrope (that is, the difference in refractive error is caused solely by the difference in eye lengths). The observer reported that she wears her spectacles most of the time.
We calculated retinal-image sizes in mm using the reduced eye model. With no spectacle or contact-lens correction, the retinal images in the right eye are 5.40% larger than in the left eye. With her spectacle correction in place, we calculated a relative magnification in the right eye of 1.37%; the decrease in optical aniseikonia due to the spectacles is a byproduct of Knapp's Law (Knapp, 1869). With her contact lenses in place, the right eye's relative magnification is only slightly smaller than with no correction: 4.61%. The observer wore her contact-lens correction during the experiment so that she could see the stimuli sharply in both eyes. We took the magnifications caused by the contacts into account in interpreting the results. 
The apparatus, stimuli, and procedure were identical to those of Experiment 1 with two exceptions. First, the experimental measurements were made in forward gaze only (version = 0 deg). Second, we presented the image magnifications in smaller steps in the hope of determining precisely the image-size ratio for which coherence threshold was lowest. 
Results
Figure 11 plots coherence threshold as a function of the ratio of image sizes presented to the eyes. We fit the data with:  
$$t = a + b\,(S - m) + c\,(S - m)^{2} \qquad (4)$$
where t is coherence threshold, S is percentage of image magnification (negative percentages represent larger images in the left eye than in the right eye), and a, b, c, and m are fitting parameters. To do the fitting, we weighted each data point by the inverse of its squared standard error and used a least-squares criterion on those weighted values. We were most interested in the value of m for the best-fitting function because it represents the magnification for which coherence threshold was lowest. Taking into account the fact that she wore her contact lenses while being tested, the expected m would be 0.79% if she were adapted to her aniseikonia when she is not wearing any optical correction (purple dashed line in the figure), 0% if she were adapted when she is wearing her contact lenses (blue dashed line), −3.24% if she were adapted to her spectacles (red dashed line), and −4.61% if she were not adapted at all (green dashed line). The value of m for the best-fitting function was −0.30% (gray line). With bootstrapping, we determined that the 95% confidence interval for m is −1.37% to 0.94%. The predictions for adaptation to contacts and to no optical correction fall within the confidence interval, while the predictions for no adaptation and for adaptation to her spectacles do not. Thus, the data show that she has adapted to her optical aniseikonia, but we cannot determine whether she has adapted to no optical correction or to her contact-lens correction.
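The predicted values of m follow from subtracting the magnification present during testing (contact lenses) from the magnification to which the observer is hypothetically adapted. The Python sketch below reproduces that arithmetic from the magnifications reported in the Methods; the variable and function names are hypothetical, and it assumes that magnifications this small can be combined by subtracting percentages.

```python
# Right-eye relative magnifications (%) reported in the Methods.
MAG_NONE, MAG_SPECTACLES, MAG_CONTACTS = 5.40, 1.37, 4.61

def predicted_m(adapted_to_pct, worn_pct=MAG_CONTACTS):
    """Presented magnification (%) at which threshold should be lowest,
    given the correction worn during testing."""
    return round(adapted_to_pct - worn_pct, 2)

predictions = {
    "no correction": predicted_m(MAG_NONE),
    "contacts": predicted_m(MAG_CONTACTS),
    "spectacles": predicted_m(MAG_SPECTACLES),
    "not adapted": predicted_m(0.0),
}

# 95% confidence interval for m reported in the Results.
ci_low, ci_high = -1.37, 0.94
inside = {name: ci_low <= m <= ci_high for name, m in predictions.items()}

for name, m in predictions.items():
    print(f"{name}: m = {m:+.2f}%, inside CI: {inside[name]}")
```

Only the predictions for adaptation to no correction and to contact lenses fall inside the interval, which matches the conclusion drawn in the text.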
Figure 11
 
Results and predictions for Experiment 3. Coherence threshold (the proportion of signal dots in the stimulus at discrimination threshold) is plotted as a function of the relative magnification of the two eyes' retinal images. The observer is an anisometrope and was tested with the eyes in forward gaze (version = 0°). The data are represented by the black symbols and lines. Error bars are standard errors of the means calculated by bootstrapping. We determined the relative magnification that resulted in the best stereo performance by fitting the data with Equation 4. This magnification is represented by the gray solid line. The dashed lines indicate the magnifications for which stereo performance would be best if the observer were completely adapted to image-size differences with glasses (red dashed line), contact lenses (blue line), or no optical correction (purple line). If she were not adapted at all, performance would be best at the green dashed line.
 
There are two plausible mechanisms by which people with long-standing optical aniseikonia may adapt such that they perceive no difference in image size and such that they achieve best stereo performance when the retinal images are the same size. First, they may adapt by neurally adjusting the relative sizes of the represented retinal images (perhaps by the “psychological” adjustment suggested by Ogle (1950) for eccentric gaze). Second, adaptation might be a byproduct of retinal stretching that accompanies an increase in axial length (Chui, Song, & Burns, 2008). If the retina stretched by an amount commensurate with the length increase, the number of photoreceptors stimulated by an object in the world would be constant in the two eyes despite differences in image size expressed in linear units (Bradley et al., 1983; Winn et al., 1988). Our results do not allow us to distinguish between these two hypotheses; they only reveal that an observer with significant optical aniseikonia has adapted for the purpose of estimating disparity and performing a stereoscopic task. 
Conclusion
When one fixates near, eccentric objects, the retinal images in the two eyes differ in size. We examined the consequences of such size differences for stereo performance. We found that stereo vision is best when the sizes of the images presented to the eyes are equal or nearly equal, and that remains true even when the eyes are in eccentric gaze where a size difference would naturally occur. A model of disparity estimation based on correlating the two eyes' images exhibits the same behavior. We also tested an observer who has persistently different image sizes due to differing refractive errors in the two eyes. The observer's stereo vision was best when the images presented to the eyes were the same size, which means that complete compensation has occurred for the long-standing difference in the retinal images. In summary, stereopsis is best for stimuli that are straight ahead of the viewer and therefore have equal or nearly equal sizes as they approach the eyes. 
Acknowledgments
The authors thank Chris Burns for software assistance, David Hoffman for software assistance and helpful discussion, Ralph Bartholomew and Nick Lines for apparatus construction, Rob Meyerson for help with the data collection, Ethan Rossi for measuring the axial eye length, and Tracy Huang for being an observer. This research was funded by NIH research grant EY-R01-08266 to MSB and by NWO Rubicon fellowship 446-06-021 to BV. 
Commercial relationships: none. 
Corresponding author: Martin Banks. 
Email: martybanks@berkeley.edu. 
Address: 360 Minor Hall, Berkeley, CA 94720-2020, USA. 
References
Ames, A., Ogle, K. N., & Gliddon, G. H. (1932). Corresponding retinal points, the horopter and size and shape of ocular images II. Journal of the Optical Society of America A, 22, 575–631.
Backus, B. T., Banks, M. S., van Ee, R., & Crowell, J. A. (1999). Horizontal and vertical disparity, eye position, and stereoscopic slant perception. Vision Research, 39, 1143–1170.
Banks, M. S., Gepshtein, S., & Landy, M. S. (2004). Why is spatial stereoresolution so low? Journal of Neuroscience, 24, 2077–2089.
Blakemore, C. (1970). A new kind of stereoscopic vision. Vision Research, 10, 1181–1199.
Borish, I. M. (1970). Clinical refraction. New York: The Professional Press.
Bradley, A., Rabin, J., & Freeman, R. D. (1983). Non-optical determinants of aniseikonia. Investigative Ophthalmology & Visual Science, 24, 507–512.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436.
Burt, P., & Julesz, B. (1980). A disparity gradient limit for binocular fusion. Science, 208, 615–617.
Chui, T. Y., Song, H., & Burns, S. A. (2008). Individual variations in human cone photoreceptor packing density: Variations with refractive error. Investigative Ophthalmology & Visual Science, 49, 4679–4687.
Cormack, L. K., Stevenson, S. B., & Schor, C. M. (1991). Interocular correlation, luminance contrast and cyclopean processing. Vision Research, 31, 2195–2207.
Cumming, B. G., & DeAngelis, G. C. (2001). The physiology of stereopsis. Annual Review of Neuroscience, 24, 203–238.
Duwaer, A. L., & van den Brink, G. (1981). What is the diplopia threshold? Perception & Psychophysics, 29, 295–309.
Filippini, H. R., & Banks, M. S. (2009). Limits of stereopsis explained by local cross-correlation. Journal of Vision, 9(1):8, 1–18, http://journalofvision.org/9/1/8/, doi:10.1167/9.1.8.
Fiorentini, A., & Maffei, L. (1971). Binocular depth perception without geometrical cues. Vision Research, 11, 1299–1305.
Fleet, D. J., Wagner, H., & Heeger, D. J. (1996). Neural encoding of binocular disparity: Energy models, position shifts and phase shifts. Vision Research, 36, 1839–1857.
Geisler, W. S., & Davila, K. D. (1985). Ideal discriminators in spatial vision: Two-point stimuli. Journal of the Optical Society of America A, Optics and Image Science, 2, 1483–1497.
Gresty, M. A. (1974). Coordination of head and eye movements to fixate continuous and intermittent targets. Vision Research, 14, 395–403.
Harris, J. M., McKee, S. P., & Smallman, H. S. (1997). Fine-scale processing in human binocular stereopsis. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 14, 1673–1683.
Held, R. T., & Banks, M. S. (2008). Misperceptions in stereoscopic displays: A vision science perspective. Proceedings of the 5th Symposium on Applied Perception in Graphics and Visualization, 23–32.
Herzau, W., & Ogle, K. N. (1937). Über den Größenunterschied der Bilder beider Augen bei asymmetrischer Konvergenz und seine Bedeutung für das zweiäugige Sehen. Archiv für Ophthalmologie, 137, 327–363.
Highman, V. N. (1977). Stereopsis and aniseikonia in uniocular aphakia. British Journal of Ophthalmology, 61, 30–33.
Hillis, J. M., & Banks, M. S. (2001). Are corresponding points fixed? Vision Research, 41, 2457–2473.
Hillis, J. M., Watt, S. J., Landy, M. S., & Banks, M. S. (2004). Slant from texture and disparity cues: Optimal cue combination. Journal of Vision, 4(12):1, 967–992, http://journalofvision.org/4/12/1/, doi:10.1167/4.12.1.
Jiménez, J. R., Ponce, A., del Barco, L. J., Díaz, J. A., & Pérez-Ocón, F. (2002). Impact of induced aniseikonia on stereopsis with random-dot stereogram. Optometry and Vision Science, 79, 121–125.
Kanade, T., & Okutomi, M. (1994). A stereo matching algorithm with an adaptive window: Theory and experiment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16, 920–932.
Knapp, H. (1869). The influence of spectacles on the optical constants and visual acuteness of the eye. Archives of Ophthalmology and Otology, 1, 377.
Lovasik, J. V., & Szymkiw, M. (1985). Effects of aniseikonia, anisometropia, accommodation, retinal illuminance, and pupil size on stereopsis. Investigative Ophthalmology & Visual Science, 26, 741–750.
Ogle, K. N. (1939). Relative sizes of ocular images of the two eyes in asymmetric convergence. Archives of Ophthalmology, 22, 1046–1067.
Ogle, K. N. (1950). Researches in binocular vision. Philadelphia, London: W B Saunders.
Ohzawa, I. (1998). Mechanisms of stereoscopic vision: The disparity energy model. Current Opinion in Neurobiology, 8, 509–515.
Ohzawa, I., DeAngelis, G. C., & Freeman, R. D. (1990). Stereoscopic depth discrimination in the visual cortex: Neurons ideally suited as disparity detectors. Science, 249, 1037–1041.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Potetz, B., & Lee, T. S. (2003). Statistical correlations between two-dimensional images and three-dimensional structures in natural scenes. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 20, 1292–1303.
Schor, C. M., Maxwell, J. S., & Stevenson, S. B. (1994). Isovergence surfaces: The conjugacy of vertical eye movements in tertiary positions of gaze. Ophthalmic & Physiological Optics, 14, 279–286.
Stahl, J. S. (1999). Amplitude of human head movements associated with horizontal saccades. Experimental Brain Research, 126, 41–54.
Watson, A. B., & Pelli, D. G. (1983). QUEST: A Bayesian adaptive psychometric method. Perception & Psychophysics, 33, 113–120.
Wichmann, F. A., & Hill, N. J. (2001). The psychometric function: I. Fitting, sampling, and goodness of fit. Perception & Psychophysics, 63, 1293–1313.
Winn, B., Ackerley, R. G., Brown, C. A., Murray, F. K., Prais, J., & St. John, M. F. (1988). Reduced aniseikonia in axial anisometropia with contact lens correction. Ophthalmic & Physiological Optics, 8, 341–344.