**Abstract**:

**The spatial resolution of disparity perception is poor compared to luminance perception, yet we do not notice that depth edges are more blurry than luminance edges. Is this because the two cues are combined by the visual system? Subjects judged the locations of depth-defined or luminance-defined edges, which were separated by up to 5.6 min of arc. The perceived edge location was a function of the depth-defined edge and the luminance-defined edge, with the luminance edge tending to play a larger role. Our data are compatible with but not completely explained by an optimal cue-combination model that gives more reliable cues a heavier weight. Both edge cues (depth and luminance) contribute to the final percept, with an adaptive weighting depending on the task and the acuity with which each cue is perceived.**

*cue switching*) rather than cue combination. With an intermediate amount of camouflage one or both cues might not be detected, and with two cues there is a greater chance that at least one cue will be detected accurately on any given trial. Thus, acuity could be higher even with absolutely no integration of cues if subjects simply switch to using whichever cue is most visible on any given trial.

^{2}.

*large* and *small* depth step conditions, respectively. This difference was intended to modulate the detectability of the depth edge.

*coincident* condition), or beyond (by 2.8 or 5.6 arcmin, referred to as the *noncoincident* conditions). In pilot work, we found that the luminance increment (with the associated slight reduction in contrast) slightly decreased texture visibility. To minimize differences in the visibility of the depth discontinuity, the luminance step was therefore always placed above the depth edge, so that pattern contrast at the location of the depth edge was unaffected by the step. The luminance step was inset from the vertical edges of the texture by 5.6 arcmin on each side so that it would be somewhat distant from the reference line, making it slightly harder to localize.

*depth-edge only* condition (the luminance increment was added to the whole texture), and a *luminance-edge only* condition (the texture was all at the same depth).

*sigma*, the standard deviation of the cumulative Gaussian. It is also the standard deviation of the implied distribution of the perceived locations of the edge, if we neglect variability in the perceived location of the reference (an assumption we relax below).
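The role of sigma in a cumulative Gaussian psychometric function can be illustrated with a minimal sketch. The function name `p_above` and the parameter values are illustrative assumptions, not taken from the experiment.

```python
import math

def p_above(x, pse, sigma):
    """Cumulative Gaussian: probability of an "above" response at reference offset x.

    pse   -- point of subjective equality (location of the 50% point)
    sigma -- standard deviation of the cumulative Gaussian (hypothetical values below)
    """
    z = (x - pse) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical parameters, in arcmin.
pse, sigma = 0.0, 1.5
for x in (-3.0, -1.5, 0.0, 1.5, 3.0):
    print(f"x = {x:+.1f} arcmin -> P(above) = {p_above(x, pse, sigma):.3f}")
# One sigma away from the PSE, the function passes through about 0.16 and 0.84,
# so a smaller sigma means a steeper function and more precise localization.
```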

*coincident*, where task-relevant and irrelevant edges were at the same location, and *noncoincident*, where the two edges were separated by a fixed offset. To determine the effect of the offset task-irrelevant edge in the noncoincident conditions, we subtracted the coincident PSEs from the noncoincident PSEs (thus eliminating response bias). A positive value indicates that the task-relevant edge's perceived position is shifted toward the task-irrelevant edge. These values are plotted for our three subjects in Figures 2–4. We manipulated the visibility of the depth edge by changing the magnitude of the step (large and small). Surprisingly, one of our subjects (JM) actually found the small depth edge more detectable. For ease of interpretation, we hereafter adopt the terms weak and strong depth edge, with weak referring to the large depth edge for JM and the small depth edge for the other two subjects.
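The bias correction described here is a simple subtraction, sketched below. The function names and the numerical values are hypothetical. The second helper assumes, as a simplification, that perceived location is a linear weighted average of the two edge locations, in which case the shift divided by the offset gives the weight of the task-irrelevant cue.

```python
def pse_shift(pse_noncoincident, pse_coincident):
    """Shift attributable to the offset irrelevant edge, with response bias removed."""
    return pse_noncoincident - pse_coincident

def implied_irrelevant_weight(shift, offset):
    """Weight of the task-irrelevant cue under an assumed linear weighted average."""
    return shift / offset

# Hypothetical PSEs in arcmin for one subject and one condition.
shift = pse_shift(pse_noncoincident=1.9, pse_coincident=0.3)
weight = implied_irrelevant_weight(shift, offset=5.6)
print(f"shift = {shift:.2f} arcmin, implied weight = {weight:.2f}")
```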

*p* < 0.05) after multiplying the significance level by 12 to correct for multiple comparisons. The consistent direction of effect (11 out of 12 conditions), however, does suggest this effect is real, though much smaller than the effect of the luminance edge. If we consider the perceived shifts in the 12 cases as independent measures, a two-tailed *t* test rejects the null hypothesis of zero shift at *p* < 0.001. This analysis implicitly neglects subject variation as well as variation with offset and depth strength, and these factors were indeed not significant in analysis of variance on the luminance edge judgments.
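The two steps in this analysis, a Bonferroni-style correction across 12 comparisons and a one-sample *t* test on the 12 shifts, can be sketched as follows. The shift values below are invented for illustration and are not the measured data; the function names are likewise assumptions.

```python
import math
import statistics

def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def one_sample_t(values, mu0=0.0):
    """t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return (mean - mu0) / (sd / math.sqrt(n)), n - 1

# Hypothetical shifts (arcmin) for 12 conditions: 11 positive, 1 negative.
shifts = [0.4, 0.2, 0.5, 0.3, 0.1, 0.6, 0.2, 0.3, -0.1, 0.4, 0.2, 0.3]
t_stat, df = one_sample_t(shifts)
print(f"t = {t_stat:.2f}, df = {df}")
```

Treating the 12 shifts as independent observations is the same idealization the text makes; it ignores any correlation among conditions run on the same subject.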

*noncoincident* luminance edge improved the precision of depth-edge localization in 10 of the 12 cases (3 Subjects × 2 Depth/Luminance Edge Separations × 2 Depth Differences), reducing sigma to 78% of its depth-only value on average, a statistically significant difference if the sigma estimates are treated as independent observations (*t* = 2.94, df = 11, *p* < 0.01). No effects of subject, offset, or depth difference were statistically significant.

*optimal cue-combination model*. If the location estimates implied by each cue are contaminated by independent random errors, it is advantageous to combine the two signals by taking a weighted average that favors the more reliable cue (e.g., Jacobs, 1999; Yuille & Bulthoff, 1996). The weight is set by the error variances of the perceived edge locations signaled by each cue (when shown in isolation). If the weights are inversely proportional to the variances, the lower variance cue gets higher weight. This weighting is optimal in the sense that it minimizes the variance of the resultant estimate when the individual estimates are contaminated by Gaussian noise.
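The inverse-variance weighting rule can be written out as a short sketch. The function names and the single-cue sigmas are hypothetical; the formulas are the standard ones for combining two independent Gaussian estimates.

```python
import math

def optimal_weights(sigma_depth, sigma_lum):
    """Weights inversely proportional to each cue's error variance; they sum to 1."""
    r_depth = 1.0 / sigma_depth ** 2   # reliability of the depth cue
    r_lum = 1.0 / sigma_lum ** 2       # reliability of the luminance cue
    total = r_depth + r_lum
    return r_depth / total, r_lum / total

def combined_sigma(sigma_depth, sigma_lum):
    """Standard deviation of the optimally weighted average."""
    return math.sqrt(1.0 / (1.0 / sigma_depth ** 2 + 1.0 / sigma_lum ** 2))

def combined_location(x_depth, x_lum, sigma_depth, sigma_lum):
    """Weighted average of the edge locations signaled by the two cues."""
    w_depth, w_lum = optimal_weights(sigma_depth, sigma_lum)
    return w_depth * x_depth + w_lum * x_lum

# Hypothetical single-cue sigmas (arcmin): the luminance cue is the more reliable one.
w_depth, w_lum = optimal_weights(sigma_depth=2.0, sigma_lum=1.0)
print(f"w_depth = {w_depth:.2f}, w_lum = {w_lum:.2f}")
print(f"combined sigma = {combined_sigma(2.0, 1.0):.3f}")
print(f"combined location = {combined_location(0.0, 5.6, 2.0, 1.0):.2f} arcmin")
```

Under these assumptions the combined sigma is always smaller than either single-cue sigma, which is the prediction the observed sigmas are tested against.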

*cue-switching model*, where the subject chooses to attend to one cue or the other whenever both are present. Ideally they would use the task-relevant cue exclusively, but if the availability of the cues fluctuates from trial to trial, they might switch to using the irrelevant cue on some percentage of the trials. Whatever the reason, if they do alternate between cues, the final psychometric function will be an additive mixture of the two functions for the cues in isolation. Since the two functions have inflection points at different locations, the mixture function will have an inflection at each of them.

*x* = 0 is neither above nor below the test depth edge, but aligned with it. If the subject relied entirely on the luminance signal, the psychometric function would have its inflection point at +5.6 arcmin on the horizontal axis, as shown by the right-hand dashed curve; at that location the reference is aligned with the test-luminance edge. The data show an inflection at an intermediate location, where the reference is 3.7 min of visual angle above the depth edge, hence 1.9 min below the luminance edge. The cue-switching model can approximate this behavior by assuming that the luminance cue is consulted on 61% of trials, and the depth cue on the remaining 39%. The dashed and dotted curves in Figure 6 are scaled accordingly. The open circles show the total fraction of “above” judgments, obtained by summing the dashed and dotted functions, that is, by summing over both kinds of trials in the 61:39 ratio shown. The open circles are the cue-switching model's maximum likelihood fit to the data of JM, obtained by optimizing the proportion of luminance-based judgments.
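A minimal sketch of the cue-switching mixture follows. The 61:39 split and the 5.6 arcmin offset come from the text; the single-cue sigmas, the function names, and the bisection search for the 50% point are illustrative assumptions, so the resulting PSE is not expected to reproduce the fitted value.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def switching_psychometric(x, p_lum, offset, sigma_depth, sigma_lum):
    """P("above") when the luminance cue is consulted on a fraction p_lum of trials."""
    depth_part = (1.0 - p_lum) * phi(x / sigma_depth)   # depth edge at x = 0
    lum_part = p_lum * phi((x - offset) / sigma_lum)    # luminance edge at x = offset
    return depth_part + lum_part

def switching_pse(p_lum, offset, sigma_depth, sigma_lum, lo=-50.0, hi=50.0):
    """Find the 50% point of the (monotonic) mixture function by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if switching_psychometric(mid, p_lum, offset, sigma_depth, sigma_lum) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical single-cue sigmas (arcmin); p_lum and offset as stated in the text.
pse = switching_pse(p_lum=0.61, offset=5.6, sigma_depth=2.0, sigma_lum=2.0)
print(f"50% point of the mixture: {pse:.2f} arcmin")
```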

*x* = 0 at the PSE measured for the coincident condition, rather than the point of physical alignment.

*t* test yields *p* < 0.001 if the individual weights are treated as independent observations; in support of this admittedly questionable idealizing assumption of independence, analysis of variance showed no significant main effects or interactions for the subject or experimental condition factors.

*p* < 0.05 for the interaction between subject and depth difference in analysis of variance on the best-fitting weights).

*p* < 0.001). The open circles show an opposite but statistically nonsignificant tendency, suggesting a greater than optimal weight for the depth cue when judging depth-edge location.

*p* < 0.01) than predicted by the optimal cue-combination model. And even if we adopt the best-fitting weights (rather than the optimal ones) for cue combination, the sigmas predicted for the noncoincident conditions remain too small, at about 0.8 times the values observed. Thus the cue-combination model may not be strictly consistent with the data in this respect. Others have found that their data only partially match this prediction of optimal cue combination (e.g., Landy & Kojima, 2001; Rivest & Cavanagh, 1996). Nonetheless, optimal cue combination does fit our data much better than the cue-switching model, where the predicted sigmas are on average 43% too large. The reason for that large predicted sigma is apparent in Figure 6: in the cue-switching model, the psychometric functions, rather than just the PSEs, are averaged. The interleaving of the mutually noncoincident psychometric functions for luminance and depth cues creates an average function that is shallower than either one.
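The difference in predicted spread between the two models can be sketched with the standard variance formulas for a mixture of two Gaussians and for a weighted average of two independent Gaussians. This is an approximation under stated assumptions: the paper's sigmas come from psychometric fits, whereas the sketch uses the standard deviation of the implied distribution of perceived locations, and all numerical values and names are hypothetical.

```python
import math

def switching_sd(p_lum, sigma_depth, sigma_lum, offset):
    """SD of perceived location when each trial uses one cue (Gaussian mixture).

    The p * (1 - p) * offset**2 term is the extra spread caused by jumping
    between two edge locations that are `offset` apart.
    """
    var = ((1.0 - p_lum) * sigma_depth ** 2
           + p_lum * sigma_lum ** 2
           + p_lum * (1.0 - p_lum) * offset ** 2)
    return math.sqrt(var)

def combination_sd(w_lum, sigma_depth, sigma_lum):
    """SD of a weighted average of two independent estimates; no offset term."""
    return math.sqrt((1.0 - w_lum) ** 2 * sigma_depth ** 2 + w_lum ** 2 * sigma_lum ** 2)

# Hypothetical sigmas (arcmin) with the 5.6 arcmin offset from the text.
for label, sd in (("switching", switching_sd(0.61, 2.0, 2.0, 5.6)),
                  ("combination", combination_sd(0.61, 2.0, 2.0))):
    print(f"{label:12s} predicted SD = {sd:.2f} arcmin")
```

The mixture's spread grows with the separation between the edges, while the weighted average's spread does not depend on it, which is why the switching model predicts a shallower function.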

^{10}. A related but distinct alternative to the likelihood ratio is the Bayes factor, which expresses the relative likelihood of the two models with no prior constraint on the parameter values and no prior difference in likelihood. This requires integrating the probability of the data over all possible values of the free parameters under each model. The Bayes factor is the ratio of the integrals. The Bayes factor is less extreme than the likelihood ratio for optimized parameter values; this is expected because the models become equivalent in the limit where the respective parameters both approach one or zero. But at 3.8 × 10^{8}, the Bayes factor still decisively favors the cue-combination model. Because the models become equivalent in the limit of low depth-cue importance, the corresponding results for the luminance task are less decisive, but the Bayes factor of 18.7 and the likelihood ratio of 2.7 × 10^{3} again favor cue combination in both cases.
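The distinction between a maximized likelihood ratio and a Bayes factor can be sketched for two one-parameter models. Everything specific here is an assumption for illustration: the response counts are invented, both cues share one fixed sigma, the free parameter of each model is a weight or proportion *w* on [0, 1] with a flat prior, and the integral is approximated on a grid.

```python
import math

def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_combination(x, w, offset, sigma):
    """Weighted-average model: one Gaussian centered on w * offset."""
    return phi((x - w * offset) / sigma)

def p_switching(x, w, offset, sigma):
    """Switching model: mixture of the two single-cue functions."""
    return w * phi((x - offset) / sigma) + (1.0 - w) * phi(x / sigma)

def log_likelihood(model, w, data, offset, sigma):
    """Binomial log likelihood (constant binomial coefficients cancel in ratios)."""
    total = 0.0
    for x, n_above, n_total in data:
        p = min(max(model(x, w, offset, sigma), 1e-9), 1.0 - 1e-9)
        total += n_above * math.log(p) + (n_total - n_above) * math.log(1.0 - p)
    return total

def max_and_marginal(model, data, offset, sigma, steps=201):
    """Return (max log likelihood, log of likelihood averaged over a flat prior on w)."""
    grid = [i / (steps - 1) for i in range(steps)]
    lls = [log_likelihood(model, w, data, offset, sigma) for w in grid]
    best = max(lls)
    log_marginal = best + math.log(sum(math.exp(ll - best) for ll in lls) / steps)
    return best, log_marginal

# Hypothetical data: (reference position in arcmin, "above" responses, trials).
data = [(-2, 1, 20), (0, 3, 20), (2, 6, 20), (4, 12, 20), (6, 17, 20), (8, 20, 20)]
best_c, marg_c = max_and_marginal(p_combination, data, offset=5.6, sigma=2.0)
best_s, marg_s = max_and_marginal(p_switching, data, offset=5.6, sigma=2.0)
print(f"log likelihood ratio (combination vs. switching): {best_c - best_s:.2f}")
print(f"log Bayes factor     (combination vs. switching): {marg_c - marg_s:.2f}")
```

At *w* = 0 and *w* = 1 the two model functions coincide, which is the equivalence in the limit that the text gives as the reason the Bayes factor is less extreme than the optimized likelihood ratio.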

*p* = 0.046 in an analysis of variance on the differences between best-fitting depth weight and optimal weight). While we did not explicitly ask subjects if the two edges could be perceived as separate, the sigma data from the single-cue conditions suggest that subjects might have been able to tell that the edges were separate at least some of the time in the 5.6 arcmin offset conditions. Author and subject AR noted that on some trials it did appear that there might be two edges, but could not tell with certainty. It would be interesting to extend the distance between edges and also to ask on each trial whether one or two edges were perceived, to see how the attraction falls off as a function of the distinctness of the two edges.

*Optica Acta*, 24, 159–177.

*Spatial Vision*, 10 (4), 433–436.

*Journal of the Optical Society of America A*, 5 (10), 1749–1758.

*Science*, 298, 1627–1630.

*Vision Research*, 39 (21), 3621–3629.

*Journal of Neuroscience*. Manuscript submitted for publication.

*Perception*, 36, 14.

*Journal of Vision*, 7 (7): 5, 1–24, http://www.journalofvision.org/content/7/7/5, doi:10.1167/7.7.5.

*Journal of the Optical Society of America A*, 18, 2307–2320.

*Vision Research*, 43 (25), 2649–2657.

*Perception*, 15 (2), 157–162.

*Spatial Vision*, 10 (4), 437–442.

*Neuron*, 47 (1), 155–166.

*Vision Research*, 36 (1), 53–66.

*Nature*, 251 (5471), 140–142.

*Vision Research*, 40 (15), 1955–1967.

*Vision Research*, 21, 1341–1356.

*Perception as Bayesian inference*. New York, NY: Cambridge University Press.