The cue-combination model can generate any desired PSE by suitable variation of the cue weight. The cue-switching model can similarly generate any PSE intermediate between the luminance and depth PSEs by varying the probability of using the depth cue. But the models can be distinguished using the shape of the predicted psychometric functions. In particular, the model parameter dictates not only the PSE but the psychometric function sigma, which we consider next.
As noted in the results section, the addition of a luminance edge sharpened the precision of depth localization, reducing sigma, even when the edges were noncoincident. The cue-combination model, however, predicts that the sigma value when both cues are available will be less than the lower of the two single-cue sigma values (see, for instance, Landy & Kojima, 2001), in this case the luminance-only sigma values. Our sigmas in the depth task with conflicting cues did not differ reliably in either direction from the luminance-only sigma values, even if we take the liberty of treating the sigmas for each subject and condition as independent observations. They are, however, significantly greater (p < 0.01) than predicted by the optimal cue-combination model. And even if we adopt the best-fitting weights (rather than the optimal ones) for cue combination, the sigmas predicted for the noncoincident conditions remain too small, at about 0.8 times the values observed. Thus the cue-combination model may not be strictly consistent with the data in this respect. Others have found that their data only partially match this prediction of optimal cue combination, e.g., Landy and Kojima (2001) and Rivest and Cavanagh (1996). Nonetheless, optimal cue combination does fit our data much better than the cue-switching model, where the predicted sigmas are on average 43% too large. The reason for that large predicted sigma is apparent in Figure 6: in the cue-switching model, the psychometric functions, rather than just the PSEs, are averaged. The interleaving of the mutually noncoincident psychometric functions for luminance and depth cues creates an average function that is shallower than either one.
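The contrast between the two sigma predictions can be illustrated with a minimal Python sketch. It assumes independent Gaussian noise on each cue and takes the cue-combination sigma from the weighted sum of the two cue estimates. It summarizes the cue-switching function by the standard deviation of the underlying two-component mixture, which is only a stand-in for the sigma of a cumulative normal fitted to the averaged function. The function names and all numerical values are hypothetical and are not taken from the experiment.

```python
import math


def norm_cdf(x, mu, sigma):
    """Cumulative normal psychometric function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def combination_params(mu_lum, sigma_lum, mu_depth, sigma_depth, w_depth=None):
    """PSE and sigma under cue combination (weighted average of cue estimates).

    If w_depth is None, the reliability-based (optimal) weight is used.
    Assumes independent Gaussian noise on the two cues.
    """
    if w_depth is None:
        r_lum = 1.0 / sigma_lum ** 2
        r_depth = 1.0 / sigma_depth ** 2
        w_depth = r_depth / (r_lum + r_depth)
    pse = w_depth * mu_depth + (1.0 - w_depth) * mu_lum
    sigma = math.hypot(w_depth * sigma_depth, (1.0 - w_depth) * sigma_lum)
    return pse, sigma


def switching_psychometric(x, mu_lum, sigma_lum, mu_depth, sigma_depth, p_depth):
    """Cue switching: the two single-cue psychometric functions are averaged."""
    return (p_depth * norm_cdf(x, mu_depth, sigma_depth)
            + (1.0 - p_depth) * norm_cdf(x, mu_lum, sigma_lum))


def switching_sd(mu_lum, sigma_lum, mu_depth, sigma_depth, p_depth):
    """Standard deviation of the two-component mixture behind the averaged function.

    The last term grows with the separation of the single-cue PSEs.
    """
    var = (p_depth * sigma_depth ** 2
           + (1.0 - p_depth) * sigma_lum ** 2
           + p_depth * (1.0 - p_depth) * (mu_depth - mu_lum) ** 2)
    return math.sqrt(var)


# Illustrative (made-up) single-cue parameters: luminance is the more precise cue.
MU_LUM, SIGMA_LUM = 0.0, 1.0
MU_DEPTH, SIGMA_DEPTH = 4.0, 2.0

pse_comb, sigma_comb = combination_params(MU_LUM, SIGMA_LUM, MU_DEPTH, SIGMA_DEPTH)
sd_switch = switching_sd(MU_LUM, SIGMA_LUM, MU_DEPTH, SIGMA_DEPTH, p_depth=0.2)
print(f"combination: PSE={pse_comb:.3f}, sigma={sigma_comb:.3f}")
print(f"switching:   mixture SD={sd_switch:.3f}")
```

With these illustrative values, the optimally weighted combination yields a sigma below the smaller single-cue sigma, whereas the switching mixture is broader than either cue alone because the separation between the single-cue PSEs adds to its spread.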
As Figure 6 also illustrates, the psychometric functions derived from the two models can differ in shape as well as in position and scale. The overall goodness of fit of the models is best assessed from a comparison of the likelihood of the particular observed sequence of responses over all trials under each of the competing models. Consider first the depth task. If we first optimize each model's free parameter (depth cue weight or depth selection probability) for each subject and stimulus condition, the probability of the data is greater for the cue-combination model in 11 of 12 cases, on average by about a factor of 10. Consequently, when all depth task data are considered together, the balance of probability decisively favors cue combination, with a likelihood ratio of 9.4 × 10¹⁰. A related but distinct alternative to the likelihood ratio is the Bayes factor, which expresses the relative likelihood of the two models with no prior constraint on the parameter values and no prior difference in likelihood. This requires integrating the probability of the data over all possible values of the free parameters under each model. The Bayes factor is the ratio of the integrals. The Bayes factor is less extreme than the likelihood ratio for optimized parameter values; this is expected because the models become equivalent in the limit where the respective parameters both approach one or zero. But at 3.8 × 10⁸, the Bayes factor still decisively favors the cue-combination model. Because the models become equivalent in the limit of low depth-cue importance, the corresponding results for the luminance task are less decisive, but the Bayes factor of 18.7 and the likelihood ratio of 2.7 × 10³ both again favor cue combination.
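The two comparison statistics described above can be sketched for a single condition as follows. This is a hedged illustration rather than the analysis actually performed: it assumes cumulative normal psychometric functions with known single-cue parameters, a uniform prior on each model's free parameter over [0, 1], a grid approximation to the integrals, and a finger-error probability that mixes a fixed proportion of random responses into the predictions. All function names, the data layout, and the constant `LAPSE` are hypothetical.

```python
import math

LAPSE = 0.015  # assumed finger-error probability; how it enters the model is an assumption


def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def p_combination(x, w_depth, cues):
    """Response probability under cue combination with depth weight w_depth."""
    (mu_l, s_l), (mu_d, s_d) = cues
    mu = w_depth * mu_d + (1.0 - w_depth) * mu_l
    sigma = math.hypot(w_depth * s_d, (1.0 - w_depth) * s_l)
    return norm_cdf(x, mu, sigma)


def p_switching(x, p_depth, cues):
    """Response probability under cue switching with depth-use probability p_depth."""
    (mu_l, s_l), (mu_d, s_d) = cues
    return p_depth * norm_cdf(x, mu_d, s_d) + (1.0 - p_depth) * norm_cdf(x, mu_l, s_l)


def log_likelihood(model, theta, data, cues):
    """Log probability of the observed response sequence.

    data: iterable of (stimulus level, number of trials, number of positive responses).
    The binomial coefficient is omitted; it is identical for both models and
    cancels in every ratio computed below.
    """
    total = 0.0
    for x, n, k in data:
        q = LAPSE / 2.0 + (1.0 - LAPSE) * model(x, theta, cues)
        total += k * math.log(q) + (n - k) * math.log(1.0 - q)
    return total


def compare(data, cues, steps=201):
    """Return (likelihood ratio, Bayes factor), combination relative to switching."""
    grid = [i / (steps - 1) for i in range(steps)]
    summary = {}
    for name, model in (("combination", p_combination), ("switching", p_switching)):
        ll = [log_likelihood(model, theta, data, cues) for theta in grid]
        peak = max(ll)  # optimized free parameter
        # Mean likelihood over the grid approximates the integral under a uniform prior.
        log_marginal = peak + math.log(sum(math.exp(v - peak) for v in ll) / len(ll))
        summary[name] = (peak, log_marginal)
    log_lr = summary["combination"][0] - summary["switching"][0]
    log_bf = summary["combination"][1] - summary["switching"][1]
    return math.exp(log_lr), math.exp(log_bf)
```

In this sketch the two models give identical likelihoods when the free parameter is exactly zero or one, which is the equivalence noted in the text. Pooling conditions would amount to summing the log quantities across conditions before exponentiating.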
Neither model is an acceptable fit to any one condition's data trial by trial. This is not unexpected, since the models are heavily idealized. Notably, they assume no variation in response bias across conditions and no fluctuation in observer bias or sensitivity over trials (because they assume binomial variability only in the judgments). Other inexact idealizations in our chosen implementation of the models include Gaussian variability (cumulative normal psychometric functions), 0.015 probability for finger error, etc. Since neither model is strictly acceptable, interpretation of the ostensibly conclusive evidence in favor of the cue-combination alternative requires caution. Each model represents a point in a high-dimensional space of recognized and unrecognized parametric influences on the data, including those few just mentioned and doubtless many others, along with the one freely varied model parameter. The variation of likelihood with these many unknown variables defines a complex many-dimensional probabilistic landscape with a high peak at ground truth. The cue-combination model has much higher likelihood than the switch model, but both are far below the peak, and they may represent distinct subpeaks. Thus it is entirely possible that the peak could be reached more easily by some modification of the switch model rather than the cue-combination model. But given the working assumptions made in implementing the models, the advantage lies overwhelmingly with cue combination. Other researchers using cue-conflict paradigms like ours have also found that cue combination fits the data better than cue switching (e.g., Landy & Kojima, 2001).