We used the results of our discrimination experiment to choose a series of suprathreshold contrasts for use in the MLDS experiment, separately for each subject and condition. These were approximately equally spaced perceptually in the range where discrimination showed power-law behavior. Specifically, starting at the transition contrast (ct) defined above, we used the power-law model to estimate the increment threshold at this pedestal level. We added four times this value to ct to obtain the second contrast in the series. Using a new threshold value estimated at this second contrast, we obtained a third value in the same way, and repeated this until we obtained 10 contrasts or reached the maximum possible contrast. Thus, any two adjacent contrasts were approximately four threshold units apart. The spacing of contrasts is an important design question for MLDS, as it influences how confidently the observer can make the required judgments. At one extreme, very difficult judgments would lead to chance performance across all triplets and provide no information. At the other extreme, if all judgments were trivially easy, the model would not be able to estimate internal variability. To confirm that we fell between these extremes, we computed for each triplet the fraction of trials in which an observer reported the second interval as larger. Grouping across all observers and conditions, we found the distribution of this empirical probability (not shown) to be quite flat, with 58% of conditions falling in the bottom and top quartiles (p < .25 or > .75). Thus there was adequate variability overall, with many conditions leading to highly consistent responses.
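The iterative construction of the contrast series can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the power-law threshold model has the form Δc = gain · c^exponent, and the names `increment_threshold`, `contrast_series`, `gain`, and `exponent` are hypothetical. It also assumes the series simply stops before exceeding the maximum contrast, which the text does not specify.

```python
def increment_threshold(pedestal, gain, exponent):
    """Assumed power-law model: threshold = gain * pedestal ** exponent."""
    return gain * pedestal ** exponent


def contrast_series(c_t, gain, exponent, step_units=4.0,
                    max_levels=10, max_contrast=1.0):
    """Build contrasts spaced `step_units` increment thresholds apart.

    Starts at the transition contrast c_t and re-estimates the threshold
    at each new pedestal before taking the next step.
    """
    series = [c_t]
    while len(series) < max_levels:
        pedestal = series[-1]
        step = step_units * increment_threshold(pedestal, gain, exponent)
        candidate = pedestal + step
        if candidate > max_contrast:
            # Assumption: stop rather than clip to the maximum contrast.
            break
        series.append(candidate)
    return series


# Example with made-up parameter values.
levels = contrast_series(c_t=0.02, gain=0.1, exponent=0.7)
```

Because the threshold is re-estimated at every pedestal, the steps grow with contrast whenever the exponent is positive, which is what keeps adjacent levels roughly four threshold units apart.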
The results of the MLDS fit are shown in Figure 6 for Observer 1 in the achromatic condition (as in Figure 2). The free parameters of the model include a scale value for each tested contrast, with the first and last fixed at 0 and 1. An additional parameter, σ, describes the standard deviation of constant-variance Gaussian noise added to each output. For three example contrasts, Figure 6 shows the modeled distributions of internal responses, and diagrams how these are compared to derive a decision variable estimating which contrast difference is larger.
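To make the decision model concrete, the sketch below computes the probability that the second contrast difference in a triplet is judged larger, and the negative log-likelihood of a set of responses. It is an illustration under stated assumptions rather than the fitting code used in the study: the function names are hypothetical, the second interval is assumed to contain the higher pair, and the four responses entering the decision are assumed to carry independent Gaussian noise of standard deviation σ, so that the decision variable has standard deviation 2σ.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_second_larger(psi_a, psi_b, psi_c, sigma):
    """P(judge the (b, c) difference larger than the (a, b) difference)."""
    # Mean of the decision variable: a difference of differences.
    delta = (psi_c - psi_b) - (psi_b - psi_a)
    # Assumption: four independent noisy responses, so SD = 2 * sigma.
    return normal_cdf(delta / (2.0 * sigma))


def neg_log_likelihood(scale, sigma, trials):
    """Negative log-likelihood of responses given scale values and sigma.

    `trials` holds tuples (i, j, k, chose_second) with level indices i < j < k.
    """
    total = 0.0
    for i, j, k, chose_second in trials:
        p = p_second_larger(scale[i], scale[j], scale[k], sigma)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= math.log(p if chose_second else 1.0 - p)
    return total
```

In a fit, the interior scale values and σ would be adjusted to minimize this quantity, with the first and last scale values held at 0 and 1.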
The model transducer obtained from MLDS in this case had a compressive shape similar to that obtained from the discrimination experiment. We found, however, that the modeled internal noise was larger for MLDS than for discrimination. To compare the two experiments, we scaled the discrimination model to span the same range as the MLDS model. The discrimination transducer predicts discriminability based on the difference between two responses, relative to the noise level σ. It is therefore not affected by an additive offset, and can also be scaled arbitrarily provided that response amplitude and noise level are scaled identically. We shifted and scaled the transducer curve to have values of 0 and 1 at the lowest and highest contrast tested in the MLDS experiment. The results are shown in Figure 7 for the same example condition as Figure 6. After scaling, the discrimination model had a σ less than half that of the MLDS model. Note that the σ values we report represent the variability of the transducer output and not that of the decision variable. Given the noise model used here, differencing two response variables leads to a distribution with twice the variance of the inputs. As MLDS relies on a difference of differences, the decision variable has four times the input variance (see Figure 6), or a standard deviation of 2σ. For discrimination (comparing just two stimuli) this would be \(\sqrt{2}\sigma\). This difference is potentially related to a difference between the tasks, as discussed later, but mathematically it does not affect our fitted σ values, given the model assumptions.
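The rescaling argument can be checked numerically. The sketch below uses hypothetical names and made-up numbers, and assumes the response list is ordered so that its first and last entries correspond to the lowest and highest MLDS contrasts. It shifts and scales a transducer to span 0 to 1, divides σ by the same factor, and shows that predicted discriminability is unchanged. It also lists the decision-variable standard deviations implied by the noise model for the two tasks.

```python
import math


def rescale_to_unit_range(responses, sigma):
    """Shift and scale responses to span 0..1; scale sigma identically."""
    low, high = responses[0], responses[-1]
    span = high - low
    scaled = [(r - low) / span for r in responses]
    return scaled, sigma / span


def discriminability(resp_a, resp_b, sigma):
    """Response difference relative to the transducer noise level."""
    return abs(resp_b - resp_a) / sigma


def decision_sd(sigma, n_responses):
    """SD of a signed sum of n independent responses, each with SD sigma."""
    return math.sqrt(n_responses) * sigma


# Made-up transducer outputs and noise level, in arbitrary units.
raw, raw_sigma = [2.0, 3.5, 5.0], 0.6
unit, unit_sigma = rescale_to_unit_range(raw, raw_sigma)

before = discriminability(raw[0], raw[1], raw_sigma)
after = discriminability(unit[0], unit[1], unit_sigma)  # same value as `before`

sd_discrimination = decision_sd(unit_sigma, 2)  # sqrt(2) * sigma
sd_mlds = decision_sd(unit_sigma, 4)            # 2 * sigma
```

The additive offset cancels in the response difference and the scale factor cancels in the ratio, which is why σ values can be compared only after both models share the same 0 to 1 range.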
Transducers derived from both experiments are shown in Figure 8 for all observers and conditions, in the same format as Figure 7. Independent of modeled noise level, we also compared the shape of transducers using a “compression index,” defined as the level of the normalized curve at the halfway point between lowest and highest contrast. For a linear curve this would be 0.5; values greater than 0.5 indicate compressive nonlinearity. Where the MLDS experiment used an even number of contrast levels (and so lacked one at the halfway point) we used linear interpolation for this analysis. Figure 9 compares the σ values and compression indexes between the two experiments, for each observer and condition.
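A compression index of this kind could be computed as sketched below. The function name and arguments are hypothetical. The choice of horizontal axis is also an assumption: the remark about even numbers of levels suggests that the halfway point is defined on the level index, so the example uses level indices as positions, but physical contrasts could be passed instead.

```python
def compression_index(positions, scale_values):
    """Normalized curve height at the midpoint of the position range.

    `positions` must be increasing; linear interpolation is used when no
    tested level sits exactly at the midpoint.
    """
    low, high = scale_values[0], scale_values[-1]
    normalized = [(v - low) / (high - low) for v in scale_values]
    midpoint = 0.5 * (positions[0] + positions[-1])
    for idx in range(len(positions) - 1):
        x0, x1 = positions[idx], positions[idx + 1]
        if x0 <= midpoint <= x1:
            weight = (midpoint - x0) / (x1 - x0)
            return normalized[idx] + weight * (normalized[idx + 1] - normalized[idx])
    raise ValueError("positions must be increasing")


# Linear curve, odd number of levels: index is 0.5.
linear = compression_index([0, 1, 2], [0.0, 0.5, 1.0])

# Compressive curve, even number of levels: interpolated between levels 1 and 2.
compressive = compression_index([0, 1, 2, 3], [0.0, 0.6, 0.9, 1.0])
```

Normalizing before reading off the midpoint makes the index independent of the modeled noise level and of any overall scaling of the transducer.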
Modeled internal noise was higher for MLDS than discrimination in all cases. We did not find a systematic difference across stimulus conditions in terms of noise level, transducer shape, or the degree of agreement between the two experiments. An exception is found in the high-spatial-frequency case, where for two observers the MLDS results show a more linear transducer compared to that from discrimination and compared to other conditions. Observer 2 also showed this effect but to a much smaller extent, and unfortunately Observer 1 did not complete this condition. So while this suggests a possibly interesting effect, we do not have the power to draw a strong conclusion. In some cases, transducers were more compressive in the achromatic than chromatic cases, especially in the low-spatial-frequency flicker condition, but this was not found consistently across observers. We did find several clear differences between observers, independent of stimulus type. While noise estimates from discrimination were highly consistent across observers and conditions, those from MLDS were higher for Observer 4 than for other observers across all stimulus types. The shape of the transducers from MLDS was consistently more linear for Observer 3 compared to other observers, and in comparison to that observer's transducers from discrimination. Overall, we found the results of the contrast discrimination experiment to be more consistent across observers and stimulus conditions, while MLDS results were more variable (see rightmost panels of Figure 9).
The range of physical contrasts tested in the scaling experiment varied widely across stimulus condition, as each contrast series was tailored to the stimulus based on discrimination results. To further compare the shape of model transducers, we expressed them on a contrast axis normalized to sensitivity, defined as multiples of detection threshold. Figure 10 replots the MLDS results both on absolute and normalized axes. For the first three observers, this normalization brings the results from all stimuli into alignment, and highlights the finding that Observer 3 showed a more linear curve across all stimuli. Observer 4’s results do not align on normalized axes. The cause of these individual differences is unclear, but it is possible that observers adopted different strategies in making subjective contrast comparisons, and that Observer 4’s strategy even varied across conditions. This is consistent with the fact that model estimates of internal noise were higher and more variable for Observer 4 than others.