Figure 2 shows scatter plots of all participants' ratings on the six-point scale, color coded as follows. Moving the decision criterion from left to right along the Z-axis, the five data groups are coded in red, blue, green, black, and magenta. Hit and correct-rejection rates are plotted so that the green data (when the criterion was between −1 and +1) directly inform how biased the participants were. To help interpret the results in the context of boundary extension and SDT, Figure 3 shows a different representation of the data that may be more intuitive.
We computed discrimination sensitivities (d′ and area under the ROC) by assuming that the noise and signal distributions were both Gaussian. Without loss of generality, we assumed that the W-W distribution was N(0, 1) and the W-C distribution was N(μ, σ). Similarly, we assumed two distributions for C-C and C-W. The question was whether the two discrimination sensitivities thus separately obtained, measured in either d′ or area under the ROC, were the same. It should be noted that it is unknown whether the W-W and C-C distributions were identical in shape, but this was irrelevant here, because in SDT calculations the noise distribution is always normalized to be N(0, 1).
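The Gaussian model described above can be written down in a few lines. The following is a minimal sketch, not the analysis code used in the study: the function names and the parameter values in the demo are hypothetical, and the area formula Φ(μ / √(1 + σ²)) is the standard result for a noise distribution N(0, 1) and a signal distribution N(μ, σ).

```python
from math import sqrt
from statistics import NormalDist

# Noise distribution, normalized to N(0, 1) as in the text.
STD_NORMAL = NormalDist(0.0, 1.0)


def roc_area(mu: float, sigma: float) -> float:
    """Area under the ROC for noise ~ N(0, 1) and signal ~ N(mu, sigma)."""
    return STD_NORMAL.cdf(mu / sqrt(1.0 + sigma ** 2))


def rates_at_criterion(mu: float, sigma: float, criterion: float):
    """Return (hit rate, false-alarm rate) when the response is
    'signal' whenever the internal value exceeds the criterion."""
    hit = 1.0 - NormalDist(mu, sigma).cdf(criterion)
    false_alarm = 1.0 - STD_NORMAL.cdf(criterion)
    return hit, false_alarm


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(roc_area(0.5, 1.0))
    print(rates_at_criterion(0.5, 1.0, 0.25))
```

When σ = 1, μ coincides with d′ and the area reduces to Φ(d′/√2), which is why the two sensitivity measures can be compared directly in the equal-variance case.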
We first fitted the ROC in Z-space using each participant's rating data (Figure 4). The mean R² for the linear fit was 0.90. Adding a quadratic term accounted for an additional 6% of the variance. Given that the linear fit already accounted for 90% of the variance, we concluded that linearity was acceptable for the 24 participants' data. The average σ calculated from the linear slope for the C-W distribution was 1.17, which was only marginally different from one, t(23) = 1.88, p = 0.07. The average σ for the W-C distribution was 1.18, which was also only marginally different from one, t(23) = 1.83, p = 0.08 (all t tests in this paper were two-tailed). Because these differences were marginally significant, we computed the area under the ROC in addition to d′ as a measure of discrimination sensitivity. The areas were 0.63 and 0.68 for the close and wide study image conditions, respectively. The difference was statistically significant, t(23) = 2.52, p < 0.02. If we assume that d′ was definable and ignore the marginal differences above, the d′ values were 0.43 and 0.77, which differed significantly, t(23) = 2.73, p = 0.009. In other words, the drop in sensitivity d′ that was presumably due to boundary extension was 44%, that is, (0.77 − 0.43)/0.77. Figure 3 illustrates these two pairs of distributions.
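The Z-space fitting step can be illustrated with a short sketch. It assumes the common convention in which z(hit rate) is regressed on z(false-alarm rate), so that the slope equals 1/σ and the intercept equals μ/σ; whether the original analysis used this orientation is an assumption. The function names are hypothetical, and the demo rates are generated from the model itself (with made-up μ, σ, and criteria), not taken from the participants' data.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def model_rates(mu, sigma, criteria):
    """Hit and false-alarm rates predicted by noise ~ N(0, 1) and
    signal ~ N(mu, sigma) at each decision criterion."""
    hits = [1.0 - NormalDist(mu, sigma).cdf(c) for c in criteria]
    fas = [1.0 - STD_NORMAL.cdf(c) for c in criteria]
    return hits, fas


def fit_zroc(hit_rates, fa_rates):
    """Least-squares line z(H) = intercept + slope * z(F).
    Returns (slope, intercept, r_squared). Rates must lie strictly in (0, 1)."""
    x = [STD_NORMAL.inv_cdf(f) for f in fa_rates]
    y = [STD_NORMAL.inv_cdf(h) for h in hit_rates]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1.0 - ss_res / syy


def signal_parameters(slope, intercept):
    """Convert the fitted line to (mu, sigma) of the signal distribution,
    under the assumed convention slope = 1/sigma, intercept = mu/sigma."""
    sigma = 1.0 / slope
    return intercept * sigma, sigma


if __name__ == "__main__":
    # Five criteria separate six rating categories; values are illustrative.
    hits, fas = model_rates(0.6, 1.2, [-1.0, -0.5, 0.0, 0.5, 1.0])
    slope, intercept, r2 = fit_zroc(hits, fas)
    print(signal_parameters(slope, intercept), r2)
```

Because the demo rates come from an exactly Gaussian model, the fit recovers the generating parameters and R² is essentially one; with real rating data, R² below one reflects departures from linearity such as those assessed with the quadratic term.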
Next, we calculated the decision criteria for close and wide study images separately. We defined the bias-free criterion as the intersection between the signal and noise distributions. In the case of close studied images, these two distributions correspond to the C-C and C-W distributions. The bias-free criterion obtained from fitting in the hit and false-alarm rate space was Z = 0.21, and that obtained from the Z-space linear fitting was Z = 0.28. The actual criterion calculated from the participants' false-alarm rates was Z = 0.63. (Here, a false alarm was defined by assuming that the decision criterion was in the middle of the six-point scale.) There was therefore indeed a bias, t(23) = 3.45, p = 0.002, or t(23) = 2.91, p = 0.008 (for the two bias-free estimates, respectively), in that a wider test image was more likely to be judged the same as the closer studied image, in agreement with the boundary extension effect.
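The definition of the bias-free criterion as the intersection of the two distributions has a closed form under the Gaussian assumption. The sketch below is illustrative only: the function name and the demo parameter values are hypothetical, and it is not the fitting procedure that produced the values reported above. Setting the densities of N(0, 1) and N(μ, σ) equal gives the quadratic (σ² − 1)x² + 2μx − (μ² + 2σ² ln σ) = 0, and the relevant root is taken to be the one nearest the midpoint between the two means.

```python
from math import log, sqrt
from statistics import NormalDist


def bias_free_criterion(mu: float, sigma: float) -> float:
    """Point where the densities of noise N(0, 1) and signal N(mu, sigma) cross.

    For (nearly) equal variances the crossing is the midpoint mu / 2.
    Otherwise the densities cross twice; the crossing closest to the
    midpoint between the two means is returned.
    """
    if abs(sigma - 1.0) < 1e-6:
        return mu / 2.0
    a = sigma ** 2 - 1.0
    # Square root of the (always non-negative) discriminant, divided by 2.
    root_term = sigma * sqrt(mu ** 2 + 2.0 * a * log(sigma))
    candidates = ((-mu + root_term) / a, (-mu - root_term) / a)
    return min(candidates, key=lambda x: abs(x - mu / 2.0))


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(bias_free_criterion(1.0, 1.2))
```

A participant's bias can then be expressed as the difference between the actual criterion, obtained from the false-alarm rate, and this crossing point.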
In the case of wide studied images, the bias-free criterion obtained from the rate-space fitting was Z = 0.52, and that obtained from the Z-space linear fitting was Z = 0.55. The actual criterion calculated from the false-alarm rates was Z = 0.65. This bias was not statistically significant, t(23) = 1.48, p = 0.15, or t(23) = 1.12, p = 0.27. It is interesting to note that the actual criterion locations in the two cases were very similar (Z = 0.63 and 0.65). This makes sense because the two conditions were randomly mixed, so that it was perhaps impossible to hold two separate criteria. From this single-criterion perspective, boundary extension amounted to a shift of the signal distribution relative to the noise distribution in the condition of close study photos.
This result also raised the following question. During each test block, there were four distributions: C-C, C-W, W-W, and W-C. We, the experimenters, separated these four distributions into two pairs (C-C and C-W, W-W and W-C) in order to calculate discrimination sensitivities. This separation rested on an assumption that needed verification. The next experiment addressed this issue.