**An extension of the signal-detection theory framework is described and demonstrated for two-alternative identification tasks. The extended framework assumes that the subject and an arbitrary model (or two subjects, or the same subject on two occasions) are performing the same task with the same stimuli, and that on each trial they both compute values of a decision variable. Thus, their joint performance is described by six fundamental quantities: two levels of intrinsic discriminability (d′), two values of decision criterion, and two decision-variable correlations (DVCs), one for each of the two categories of stimuli. The framework should be widely applicable for testing models and characterizing individual differences in behavioral and neurophysiological studies of perception and cognition. We demonstrate the framework for the well-known task of detecting a Gaussian target in white noise. We find that (a) subjects' DVCs are approximately equal to the square root of their efficiency relative to ideal (in agreement with the prediction of a popular class of models), (b) between-subjects and within-subject (double-pass) DVCs increase with target contrast and are greater for target-present than target-absent trials (rejecting many models), (c) model parameters can be estimated by maximizing DVCs between the model and subject, (d) a model with a center–surround template and a specific (modest) level of position uncertainty predicts the trial-by-trial performance of subjects as well as (or better than) presenting the same stimulus again to the subjects (i.e., the double-pass DVCs), and (e) models of trial-by-trial performance should not include a representation of internal noise.**

*a* and *b*, and is required to identify the category. The category identified by the subject (the subject's response) can be represented with capital letters *A* and *B*. Thus, on each trial there are four possibilities: the subject responds *B* when the category is *b* (*B*|*b*), the subject responds *B* when the category is *a* (*B*|*a*), the subject responds *A* when the category is *a* (*A*|*a*), or the subject responds *A* when the category is *b* (*A*|*b*). We note that the common two-alternative forced-choice task can be regarded as a special case where category *a* is a pair of stimuli in one spatial or temporal order and category *b* is a pair of stimuli in the opposite order.

*b* and another normal distribution if the stimulus is from category *a*. The behavioral response is assumed to be generated by comparing the value of the decision variable on the trial to a criterion placed along the decision-variable axis. The subject makes one response if the value falls below the criterion and makes the other response if it falls above the criterion. In Bayesian statistical models, this decision variable would typically correspond to the log likelihood ratio: the log of the ratio of the probability of the modeled pattern of neural activity given category *b* to the probability given category *a*. However, it is important to point out that basic SDT makes no assumption about how the decision variable is computed. Its purpose is only to provide a principled interpretation of the four possible outcomes of trials in the identification task. Indeed, basic SDT can provide a principled interpretation even if there is no location within the nervous system where a single decision variable is represented.

*d*′ of standard deviations separating the two distributions, and the decision criterion

*p*(*B*|*b*) and *p*(*B*|*a*). In the special case of a yes–no detection task these two proportions are the proportions of hits and false alarms, respectively (Green & Swets, 1974). They correspond to the areas under the two normal distributions above the decision criterion.
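The relation between the two response proportions and the two basic SDT parameters can be sketched in a few lines. This is a minimal illustration, not the authors' code; it assumes equal-variance unit normal distributions centered at −*d*′/2 (category *a*) and +*d*′/2 (category *b*), with the criterion measured from the midpoint between them. The function name is hypothetical.

```python
from statistics import NormalDist


def dprime_and_criterion(p_b_given_b, p_b_given_a):
    """Equal-variance SDT estimates from p(B|b) (hits) and p(B|a) (false alarms).

    Assumption (for illustration): the category means sit at -d'/2 and +d'/2,
    so the criterion is expressed relative to the midpoint of the two means.
    Proportions of exactly 0 or 1 are not handled (inv_cdf is undefined there).
    """
    z = NormalDist().inv_cdf          # inverse of the standard normal CDF
    z_hit = z(p_b_given_b)
    z_fa = z(p_b_given_a)
    d_prime = z_hit - z_fa            # separation in standard-deviation units
    criterion = -0.5 * (z_hit + z_fa)  # location along the decision axis
    return d_prime, criterion


if __name__ == "__main__":
    print(dprime_and_criterion(0.80, 0.30))
```

Under these assumptions, a criterion above zero indicates a bias toward responding *A*, and a criterion below zero a bias toward responding *B*.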

*a* and another bivariate normal distribution for category *b* (represented by the ellipses in Figure 2). All the standard deviations are 1.0, because the decision variables are normalized. The mean for category *a* is at

*b* is at

*a* is

*b* is

*a* they are the proportion of times the subject and the model both responded *A*, *p*(*AA*|*a*); the proportion of times the subject responded *A* and the model responded *B*, *p*(*AB*|*a*); the proportion of times the subject responded *B* and the model responded *A*, *p*(*BA*|*a*); and the proportion of times the subject and the model both responded *B*, *p*(*BB*|*a*). Similarly, for category *b*, the proportions are *p*(*AA*|*b*), *p*(*AB*|*b*), *p*(*BA*|*b*), and *p*(*BB*|*b*). These proportions correspond to the volumes under the bivariate normal distribution within the four quadrants defined by the two decision criteria. As long as no more than one of the four proportions is zero, it is straightforward to obtain the maximum-likelihood estimate of the DVC using Equations 3–5 (see Appendix). This estimate of the DVC is Pearson's *tetrachoric correlation coefficient* (Pearson, 1900; Stuart & Ord, 1991) applied in the SDT framework.
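As an illustration of how a tetrachoric estimate can be obtained from the four joint proportions for one category, the sketch below uses only the Python standard library. It is not the procedure of Equations 3–5. It assumes both decision variables are standard normal within the category, takes the two criteria from the marginal response proportions, and then finds the correlation whose predicted *BB* quadrant volume matches the observed *p*(*BB*). For a single 2 × 2 table this exact-fit solution coincides with the maximum-likelihood solution. The function names, the Simpson integration, and the bisection bounds are choices made for this example.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def upper_quadrant_prob(h, k, rho, n=200):
    """P(X > h, Y > k) for a standard bivariate normal with correlation rho.

    Uses the identity P = Phi(-h) * Phi(-k) + integral_0^rho phi2(h, k; r) dr,
    where phi2 is the bivariate normal density evaluated at (h, k).
    """
    def density(r):
        s = 1.0 - r * r
        q = h * h - 2.0 * r * h * k + k * k
        return math.exp(-q / (2.0 * s)) / (2.0 * math.pi * math.sqrt(s))

    # Composite Simpson's rule on [0, rho]; n must be even.
    step = rho / n
    total = density(0.0) + density(rho)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * density(i * step)
    return _STD.cdf(-h) * _STD.cdf(-k) + total * step / 3.0


def tetrachoric_dvc(p_aa, p_ab, p_ba, p_bb, tol=1e-10):
    """Tetrachoric correlation from joint proportions for one category.

    First letter is the subject's response, second letter the model's.
    Response B is assumed to occur when the decision variable exceeds
    the criterion.
    """
    total = p_aa + p_ab + p_ba + p_bb
    p_subject_b = (p_ba + p_bb) / total
    p_model_b = (p_ab + p_bb) / total
    h = _STD.inv_cdf(1.0 - p_subject_b)   # subject criterion (category-centered)
    k = _STD.inv_cdf(1.0 - p_model_b)     # model criterion (category-centered)
    target = p_bb / total
    lo, hi = -0.999, 0.999                # quadrant volume increases with rho
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if upper_quadrant_prob(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(tetrachoric_dvc(0.42, 0.18, 0.28, 0.12))  # independent responses
```

The bisection relies on the quadrant volume being a monotonically increasing function of the correlation, which is why a simple bracketing search suffices here.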

*A* or *B*) are available, as when estimating DVCs between two subjects (or within the same subject on different occasions). However, for many (but not all) models, a decision variable is explicitly computed for each stimulus presentation. In this case, more reliable maximum-likelihood estimates can be obtained by directly using the values of the model's decision variable (see Appendix). We emphasize, nonetheless, that DVCs can be measured even for a model that does not produce an explicit decision variable (e.g., some neural-network models).
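One way the estimate based on the model's decision variable could look is sketched below. This is a simplified illustration rather than the Appendix procedure. It assumes the model's decision variable has been normalized to zero mean and unit variance within the category, so that, given a model value *z*, the subject's decision variable is normal with mean ρ*z* and variance 1 − ρ². The subject's criterion is taken from the marginal response rate, and ρ is found by a coarse grid followed by golden-section refinement. All function names and the simulated data are hypothetical.

```python
import math
import random
from statistics import NormalDist


def phi_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def log_likelihood(rho, model_dv, subject_said_b, criterion):
    """Log likelihood of the subject's responses given the model's values."""
    scale = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for z, said_b in zip(model_dv, subject_said_b):
        p_b = phi_cdf((rho * z - criterion) / scale)
        p_b = min(max(p_b, 1e-12), 1.0 - 1e-12)   # guard against log(0)
        total += math.log(p_b if said_b else 1.0 - p_b)
    return total


def estimate_dvc(model_dv, subject_said_b):
    """Maximum-likelihood rho for one category (criterion fixed from data)."""
    p_b = sum(subject_said_b) / len(subject_said_b)
    # Category-centered criterion; undefined if p_b is exactly 0 or 1.
    criterion = -NormalDist().inv_cdf(p_b)

    def f(r):
        return log_likelihood(r, model_dv, subject_said_b, criterion)

    # Coarse grid to bracket the maximum, then golden-section refinement.
    best = max((i / 20 for i in range(-19, 20)), key=f)
    lo, hi = max(best - 0.05, -0.99), min(best + 0.05, 0.99)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(30):
        x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if f(x1) < f(x2):
            lo = x1
        else:
            hi = x2
    return 0.5 * (lo + hi)


def simulate(n, rho, criterion, seed=0):
    """Simulated trials: model values and subject responses (True means B)."""
    rng = random.Random(seed)
    model_dv, said_b = [], []
    for _ in range(n):
        z_m = rng.gauss(0.0, 1.0)
        z_s = rho * z_m + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        model_dv.append(z_m)
        said_b.append(z_s > criterion)
    return model_dv, said_b


if __name__ == "__main__":
    dv, resp = simulate(2000, rho=0.6, criterion=0.2, seed=1)
    print(estimate_dvc(dv, resp))
```

Because every trial contributes the model's continuous value rather than only a binary response, this kind of estimate uses more of the available information than the 2 × 2 table alone.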

*b* is held fixed at 0.5 (yellow curves). The simplest correlation measure is the fraction of trials in which the subject and model are in agreement (blue curves). Perhaps the best-known and most common correlation measure for binary data is the phi correlation (also introduced by Pearson; Cramer, 1946, p. 282), which is equivalent to Matthews's (1975) correlation coefficient (orange curves). Another popular correlation measure is Cohen's (1960) kappa coefficient (gray curves). Finally, a common measure in the neurophysiology literature is choice probability (Britten et al., 1996; green curves). In all cases, the behavioral correlations vary substantially while the decision-variable correlation is held fixed. (We note that all the curves in Figure 3A are flipped about the vertical axis at 0 if the measures are computed for category *a* rather than category *b*.) In the extended SDT framework, the fundamental quantities are the discriminabilities, decision criteria, and DVCs. The overall accuracy levels and behavioral correlations depend in a rather complex way on all six of these fundamental quantities.
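For reference, three of the behavioral measures named above (fraction agreement, phi, and Cohen's kappa) can be computed directly from the four joint response proportions for one category. The sketch below uses the standard textbook definitions; the function name is hypothetical, and choice probability is omitted because it requires the distribution of the decision variable rather than a 2 × 2 table.

```python
import math


def behavioral_measures(p_aa, p_ab, p_ba, p_bb):
    """Fraction agreement, phi, and Cohen's kappa from joint proportions.

    First letter is the subject's response, second letter the model's.
    Phi and kappa are undefined when a marginal proportion is 0 or 1.
    """
    subject_b = p_ba + p_bb               # marginal: subject responds B
    model_b = p_ab + p_bb                 # marginal: model responds B
    agreement = p_aa + p_bb               # both give the same response

    phi = (p_aa * p_bb - p_ab * p_ba) / math.sqrt(
        subject_b * (1.0 - subject_b) * model_b * (1.0 - model_b)
    )

    # Agreement expected by chance from the marginals alone.
    chance = subject_b * model_b + (1.0 - subject_b) * (1.0 - model_b)
    kappa = (agreement - chance) / (1.0 - chance)
    return agreement, phi, kappa


if __name__ == "__main__":
    print(behavioral_measures(0.42, 0.18, 0.28, 0.12))
```

All three measures depend on the marginal response rates, and hence on the decision criteria, which is consistent with their variation in Figure 3A when the DVC is held fixed.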

*p*(*AB*|*b*), *p*(*BA*|*b*), *p*(*AB*|*a*), and *p*(*BA*|*a*). For

^{2}. The amplitude of the Gaussian target was defined to be the value at the peak of the target divided by twice the mean luminance (i.e., max amplitude = 1.0). The target and noise were presented for 250 ms in a blocked single-interval forced-choice task, with feedback. Each of the three psychometric functions was measured three times, for a total of 150 trials per target amplitude (750 trials per psychometric function). The two subjects saw exactly the same stimuli, so that we could estimate between-subjects DVCs. In addition, we replicated the entire experiment with exactly the same trials a couple of weeks later (a double-pass experiment), so that we could estimate within-subject DVCs.

*m*.

*d*′s, criteria, and DVCs) provide a strong set of constraints on models of detection in white noise. Of course, in some experimental designs there may not be sufficient trials to reliably estimate all six quantities for each condition. Even so, estimates can be averaged across conditions and can still be useful for testing hypotheses. For example, in the demonstration experiment, computing the average DVC (across the five target amplitudes) for each of the three background contrasts would have been sufficient to provide a strong test of the model in Figure 4B—the average correlations approximately equal the square root of the average efficiencies.
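The square-root relation between DVC and efficiency follows from a short argument under one common assumption, stated here for illustration only: the subject's decision variable $z_s$ equals the ideal observer's decision variable $z_I$ plus independent noise $n$. Efficiency $\eta$ is taken to be the squared ratio of the subject's to the ideal observer's discriminability, and $\Delta\mu$ denotes the difference between the category means of $z_I$. These symbols are introduced for this sketch and are not the paper's notation.

```latex
\begin{align}
z_s &= z_I + n, \qquad
  \operatorname{Var}(z_I) = \sigma_I^2,\quad
  \operatorname{Var}(n) = \sigma_n^2,\quad
  \operatorname{Cov}(z_I, n) = 0 \\
d'_I &= \frac{\Delta\mu}{\sigma_I}, \qquad
d'_s = \frac{\Delta\mu}{\sqrt{\sigma_I^2 + \sigma_n^2}} \\
\rho &= \frac{\operatorname{Cov}(z_s, z_I)}
            {\sqrt{\operatorname{Var}(z_s)\,\operatorname{Var}(z_I)}}
      = \frac{\sigma_I^2}{\sigma_I \sqrt{\sigma_I^2 + \sigma_n^2}}
      = \frac{\sigma_I}{\sqrt{\sigma_I^2 + \sigma_n^2}}
      = \frac{d'_s}{d'_I}
      = \sqrt{\eta},
\qquad \eta \equiv \left(\frac{d'_s}{d'_I}\right)^{2}
\end{align}
```

Under this assumption the added noise lowers the subject's discriminability and the correlation with the ideal observer by the same factor, so measured DVCs that depart from $\sqrt{\eta}$ would count against this class of models.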

*k* is the ratio of the two variances, with the sum of the variances equal to 2.0 (i.e., the average variance is 1.0). We set

*k* (e.g., *k* = 0.5 to 2). As the ratio of the variances deviates from 1.0 there are modest systematic deviations of the estimated discriminability and criterion from the actual values. Interestingly, there is no systematic error in the estimate of the DVC. Overall, the extended SDT framework appears to be fairly robust to violations of the equal-variance assumption.
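The size of the bias in the discriminability and criterion estimates can be illustrated with a small calculation. The sketch below is not the simulation reported here. It assumes that *k* is the variance for category *b* divided by the variance for category *a*, that the variances sum to 2.0, that the category means are at ±*d*′/2, and that the "actual" *d*′ is the mean separation in units of the average standard deviation. It then applies the equal-variance formulas to the resulting response proportions. It does not address the DVC estimate.

```python
import math
from statistics import NormalDist


def equal_variance_estimates(d_true, criterion, k):
    """Equal-variance d' and criterion estimates when variances differ.

    Assumption (for illustration): k = var_b / var_a with var_a + var_b = 2.
    """
    var_a = 2.0 / (1.0 + k)
    var_b = 2.0 * k / (1.0 + k)
    dist_a = NormalDist(-d_true / 2.0, math.sqrt(var_a))
    dist_b = NormalDist(d_true / 2.0, math.sqrt(var_b))

    # Response B occurs when the decision variable exceeds the criterion.
    p_hit = 1.0 - dist_b.cdf(criterion)
    p_fa = 1.0 - dist_a.cdf(criterion)

    z = NormalDist().inv_cdf
    d_est = z(p_hit) - z(p_fa)
    c_est = -0.5 * (z(p_hit) + z(p_fa))
    return d_est, c_est


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):
        print(k, equal_variance_estimates(1.0, 0.0, k))
```

With *k* = 1 the estimates reproduce the true values, and for *k* between 0.5 and 2 the deviations in this example remain small, in line with the modest deviations described above.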

*Perception*, 25 (Suppl. 1), 2, https://doi.org/10.1068/v96l0501.

*The Journal of the Acoustical Society of America*, 49, 1751–1756.

*Visual Neuroscience*, 13, 87–100.

*Journal of the Optical Society of America A*, 2, 1498–1507.

*Journal of the Optical Society of America A*, 11, 1237–1242.

*Journal of the Optical Society of America A*, 5, 617–627.

*Science*, 214, 93–94.

*Educational and Psychological Measurement*, 20 (1), 37–46, https://doi.org/10.1177/001316446002000104.

*Mathematical methods of statistics*. Princeton, NJ: Princeton University Press.

Effects of contrast gain control, background variations, and white noise. *Journal of the Optical Society of America A*, 14, 2406–2419.

*The Journal of Neuroscience*, 27, 1266–1270.

*Signal detection theory and psychophysics*. Huntington, NY: R. E. Krieger Publishing. (Original work published 1966 by John Wiley & Sons, Inc.).

*Nature Neuroscience*, 16, 235–242.

*Journal of the Optical Society of America A*, 16, 764–778.

*Biochimica et Biophysica Acta - Protein Structure*, 405 (2), 442–451, https://doi.org/10.1016/0005-2795(75)90109-9.

*bioRxiv*, 207357, https://doi.org/10.1101/207357.

*Journal of Vision*, 11 (5): 2, 1–25, https://doi.org/10.1167/11.5.2.

*Philosophical Transactions of the Royal Society of London, Series A*, 195, 1–47.

*Journal of the Optical Society of America A*, 16, 647–653.

*Trans. IRE PGIT*, 4, 171–212.

*Neuron*, 87, 411–423.

*Annual Review of Vision*.

*Journal of Vision*, 2 (1): 7, 105–120, https://doi.org/10.1167/2.1.7.

*The Journal of the Acoustical Society of America*, 70, 69–73.

*Kendall's advanced theory of statistics, Vol. 2*. New York, NY: Oxford University Press.

*The Journal of the Acoustical Society of America*, 31, 514–521.

*Journal of Vision*, 6 (4): 8, 387–413, https://doi.org/10.1167/6.4.8.

*Elementary signal detection theory*. Oxford, UK: Oxford University Press.

*b* are

*a* they are

*b* is

*a* it is

*b* and

*a*. The maximum-likelihood estimates of the two DVCs are obtained by finding the maxima of the log likelihoods given by Equations A9 and A10:

*d*′ values + 2 decision criteria + 2 DVCs).

*i*, and let

*i* (i.e., its value in the space of Figure 2). For category *a*, the log likelihood of all the subject's responses given the value of the model's decision variable is given by

*y* and

*x* gives Equation A17.

_{c} *b*. The main difference is that

*a*. We simulated and analyzed two kinds of observers: the ideal template-matching observer and suboptimal model subjects (see Figure 4).

*a* the two relevant distributions are

*b* they are

*a*. From Bayes's rule we have

*b*: