Research Article  |   March 2002
Classification image weights and internal noise level estimation
Journal of Vision March 2002, Vol.2, 8. doi:10.1167/2.1.8
      Albert J. Ahumada, Jr.; Classification image weights and internal noise level estimation. Journal of Vision 2002;2(1):8. doi: 10.1167/2.1.8.

      © 2016 Association for Research in Vision and Ophthalmology.

Abstract

For the linear discrimination of two stimuli in white Gaussian noise in the presence of internal noise, a method is described for estimating linear classification weights from the sum of noise images segregated by stimulus and response. The recommended method for combining the two response images for the same stimulus is to difference the average images. Weights are derived for combining images over stimuli and observers. Methods for estimating the level of internal noise are described with emphasis on the case of repeated presentations of the same noise sample. Simple tests for particular hypotheses about the weights are shown based on observer agreement with a noiseless version of the hypothesis.

Symbols in Order of Appearance
  • $m$: the number of image components
  • $s_s$, s = 0, 1: 1 by m signal vectors
  • $p_s$, s = 0, 1: probability of signal $s_s$
  • $n$: 1 by m noise vector with components n(i), i = 1, ..., m
  • $g$: 1 by m trial stimulus vector with components g(i), i = 1, ..., m
  • $E[\cdot]$: averaging or expectation operator
  • $\mathrm{Var}[\cdot]$: variance computing operator
  • $\sigma^2$: variance of n(i)
  • $w$: 1 by m classification vector with components w(i), i = 1, ..., m
  • $\beta$: bias of linear classifier
  • $R$: the observer’s response, 0 or 1
  • $T$: matrix transpose operator
  • $\|\cdot\|$: vector length, $\|w\| = (w w^T)^{1/2}$
  • $\Pr\{\}$: probability of enclosed event
  • $p_{s,R}$: probability of response R given signal $s_s$, $\Pr\{R \mid s_s\}$
  • $\Phi(\cdot)$: cumulative standard normal distribution function
  • $d_0'$: sensitivity of linear classifier
  • $\beta_0$: shifted bias of linear classifier, $\beta - w s_0^T$
  • $Z(\cdot)$: functional inverse of the cumulative standard normal distribution function, $\Phi^{-1}(\cdot)$
  • $w_I$: classification vector w of the ideal observer
  • $d_I'$: sensitivity of the ideal observer
  • $\rho^2$: the sampling efficiency of w, $\rho = w w_I^T$
  • $\beta_{0,H}$: random shifted bias of the human observer model
  • $\gamma^2$: variance of $\beta_{0,H}$
  • $\alpha^2$: proportion of external noise in the classification variable, $1/(1 + \gamma^2)$
  • $d_H'$: sensitivity of the human observer model
  • $\beta_H$: performance bias of the human observer model
  • $\varphi(\cdot)$: standard normal distribution density function
  • $n_{s,R}$: a random noise n conditional on signal $s_s$ and detection response R
  • $a_{s,R}$: the average of $N_{s,R}$ noises $n_{s,R}$
  • $v_{s,R}$: the expected value of $n_{s,R}$ when m = 1
  • $x, y, z$: standard normal variables
  • $U$: an orthonormal m by m transformation
  • $I$: the m by m identity transformation
  • $z_i$: standard normal variables
  • $N_{s,R}$: the number of presentations of stimulus s that led to response R
  • $N_s$: the number of presentations of stimulus s, $N_s = N_{s,0} + N_{s,1}$
  • $e$: the decision contribution from the external noise, replacing $w n^T$
  • $M_R$, R = 0, 1: the event that an internal-noise-free model made response R
  • $p_{M,s,0}$: the probability of event $M_0$ given that the signal was $s_s$
  • $\beta_{M,s}$: the signal-dependent, internal-noise-free model criterion, $\beta_0$ if $s_0$, or $\beta_0 - d_0'$ if $s_1$
1 Historical Introduction
In 1965, a frustrated graduate student in physiological psychology was looking for a thesis topic in the auditory research laboratory of E. C. Carterette and M. P. Friedman, the editors to be of the Handbook of Perception. They recommended that he tape record the stimulus of the traditional tone-in-noise yes-no detection experiment and analyze the sounds in the four different types of trials to determine whether correlates could be found in the stimuli relating to the observer responses. The noise masker was continuous wide-band noise, and marker tones were recorded on a second track to keep track of the signals presented. The tapes were digitized and analyzed, but the signal-to-noise level at threshold was so low that no trace of the signals could be found in the digitized records. To ensure earning a degree in the foreseeable future, the student made several changes in the experiment. To improve the signal-to-noise ratio on the tape, the noise bandwidth was narrowed, and the noise was turned on only during the short interval when the signal might be present. To reduce the effects of observer noise, the tape was repeatedly presented to the observer to get average ratings of signal presence. To minimize degrees of freedom in the stimulus measurement, the stimulus was reduced to the energy passed by a filter tuned to the signal tone frequency. This combination of changes allowed the student to find that on signal trials, very narrow filter outputs correlated best with observer ratings, whereas on noise trials, wider filter outputs correlated best, contradicting the prediction of single linear filter models for auditory tone detection (Ahumada, 1967). 
To gain better control of the masking noise and avoid the limitations of tape recording, Ahumada and Lovell (1971) used computer-generated tones and noises defined by their Fourier component amplitudes and reported linear regressions on the component energies with average observer ratings. These results were essentially auditory classification images that again demonstrated results contrary to simple linear filter theory: frequency components were weighted differently on signal trials from noise-only trials and negative weights were frequently observed. The results of both experiments seemed to be consistent with models with multiple linear channels that were being nonlinearly combined. Ahumada, Marken, and Sandusky (1975) extended the experiment to the combined time and frequency domains with similar results. 
Our first visual classification images (Ahumada, 1996) were done to see whether the method we had used in audition could be used to elucidate the features used by observers to accomplish a vernier acuity task. Figure 1 shows a raw classification image and the same image smoothed and quantized so only weights that are significantly different from zero are colored differently from the gray background. The ideal observer would have only weights on the right side, the side of the line that was either even with or one pixel higher than the left line. Spatial position uncertainty was presumably responsible for the observer needing to compare the two lines and for blurring the image more than optical blurring would predict. Theories that postulate that the response would be determined by the output of a single best-discriminating Gabor-like filter (Findlay, 1973; Foley, 1994) are not supported by the appearance, but were not tested statistically. Beard and Ahumada (1998) wanted to see whether observer performance was best characterized as orientation discrimination based on an oriented filter output or a local position measurement (Waugh, Levi, & Carney, 1993). The question was left unanswered; the linear classification functions obtained from the abutting stimuli were consistent with possible implementations of either theory. 
Figure 1
 
A raw classification image (top) and the same image smoothed and quantized (bottom), so only weights significantly different from zero are colored differently from the gray background. The black squares on the sides show the heights and positions of the fixed line (left) and the variable line offset (right). The dark lines on the top and bottom show the lengths and positions of the lines. The observer was A.J.A., who ran 1,600 trials (Ahumada, 1996).
The first visual classification images were linear combinations of four averaged noise images, one for each of the four stimulus-response categories. For a given stimulus, the average of all the added noises has zero mean, so the sum of the noises from one response class has an expectation equal to the negative of the expectation of the sum of the noises from the other response class. We therefore knew to combine the two response noise images with opposite sign. It appeared in the initial images that the error images were clearer than the correct response images, so we took the difference of the averages rather than the sums, realizing that this was an arbitrary decision. We also arbitrarily combined the images from the two stimuli with equal weight to get a single overall image. By symmetry, this must be the right weighting to use if the observer is making the same number of errors to equal numbers of each kind of stimulus, which was approximately the case. The next section analyzes a simplified theoretical situation, showing that the averaging is nearly optimal and giving expressions for good weighting functions for the cases of unequal stimulus presentation rates and unsymmetrical response biases. The beginning of the section introduces notation for a standard signal detection experiment as analyzed by Green and Swets (1966).
2 Template Estimation for Linear Classification of Two Signals in Additive White Gaussian Noise
2.1 The Signals and Noise
$s_0$ and $s_1$ are 1 by m signal vectors, presented with probabilities $p_0$ and $p_1 = 1 - p_0$ for N trials. On each trial, a random noise sample vector n is added to the signal, so the trial stimulus is
$$g = s_0 + n \quad \text{or} \quad g = s_1 + n.$$
(2.1.1)
n is a 1 by m vector of independent samples of identically distributed Gaussian variables n(i) with
$$E[n(i)] = 0$$
(2.1.2)
and
$$\mathrm{Var}[n(i)] = \sigma^2,$$
(2.1.3)
where E[·] is the averaging or expectation operator and Var[·] computes the variance. Without loss of generality, we can assume that the noise has been normalized by its standard deviation so that
$$\sigma = 1.$$
(2.1.4)
2.2 The Linear Observer Model
The linear observer classifying the noisy signals by responding R = 0 or R = 1 would use a vector w and respond R = 1 if and only if
$$w g^T > \beta,$$
(2.2.1)
where β is a response criterion and T indicates the matrix transpose operator, so that $w g^T = \sum_{i=1}^{m} w(i)\, g(i)$. Also, without loss of generality, we will assume that w has unit length (w and β have already been divided by the length of w) so that
$$\|w\| = (w w^T)^{1/2} = 1.$$
(2.2.2)
The performance of an observer is characterized by the error rates $p_{0,1} = \Pr\{R = 1 \mid s_0\}$, the probability of signal $s_0$ being followed by response R = 1, and $p_{1,0} = \Pr\{R = 0 \mid s_1\}$, the probability of signal $s_1$ being followed by response R = 0.
For the linear classifier with vector w and criterion β, $w n^T$ is Gaussian with mean zero and unit variance. Hence
$$p_{0,1} = 1 - \Phi(\beta - w s_0^T)$$
and
$$p_{1,0} = \Phi(\beta - w s_1^T),$$
(2.2.3)
where Φ(·) is the cumulative standard Gaussian distribution function.
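If we define sensitivity and bias parameters
$$d_0' = w (s_1 - s_0)^T$$
(2.2.4)
and
$$\beta_0 = \beta - w s_0^T,$$
(2.2.5)
then the error rates are
$$p_{0,1} = 1 - \Phi(\beta_0)$$
(2.2.6)
and
$$p_{1,0} = \Phi(\beta_0 - d_0').$$
(2.2.7)
These parameters can be found from the error rates as
$$d_0' = -Z(p_{0,1}) - Z(p_{1,0})$$
(2.2.8)
and
$$\beta_0 = -Z(p_{0,1}),$$
(2.2.9)
where $Z(\cdot) = \Phi^{-1}(\cdot)$ is the functional inverse of the cumulative standard normal distribution function Φ(·).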
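The inversion of the error-rate formulas at the end of this section can be sketched in MATLAB as follows. This is only an illustration: the two error rates are invented values, and Z and Φ are built from the base functions erfcinv and erfc rather than from any toolbox routine.

```matlab
% Standard normal CDF and its inverse from base MATLAB functions.
Phi = @(x) 0.5 * erfc(-x / sqrt(2));    % Phi(x)
Z   = @(p) -sqrt(2) * erfcinv(2 * p);   % Z(p) = Phi^{-1}(p)

p01 = 0.20;   % hypothetical false-alarm rate, Pr{R = 1 | s0}
p10 = 0.30;   % hypothetical miss rate,        Pr{R = 0 | s1}

beta0 = -Z(p01);          % shifted bias, from p01 = 1 - Phi(beta0)
d0    = beta0 - Z(p10);   % sensitivity,  from p10 = Phi(beta0 - d0)

% Round trip: the recovered parameters reproduce the error rates.
p01_check = 1 - Phi(beta0);
p10_check = Phi(beta0 - d0);
fprintf('d0 = %.4f, beta0 = %.4f\n', d0, beta0);
```

Equal error rates would give a criterion halfway between the two signal means, that is, beta0 = d0/2.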
2.3 The Ideal Observer
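The ideal observer classifying the noisy signals as R = 0 or R = 1 would use the linear classifier
$$w_I = \frac{s_1 - s_0}{\|s_1 - s_0\|}.$$
(2.3.1)
For the ideal observer,
$$d_I' = w_I (s_1 - s_0)^T = \|s_1 - s_0\|.$$
(2.3.2)
The efficiency of a non-ideal linear classifier is given by
$$\left(\frac{d_0'}{d_I'}\right)^2 = (w w_I^T)^2 = \rho^2,$$
(2.3.3)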
the square of the correlation between the actual and the ideal classifier coefficients, sometimes called the sampling efficiency. 
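As a small illustration of the sampling efficiency, the MATLAB sketch below computes the ideal template, both sensitivities, and their squared ratio. The signal vectors and the observer template are invented for the example; only the formulas come from the text.

```matlab
s0 = [0 0 0 0];                     % hypothetical signal 0 (m = 4)
s1 = [1 1 0 0];                     % hypothetical signal 1

dI = norm(s1 - s0);                 % ideal-observer sensitivity
wI = (s1 - s0) / dI;                % ideal classifier, unit length

w  = [1 0 0 0];                     % hypothetical observer template
w  = w / norm(w);                   % enforce unit length
d0 = w * (s1 - s0)';                % sensitivity of this template

rho        = w * wI';               % correlation with the ideal template
efficiency = (d0 / dI)^2;           % sampling efficiency, equals rho^2
fprintf('rho^2 = %.3f, (d0/dI)^2 = %.3f\n', rho^2, efficiency);
```

Here the template uses only one of the two informative pixels, so half of the available squared sensitivity is lost.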
2.4 A Noisy Human Observer Model
Human observers classify the same images different ways on different presentations. This is modeled here by assuming that the observer’s criterion $\beta_{0,H}$ (corresponding to $\beta_0$) is a normally distributed random variable with
$$E[\beta_{0,H}] = \beta_0$$
(2.4.1)
and
$$\mathrm{Var}[\beta_{0,H}] = \gamma^2,$$
(2.4.2)
independent of the noise n. It does not matter whether the variability is added to the criterion or the classification function value. Because the noiseless criterion $\beta_0$ was defined as the criterion for a variable with unit variance, the parameter $1 + \gamma^2$ can be interpreted as the total variance of the classification variable and
$$\alpha^2 = \frac{1}{1 + \gamma^2}$$
(2.4.3)
as the proportion of variance in the classification variable that arises from the external noise n. The error probabilities are now
$$p_{0,1} = 1 - \Phi(\alpha \beta_0)$$
(2.4.4)
and
$$p_{1,0} = \Phi(\alpha (\beta_0 - d_0')).$$
(2.4.5)
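If we define the observer’s sensitivity and bias as
$$d_H' = \alpha\, d_0'$$
(2.4.6)
and
$$\beta_H = \alpha\, \beta_0,$$
(2.4.7)
then we can compute these parameters from the human model observer error rates as
$$d_H' = -Z(p_{0,1}) - Z(p_{1,0})$$
(2.4.8)
and
$$\beta_H = -Z(p_{0,1}).$$
(2.4.9)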
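The efficiency of the human observer model is
$$\left(\frac{d_H'}{d_I'}\right)^2 = \alpha^2 \rho^2.$$
(2.4.10)
Because $\rho^2 \le 1$, a lower bound for α is given by
$$\alpha \ge \frac{d_H'}{d_I'},$$
(2.4.11)
and an upper bound for $\gamma^2$ is given by
$$\gamma^2 \le \left(\frac{d_I'}{d_H'}\right)^2 - 1.$$
(2.4.12)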
These bounds are reached when w is wI, and the inefficiency is only the result of the internal or criterion noise. 
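The bounds can be evaluated directly from measured error rates, as in this MATLAB sketch. The error rates and the ideal-observer sensitivity are hypothetical numbers chosen for illustration, and Z is again built from the base function erfcinv.

```matlab
Z = @(p) -sqrt(2) * erfcinv(2 * p);   % inverse standard normal CDF

p01 = 0.25;                % hypothetical observer false-alarm rate
p10 = 0.25;                % hypothetical observer miss rate
dI  = 2.5;                 % hypothetical ideal sensitivity, ||s1 - s0||

dH    = -Z(p01) - Z(p10);  % observer sensitivity from the error rates
betaH = -Z(p01);           % observer performance bias

efficiency = (dH / dI)^2;       % equals alpha^2 * rho^2
alphaMin   = dH / dI;           % lower bound on alpha (reached when w = wI)
gamma2Max  = (dI / dH)^2 - 1;   % upper bound on gamma^2 (same case)
fprintf('dH = %.3f, alpha >= %.3f, gamma^2 <= %.3f\n', dH, alphaMin, gamma2Max);
```

The two bounds are the same statement in different units, since alpha^2 = 1/(1 + gamma^2).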
2.5 The Classification Images
The classification image components are the four average noises as, R, the averages of the noises n for the trials segregated by signal ss and detection response R. We would like to find the mean and the variance of the pixels of as, R as a function of the parameters (s1, s0, w, β0, and γ or α). 
2.5.1 The single pixel case
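In the single pixel (m = 1) case, we seek the mean of a single Gaussian variable n that has been truncated by a random criterion $\beta_{0,H}$. Let $n_{s,R}$ be the truncated variable when $s_s$ was the stimulus and R was the response and
$$v_{s,R} = E[n_{s,R}].$$
(2.5.1.1)
Then, because ‖w‖ = 1, w = ±1. We can assume without loss of generality that $s_1$ is greater than $s_0$ and the sign of w is set to maximize correctness, so that w = 1. Hence, R = 0 if and only if
$$n < \beta_{0,H} - s\, d_0'.$$
(2.5.1.2)
So in the case that s = R = 0, $v_{0,0} = E[n \mid n < \beta_{0,H}]$, and for the other cases
$$v_{0,1} = E[n \mid n \ge \beta_{0,H}], \quad v_{1,0} = E[n \mid n < \beta_{0,H} - d_0'], \quad v_{1,1} = E[n \mid n \ge \beta_{0,H} - d_0'].$$
(2.5.1.3)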
2.5.2 Single pixel, no noise
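Consider now the single pixel case when there is no noise in the criterion ($\beta_{0,H} = \beta_0$).
$$v_{0,0} = E[n \mid n < \beta_0] = \frac{1}{\Phi(\beta_0)} \int_{-\infty}^{\beta_0} z\, \varphi(z)\, dz = -\frac{\varphi(\beta_0)}{\Phi(\beta_0)} = -\frac{\varphi(Z(p_{0,0}))}{p_{0,0}},$$
(2.5.2.1)
where φ(z) is the standard normal density function and the integration of $z \exp(-z^2/2)$ is enabled by the variable substitution $y = z^2/2$. Similarly,
$$v_{0,1} = E[n \mid n \ge \beta_0] = \frac{\varphi(\beta_0)}{1 - \Phi(\beta_0)} = \frac{\varphi(Z(p_{0,1}))}{p_{0,1}}.$$
(2.5.2.2)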
2.5.3 Single pixel, noisy criterion
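The Gaussian criterion case can be reduced to the fixed criterion case by a change of variables. Let z be the standard Gaussian used to form the criterion $\beta_{0,H}$, so that
$$\beta_{0,H} = \beta_0 + \gamma z.$$
(2.5.3.1)
Then
$$v_{0,0} = E[n \mid n < \beta_0 + \gamma z] = E[n \mid n - \gamma z < \beta_0].$$
(2.5.3.2)
If we let
$$x = \alpha (n - \gamma z)$$
(2.5.3.3)
and
$$y = \alpha (\gamma n + z),$$
(2.5.3.4)
the new variables x and y are independent (E[x y] = 0), standard (E[x] = E[y] = 0, Var[x] = Var[y] = 1) Gaussian variables. These variables have the properties that
$$n = \alpha (x + \gamma y)$$
(2.5.3.5)
and that $n < \beta_{0,H}$ if and only if
$$x < \alpha \beta_0.$$
(2.5.3.6)
So
$$v_{0,0} = \alpha E[x \mid x < \alpha \beta_0] + \alpha \gamma E[y] = -\alpha \frac{\varphi(\alpha \beta_0)}{\Phi(\alpha \beta_0)} = -\alpha \frac{\varphi(Z(p_{0,0}))}{p_{0,0}}.$$
(2.5.3.7)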
The effect of the criterion noise on v0, 0 is to reduce it by the factor α. 
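Similarly,
$$v_{0,1} = \alpha \frac{\varphi(Z(p_{0,1}))}{p_{0,1}},$$
(2.5.3.8)
because $p_{0,1} = 1 - p_{0,0}$ and
$$\varphi(Z(1 - p)) = \varphi(-Z(p)) = \varphi(Z(p)).$$
(2.5.3.9)
If false alarms are less frequent than correct rejections ($p_{0,1} < p_{0,0}$), then
$$|v_{0,1}| > |v_{0,0}|,$$
(2.5.3.10)
a larger absolute expected value on a false alarm than a correct rejection trial. The signal case is the same with the criterion changed to $\beta_{0,H} - d_0'$ so that
$$v_{1,0} = -\alpha \frac{\varphi(Z(p_{1,0}))}{p_{1,0}}$$
(2.5.3.11)
and
$$v_{1,1} = \alpha \frac{\varphi(Z(p_{1,1}))}{p_{1,1}}.$$
(2.5.3.12)
Again, if misses are less frequent than hits ($p_{1,0} < p_{1,1}$), then
$$|v_{1,0}| > |v_{1,1}|,$$
(2.5.3.13)
a larger absolute expected value on a miss than a hit trial. Regardless of the signal, the expected value depends only on the response proportion and the criterion variability.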
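The single-pixel result can be checked by simulation. The MATLAB sketch below draws noise-only trials with a noisy criterion and compares the average noise in each response class with α φ(Z(p))/p. The criterion mean, the criterion standard deviation, the trial count, and the seed are all arbitrary choices for illustration.

```matlab
rng(1);                                    % fixed seed for repeatability
Z   = @(p) -sqrt(2) * erfcinv(2 * p);      % inverse standard normal CDF
phi = @(x) exp(-0.5 * x.^2) / sqrt(2 * pi);% standard normal density

beta0  = 0.7;                  % hypothetical mean criterion
gamma  = 1.0;                  % hypothetical criterion s.d.
alpha  = 1 / sqrt(1 + gamma^2);
nTrial = 1e6;

n     = randn(nTrial, 1);                    % external noise, one pixel
betaH = beta0 + gamma * randn(nTrial, 1);    % random criterion per trial
R     = n > betaH;                           % response on s0 trials

p00 = mean(~R);            % correct-rejection proportion
p01 = mean(R);             % false-alarm proportion
v00_sim = mean(n(~R));     % average noise on correct rejections
v01_sim = mean(n(R));      % average noise on false alarms
v00_thy = -alpha * phi(Z(p00)) / p00;
v01_thy =  alpha * phi(Z(p01)) / p01;
fprintf('v00: %.4f vs %.4f, v01: %.4f vs %.4f\n', ...
        v00_sim, v00_thy, v01_sim, v01_thy);
```

With these settings false alarms are the rarer response, so the false-alarm average should have the larger absolute value.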
2.5.4 The multiple pixel case
Another independent variable transformation allows the single pixel case result to solve the multiple pixel case. Let us first examine the means and variances of the pixels of $n_{0,0}$. For any vector w of unit length, it is possible to construct an orthonormal transformation U whose first row is w, that is
$$U(1, i) = w(i), \quad i = 1, \ldots, m,$$
(2.5.4.1)
such that
$$U U^T = I,$$
(2.5.4.2)
where I is the identity transformation (the transpose of U is its inverse).
When this transformation is applied to $n^T$ we get a new noise $U n^T$ whose distribution is the same as that of $n^T$, but whose first pixel is $w n^T$. On an $s_0$ trial, a noise vector n will be classified as $n_{0,0}$ if and only if the first pixel of $U n^T$,
$$z_1 = w n^T < \beta_{0,H}.$$
(2.5.4.3)
The rest of the pixels ($z_2, \ldots, z_m$) of $U n^T$ are independent standard Gaussian variables (with mean zero and variance 1), so
$$E[n_{0,0}^T] = U^T E[(z_1, z_2, \ldots, z_m)^T \mid z_1 < \beta_{0,H}] = U^T (v_{0,0}, 0, \ldots, 0)^T = v_{0,0}\, w^T.$$
(2.5.4.4)
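A similar argument for the other cases leads to the general result that
$$E[n_{s,R}] = v_{s,R}\, w.$$
(2.5.4.5)
The mean of a classified noise is proportional to the classifying vector w. The variance of individual elements of $n_{s,R}$ is
$$\mathrm{Var}[n_{s,R}(i)] = w(i)^2\, \mathrm{Var}[z_1 \mid s_s, R] + (1 - w(i)^2),$$
(2.5.4.6)
which is bounded by $\mathrm{Var}[z_1 \mid s_s, R]$ and one. Truncation of a Gaussian can only decrease the variance, so
$$\mathrm{Var}[z_1 \mid s_s, R] \le \mathrm{Var}[n_{s,R}(i)] \le 1.$$
(2.5.4.7)
Because ‖w‖ = 1, if there are very many significant weights in w, they will have to be small, so that
$$\mathrm{Var}[n_{s,R}(i)] \approx 1.$$
(2.5.4.8)
Let $a_{s,R}$ be the average value of a number $N_{s,R}$ of $n_{s,R}$. Any combination of the form
$$k_{0,1}\, a_{0,1} - k_{0,0}\, a_{0,0} + k_{1,1}\, a_{1,1} - k_{1,0}\, a_{1,0}$$
(2.5.4.9)
with positive weights $k_{s,R}$ will be an estimate of w times a positive constant.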
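A simulation makes the proportionality concrete. The MATLAB sketch below generates noise-only trials for a linear observer with a noisy criterion, averages the noise fields by response, and checks that the difference of the averages points along w. The template, criterion parameters, image size, and trial count are hypothetical values chosen for illustration.

```matlab
rng(2);                                   % fixed seed for repeatability
m      = 16;                              % hypothetical number of pixels
nTrial = 2e5;                             % hypothetical number of s0 trials
w = randn(1, m);
w = w / norm(w);                          % hypothetical unit-length template
beta0 = 0.5;                              % hypothetical mean criterion
gamma = 1.0;                              % hypothetical criterion s.d.

n     = randn(nTrial, m);                 % white Gaussian noise, one row per trial
betaH = beta0 + gamma * randn(nTrial, 1); % noisy criterion per trial
R     = (n * w') > betaH;                 % respond 1 iff w n^T exceeds the criterion

a00 = mean(n(~R, :), 1);                  % average noise, correct rejections
a01 = mean(n( R, :), 1);                  % average noise, false alarms

est = a01 - a00;                          % opposite-sign combination
r   = (est * w') / norm(est);             % cosine of angle between est and w
fprintf('cosine between estimate and template: %.4f\n', r);
```

Only noise-only trials are simulated here; signal trials behave the same way with the criterion shifted by the sensitivity.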
2.5.5 Combining the categorized noises
If we have two independent estimates b and c of the same quantity (having the same expected value, E[b] = E[c]) with variances $\sigma_b^2$ and $\sigma_c^2$, the linear combination of the two estimates with the same expected value and the smallest variance is
$$\frac{b/\sigma_b^2 + c/\sigma_c^2}{1/\sigma_b^2 + 1/\sigma_c^2}.$$
(2.5.5.1)
That is, each estimate should be weighted inversely by its variance. 
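As a worked example of inverse-variance weighting, assume (purely for illustration) that the two estimates have variances $\sigma_b^2 = 1$ and $\sigma_c^2 = 4$:

```latex
\frac{b/1 + c/4}{1/1 + 1/4} = 0.8\,b + 0.2\,c, \qquad
\mathrm{Var}[0.8\,b + 0.2\,c] = 0.64 \cdot 1 + 0.04 \cdot 4 = 0.8 .
```

The combined variance, 0.8, is smaller than that of the better estimate alone (1) and smaller than that of the equal-weight average, (1 + 4)/4 = 1.25.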
To obtain a minimum variance estimate of w(i) from a sample with $N_{s,R}$ approximately independent samples of each type, the individual estimates of w(i), $n_{s,R}(i)/v_{s,R}$, should be weighted inversely by their variances, which are approximately
$$\mathrm{Var}[n_{s,R}(i) / v_{s,R}] \approx 1 / v_{s,R}^2,$$
(2.5.5.2)
so we should weight each $n_{s,R}(i)$ by $v_{s,R}$. Because the weights do not depend on i, we can then just weight $n_{s,R}$ by $v_{s,R}$. A good un-normalized estimate of the classifier w is thus given by the $v_{s,R}$ weighted sums $N_{s,R}\, a_{s,R}$,
$$\sum_{s=0}^{1} \sum_{R=0}^{1} v_{s,R}\, N_{s,R}\, a_{s,R}.$$
(2.5.5.3)
If we replace $p_{s,R}$ in $v_{s,R}$ of Equations (2.5.3.7, 8, 11, 12) by $p_{s,R} = N_{s,R}/N_s$, where $N_s = N_{s,0} + N_{s,1}$, and take advantage of the fact that
$$\varphi(Z(p_{s,0})) = \varphi(Z(p_{s,1})),$$
(2.5.5.4)
we obtain
$$\alpha \sum_{s=0}^{1} N_s\, \varphi(Z(p_{s,1}))\, (a_{s,1} - a_{s,0}).$$
(2.5.5.5)
The more frequent stimulus should be given more weight and, because for p < .5, φ(Z(p)) increases monotonically, the more error-prone stimulus should be given more weight. These weights take into account all the parameters assumed to determine the observer’s performance. If both stimuli are equally frequent and the error rates are equal, the formula is proportional to
$$(a_{0,1} - a_{0,0}) + (a_{1,1} - a_{1,0}),$$
(2.5.5.6)
the combination rule originally used by Ahumada and Beard (Ahumada, 1996; Ahumada & Beard, 1998, 1999; Beard & Ahumada, 1997, 1998, 2000). 
In what follows, we add a third subscript, O, to refer to the observer. The good weighting scheme for combining average classification images $a_{s,R,O}$ over responses, stimuli, and observers can be described as a sequential process. To combine over responses, just take the difference,
$$a_{s,O} = a_{s,1,O} - a_{s,0,O}.$$
(2.5.5.7)
To then combine images for different stimuli, weight by factors involving the relative frequencies of the stimuli and the extremeness of the error proportions for the stimuli,
$$a_O = \sum_{s=0}^{1} k_{s,O}\, a_{s,O},$$
(2.5.5.8)
where
$$k_{s,O} = N_{s,O}\, \varphi(Z(p_{s,1,O})).$$
(2.5.5.9)
If estimates are to be combined over M observers, they need to be weighted by $\alpha_O$, the square root of the observer’s proportion of decision variance due to the external noise, $\alpha^2 = 1/(1 + \gamma^2)$, and the number of trials run by the observer (which is already included here in the $N_{s,O}$),
$$\sum_{O=1}^{M} \alpha_O\, a_O.$$
(2.5.5.10)
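For a single observer, the response and stimulus steps of this combining scheme can be sketched in MATLAB as below. The trial counts are invented, the average noise images are placeholders standing in for real data, and the variable names are chosen here for illustration rather than taken from the paper.

```matlab
Z   = @(p) -sqrt(2) * erfcinv(2 * p);       % inverse standard normal CDF
phi = @(x) exp(-0.5 * x.^2) / sqrt(2 * pi); % standard normal density

% Hypothetical counts: row s+1 is stimulus s_s, column R+1 is response R.
N = [300 100;     % s0: correct rejections, false alarms
     120 480];    % s1: misses, hits
Ns  = sum(N, 2);               % trials per stimulus
ps1 = N(:, 2) ./ Ns;           % estimated Pr{R = 1 | s_s}
k   = Ns .* phi(Z(ps1));       % stimulus weights N_s * phi(Z(p_{s,1}))

% Placeholder average noise images (1 by m); measured averages go here.
m = 4;
a = {zeros(1, m),  ones(1, m);     % a{1,1} = a_{0,0}, a{1,2} = a_{0,1}
     -ones(1, m), zeros(1, m)};    % a{2,1} = a_{1,0}, a{2,2} = a_{1,1}

% Difference over responses, then weighted sum over stimuli.
wEst = k(1) * (a{1, 2} - a{1, 1}) + k(2) * (a{2, 2} - a{2, 1});
wEst = wEst / norm(wEst);          % normalize to unit length
disp(k');
```

Combining over observers would add one more weighted sum, with each observer's image multiplied by that observer's α.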
 
3 Measuring the Internal Noise
3.1 Response Agreement With the Same External Noise Sample
Estimates of the internal noise (α or γ) are needed in order to use the above formula (2.5.5.10) to combine estimates over observers. Internal noise estimates are also needed to compute the variance of an estimate of w to plan the number of trials that need to be run (see Equation 2.5.5.2). For this section, we are using the model of section 2.4, relaxing the linearity assumptions about the classification function and the external noise to the assumption that the term e = w nT is a standard Gaussian. 
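The subscripts i and j are added to indicate two separate trials. The response agreement probability for a particular signal and response with the same noise is denoted by
$$p_{s,R,R} = \Pr\{R_i = R_j = R \mid s_s, n_i = n_j\}.$$
(3.1.1)
To obtain this probability, we will compute it conditional on the value of e and then average over the possible values of e.
Conditional on the value of e, for s = 0, the probability of a response R = 0 is given by
$$\Pr\{R = 0 \mid e, s_0\} = \Pr\{e < \beta_0 + \gamma z\} = \Phi\!\left(\frac{\beta_0 - e}{\gamma}\right).$$
(3.1.2)
For another response to the same signal and the same noise, the criterion variability is independent, so the probability of two R = 0 responses is given by
$$\Pr\{R_i = R_j = 0 \mid e, s_0\} = \Phi\!\left(\frac{\beta_0 - e}{\gamma}\right)^2.$$
(3.1.3)
The probability of two correct responses $R_i = R_j = 0$ to the same noise is then
$$p_{0,0,0} = \int_{-\infty}^{\infty} \Phi\!\left(\frac{\beta_0 - e}{\gamma}\right)^2 \varphi(e)\, de,$$
(3.1.4)
and the probability of two incorrect responses $R_i = R_j = 1$ to the same noise is
$$p_{0,1,1} = \int_{-\infty}^{\infty} \left[1 - \Phi\!\left(\frac{\beta_0 - e}{\gamma}\right)\right]^2 \varphi(e)\, de.$$
(3.1.5)
The equations for arbitrary s can be written
$$p_{s,0,0} = \int_{-\infty}^{\infty} \Phi\!\left(\frac{\beta_0 - s\, d_0' - e}{\gamma}\right)^2 \varphi(e)\, de$$
(3.1.6)
and
$$p_{s,1,1} = \int_{-\infty}^{\infty} \left[1 - \Phi\!\left(\frac{\beta_0 - s\, d_0' - e}{\gamma}\right)\right]^2 \varphi(e)\, de.$$
(3.1.7)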
Either of these equations can be used to solve for an estimate of γ (the same estimate results by using either one). 
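One way to carry out this solution numerically is sketched below in MATLAB, using the base functions integral and fzero. It is an illustration under stated assumptions, not the procedure of the Appendix: the "observed" agreement is generated from a hypothetical γ, the performance bias βH is treated as known from the false-alarm rate, and the search bracket is an arbitrary choice.

```matlab
Phi = @(x) 0.5 * erfc(-x / sqrt(2));        % standard normal CDF
phi = @(x) exp(-0.5 * x.^2) / sqrt(2 * pi); % standard normal density

% Probability of two R = 0 responses to the same noise on s0 trials,
% for mean criterion beta0 and criterion s.d. gamma.
pBoth0 = @(beta0, gamma) integral( ...
    @(e) Phi((beta0 - e) ./ gamma).^2 .* phi(e), -Inf, Inf);

gammaTrue = 1.2;                         % hypothetical internal noise level
betaH     = 0.7;                         % hypothetical performance bias
beta0     = betaH * sqrt(1 + gammaTrue^2);   % since betaH = alpha * beta0
target    = pBoth0(beta0, gammaTrue);    % stands in for a measured agreement

% betaH is fixed by the false-alarm rate, so beta0 changes with gamma.
f = @(g) pBoth0(betaH * sqrt(1 + g^2), g) - target;
gammaHat = fzero(f, [0.2 10]);           % bracket chosen for illustration
fprintf('recovered gamma = %.4f\n', gammaHat);
```

Agreement falls from Φ(βH) toward Φ(βH)² as γ grows, which is why a single root exists inside the bracket.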
3.2 Estimating Observer Noise Using a Model Classifier
Ahumada and Beard (1998) also derived other estimates for γ based on the assumption that e comes from a known model (and is distributed as a Gaussian) and that Gaussian internal noise is added by the observer. This allows falsification of the model if the estimates of γ are not consistent. The model they tested was a particular parametric linear filter estimated by Barth, Beard, and Ahumada (1999), but any model can be tested using their scheme if the performance of the noiseless model can be computed for the same noises presented to the observer. The noiseless model’s performance index, d0′, is needed as is the trial by trial agreement of the observer and the model. 
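One estimate of γ is based on the ratio of the performance of the model, $d_0'$, to that of the observer, $d_H'$. From (2.4.6) above we have $d_H' = \alpha\, d_0'$, so
$$\alpha = \frac{d_H'}{d_0'}$$
(3.2.1)
and
$$\gamma = \left[\left(\frac{d_0'}{d_H'}\right)^2 - 1\right]^{1/2}.$$
(3.2.2)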
 
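Another γ estimate comes from the trial by trial agreement between the observer and the model. For a given value of e, the event $M_0$ that the model will respond that the signal was $s_0$ when it actually was $s_s$ will occur if e is less than the criterion for the model,
$$\beta_{M,s} = \beta_0 - s\, d_0',$$
(3.2.3)
which is $\beta_0$ if s = 0 or $\beta_0 - d_0'$ if s = 1. The probability that the observer and the model agree on this response will be
$$\Pr\{R = 0, M_0 \mid s_s\} = \int_{-\infty}^{\beta_{M,s}} \Phi\!\left(\frac{\beta_{M,s} - e}{\gamma}\right) \varphi(e)\, de.$$
(3.2.4)
Note that the other possibilities can be computed from this and simpler ones from the relationships
$$\Pr\{R = 0, M_1 \mid s_s\} = p_{s,0} - \Pr\{R = 0, M_0 \mid s_s\},$$
(3.2.5)
$$\Pr\{R = 1, M_0 \mid s_s\} = p_{M,s,0} - \Pr\{R = 0, M_0 \mid s_s\},$$
(3.2.6)
and
$$\Pr\{R = 1, M_1 \mid s_s\} = 1 - p_{s,0} - p_{M,s,0} + \Pr\{R = 0, M_0 \mid s_s\}.$$
(3.2.7)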
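The observer-model agreement probabilities can be evaluated numerically, as in this MATLAB sketch. The criterion, sensitivity, and noise values are hypothetical, and the remaining three joint probabilities are obtained from the two marginal response probabilities, which is an arithmetic identity rather than anything specific to this model.

```matlab
Phi = @(x) 0.5 * erfc(-x / sqrt(2));        % standard normal CDF
phi = @(x) exp(-0.5 * x.^2) / sqrt(2 * pi); % standard normal density

beta0 = 0.7;   d0 = 1.5;   gamma = 1.0;     % hypothetical parameters
s     = 1;                                  % signal trials
betaM = beta0 - s * d0;                     % noiseless model criterion
alpha = 1 / sqrt(1 + gamma^2);

% Joint probability that observer and noiseless model both respond 0.
pR0M0 = integral(@(e) Phi((betaM - e) ./ gamma) .* phi(e), -Inf, betaM);

pM0 = Phi(betaM);            % model responds 0
pR0 = Phi(alpha * betaM);    % observer responds 0

pR0M1  = pR0 - pR0M0;        % observer 0, model 1
pR1M0  = pM0 - pR0M0;        % observer 1, model 0
pR1M1  = 1 - pR0 - pM0 + pR0M0;
pAgree = pR0M0 + pR1M1;      % overall trial-by-trial agreement
fprintf('agreement = %.4f\n', pAgree);
```

Matching pAgree to a measured agreement proportion, for example with fzero over gamma, would give the agreement-based noise estimate.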
 
The question arises as to which alternative models could or could not be rejected by this test. One reader hypothesized that any template with the same sampling efficiency (relative to the ideal observer) could not be rejected. Within the current framework, this is true for tests comparing the noise estimate based on observer self-agreement of Section 3.1 with the estimate based on the ratio of observer performance to model performance (3.2.1). However, responses from any other template of the same efficiency would correlate worse with the observer responses than the correct template, leading to a larger estimate of internal noise. Thus only the correct template can lead to agreement among all three noise estimates.
3.3 Example of Model Testing Based on Internal Noise Estimates
This section illustrates model testing based on noise estimates using the data of observer C.S. from Ahumada and Beard (1999). The observer was detecting the presence or absence of a Gabor signal (2 cpd or 16 cpd) in noise. Figure 2 shows that the classification images for 2 cpd signal present and signal absent and for 16 cpd signal present resembled the signal, but no image emerged for the 16 cpd signal absent, a result consistent with position or phase uncertainty. 
Figure 2
 
Top. Gabor signals with spatial frequencies of 2 cpd (left) and 16 cpd (right). Center. Classification images for the signal trials with the 2-cpd target (left) and the 16-cpd target trials (right). Bottom. Classification images for the no-signal trials with the 2-cpd target (left) and the 16-cpd target (right). The observer was C.S., who ran 5,900 trials at 2 cpd and 8,000 trials at 16 cpd (Ahumada & Beard, 1999).
Figure 3 shows estimates of the proportion of external noise α² for this observer. Within each block of 100 trials, each stimulus was repeated twice. The circles show estimates based on the agreement of these two responses to the same stimulus. The data and calculation details are shown in the Appendix. Error bars for the self-agreement estimates were computed by generating 95% confidence limits for the proportion of self-agreement and then computing the corresponding noise proportion. For both 2 and 16 cpd, the agreement estimate of α² for signal trials is greater than that for no-signal trials. The confidence intervals overlap slightly, but the difference is significant because the data are independent. This difference rejects the linear model for both spatial frequencies, the result being consistent with position uncertainty combined with internal noise reducing response agreement for no-signal conditions relative to signal conditions at both spatial frequencies. Using the noise analysis rather than the actual agreement proportions provides compensation for different criterion positions in the two cases.
Figure 3
 
Estimated proportions of variance due to external noise, α2, from observer self-agreement (circles) and from relative detection performance (crosses) based on the data of observer C.S. from Ahumada and Beard (1999).
The crosses in Figure 3 show predictions of α² from a comparison of detection performance with that of the linear observer using the signal as the template, where the proportion of observer variance from external noise α² is then given by the square of the ratio of the observer $d_H'$ to the model $d_0'$,
$$\alpha^2 = \left(\frac{d_H'}{d_0'}\right)^2.$$
(3.3.1)
For both spatial frequencies, the performance estimate of α2 is well below the observer self-agreement estimates, rejecting the hypothesis that the observer is using the signal as a linear template, but is noisy. 
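For the 2 cpd data, the numbers listed in the Appendix (hit rate 0.7583, false alarm rate 0.2420, and stimulus signal-to-noise ratio 3.2006) give the following worked calculation of the performance-based estimate:

```latex
d_H' = Z(0.7583) - Z(0.2420) = 1.4006, \qquad
\alpha^2 = \left(\frac{d_H'}{d_0'}\right)^2 = \left(\frac{1.4006}{3.2006}\right)^2 \approx 0.1915 .
```

The Appendix lists the corresponding self-agreement estimates as 0.3904 for no-signal trials (limits 0.2931 to 0.4819) and 0.5564 for signal trials (limits 0.4755 to 0.6313), so 0.1915 lies below both intervals.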
4 Discussion and Conclusions
4.1 Classification Function Estimation
The combining rules derived in the first part of this work agree with those derived by Murray, Bennett, and Sekuler (2002) for the same case of linear discrimination of two stimuli in white Gaussian noise in the presence of internal noise. Their derivation assumes that both the external and internal noise are white, but because the internal noise participates only after its linear combination with the observer template, its covariance matrix does not matter. Abbey, Eckstein, and Bochud (1999) and Abbey and Eckstein (2000; 2001a; 2001b; 2002) consider the 2-alternative forced-choice (AFC) situation with no alternative bias, a special case of the above analysis. They also derive formulas for the case of nonwhite external noise, which theoretically reverts to the white noise analysis after a prewhitening filter is applied. Solomon (2002) also has a derivation of the expected value of a classification image based on the same transformation argument presented above. 
4.2 Internal Noise Estimation
In the auditory studies (Ahumada, 1967; Ahumada & Lovell, 1971; Ahumada, Marken, & Sandusky, 1975), multiple observations of the same stimuli provided estimates of internal noise. The observer ratings were regarded as approximately continuous variables, and standard parametric analyses provided the estimates. Burgess and Colborne (1988) solved for both the probability of observer response agreement on two repetitions of the same 2AFC stimulus and the probability of a correct response as a function of detectability (d0′) and internal noise (γ) for the unbiased observer (their Equations 4–6), obtaining results similar to those in Section 3.1. To get estimates of γ (their k), they plotted these two variables as a function of d0′ with γ as a parameter and found the γ curve that the data points fell on (their Figure 2). Richards and Zhu (1994) solved for the response correlation on two repetitions of the same trial using the model of the above Section 2 (their Theorem 3). They report this correlation as a function of both external and internal noise for the unbiased case (their Table II), pointing out that it is just a function of γ. Their results are also similar to those in Section 3.1, but they look different because the random variables are integrated in reverse order and the squared terms are recast as the variance of a dichotomous variable. Ahumada and Beard (1998) presented a method for estimating γ for the same general situation as Richards and Zhu (1994) using the same agreement measure as Burgess and Colborne (1988). Because the overall agreement measure and the response correlation measure are functions of one of the agreement measures of Equations 3.1.6 or 3.1.7 given the hit or false alarm rates, it suffices to use one of these simpler measures to solve for γ. 
4.3 Design Considerations
Beard and Ahumada (1998) arbitrarily tried to adjust the detection parameters so that the percent correct would be near 75% and wanted their observers to have roughly the same number of errors for both stimuli, which were presented with equal probability. The 75% value was considered to be a compromise between trying to get as many errors as possible for the image and keeping the task easy enough so that the observer could maintain a stable template (Beard & Ahumada, 1999). 
The relation of the weights to the error rates shows that more errors are desirable if the internal noise is held constant (2.5.5.5). Beard and Ahumada increased the difficulty in two ways, decreasing stimulus duration and increasing the external noise level. Within the present model, lowering performance without increasing external noise can only be the result of a less efficient template or more internal noise, so decreasing stimulus duration was not a good idea for improving the quality of the template estimate. Fortunately, it had little effect, and the trials went by faster. Increasing the level of external noise when the observers were performing very well was successful, and did not degrade observer efficiency as compared with the ideal linear observer, but increased noise when the observer was already at 75% led to decreased observer efficiency and probably did not improve the quality of the classification images.
Acknowledgments
I am grateful to Bettina L. Beard and Andrew B. Watson for their support. The paper was greatly improved by thoughtful anonymous reviewers. This research was supported by NASA RTOP 548-51-12. Commercial relationships: None. 
Appendix
The following is a MatLab program for calculating the example illustrated in Figure 3
% data from Ahumada and Beard (1999) web page 
% calculations shown only for 2 cpd data 
% S0 S1 
% R0 R1 R2 R0 R1 R2 
pCS02 = [911 417 149 180 352 941]; 
pCS16 = [1290 552 115 370 657 1016]; 
pCS = pCS02; 
pfa =(pCS(3)+0.5*pCS(2))/... (pCS(3)+pCS(2)+pCS2(1)); 0.2420; pht = (pCS(6)+0.5*pCS(5))/... (pCS(6)+pCS(5)+pCS(4)); 0.7583; 
dH = znorm(pht)−znorm(pfa);1.4006; 
d0 = 3.2006;% stimulus snr 
se2d=(dH/d0)^2; 0.1915; 
c = −znorm(pfa); 0.6997; 
nPairs(1)=(pCS(3)+pCS(2)+pCS(1)); 1477 
nPairs(2)=(pCS(6)+pCS(5)+pCS(4)); 1473 
sameProp(1)=(pCS(1)+pCS(3))/nPairs(1); 0.7177 
sameProp(2)=(pCS(4)+pCS(6))/nPairs(2); 0.7610 
for s = 1:2 
p=sameProp(s); 
n=nPairs(s); 
confIntProp=1.96*sqrt((p-p*p)/n); 
confInterval(s,:)=[p, p-confIntProp, p+confIntProp]; 
end; % for s 
% confInterval = [0.7177 0.6947 0.7406; 0.7610 0.7393 0.7828] 
% solve for internal noise from psame, c, m 
n = 25;% integration approx. parameter 
range = 4.0;% ditto 
f=inline('(psame-probSame(si,c,m,n,range))^2',...
'si','psame','c','m','n','range'); 
x1=0.001;% zero won't work 
x2=1; 
ms= [0.0 dH];% no signal, signal 
for s=1:2 
m = ms(s); 
for i=1:3 
psame=confInterval(s,i); 
sfinal(s,i)=fminbnd(f,x1,x2,...
[],psame,c,m,n,range); 
end;% for i 
end;% for s 
se202 = 1-sfinal.*sfinal; 
% se202 = [0.3904 0.2931 0.4819; 0.5564 0.4755 0.6313] 
se216 = ... 
[0.2942 0.1938 0.3880; 0.4462 0.3811 0.5087]; 
se2d02 = se2d; % 2 cpd estimate from detection performance 
% se2d16 is the corresponding 16 cpd value (calculation not shown) 
plot(1,se202(1,1),'ko',[1 1],se202(1,2:3),'k.-',... 
3,se202(2,1),'ko',[3 3],se202(2,2:3),'k.-',... 
2,se2d02,'kx',... 
5,se216(1,1),'ko',[5 5],se216(1,2:3),'k.-',... 
7,se216(2,1),'ko',[7 7],se216(2,2:3),'k.-',... 
6,se2d16,'kx'); 
axis([0 8 0 0.7]); 
%%%% probSame.m 
function prob=probSame(si,c,m,n,range) 
% Prob(same) from si, c, m 
se=sqrt(1.-si*si); 
a=(m-c)/si; 
b=se/si; 
s=[-range:1/n:range]; 
% normalcdf is not listed; it is taken to be the standard normal CDF 
p=normalcdf(a+b*s); 
p=(p.*(1.-p)).*exp(-0.5.*s.*s); 
prob=1.0-2.0*sum(p)/(n*sqrt(2*pi)); 
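For readers without MATLAB, the 2 cpd calculation above can be cross-checked with the following Python sketch, which uses only the standard library. It is an illustration under stated assumptions, not the program used for Figure 3: the names prob_same and solve_si are hypothetical, the unlisted helpers znorm and normalcdf are assumed to be the inverse and forward standard normal distribution functions, and bisection is substituted for fminbnd on the assumption that the probability of agreement decreases as the internal noise grows.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal; stands in for znorm / normalcdf (assumed)

def prob_same(si, c, m, n=25, rng=4.0):
    """Probability of the same response on two passes of one noise sample.

    si is the internal-noise s.d. when internal + external variance is 1;
    c is the criterion and m the mean of the decision variable.
    """
    se = sqrt(1.0 - si * si)          # external-noise s.d.
    a = (m - c) / si
    b = se / si
    total = 0.0
    for k in range(int(round(2 * rng * n)) + 1):
        s = -rng + k / n              # grid over the external-noise variable
        p = STD.cdf(a + b * s)        # P(response 1 | s) on a single pass
        total += p * (1.0 - p) * exp(-0.5 * s * s)
    return 1.0 - 2.0 * total / (n * sqrt(2.0 * pi))

def solve_si(psame, c, m, lo=0.001, hi=1.0, tol=1e-7):
    """Bisection for si such that prob_same(si, c, m) is close to psame."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_same(mid, c, m) > psame:
            lo = mid                  # too much agreement: more internal noise
        else:
            hi = mid
    return 0.5 * (lo + hi)

# 2 cpd counts in the appendix order: S0 (R0 R1 R2), then S1 (R0 R1 R2)
counts = [911, 417, 149, 180, 352, 941]
n_pairs = [sum(counts[:3]), sum(counts[3:])]
pfa = (counts[2] + 0.5 * counts[1]) / n_pairs[0]
pht = (counts[5] + 0.5 * counts[4]) / n_pairs[1]
d_h = STD.inv_cdf(pht) - STD.inv_cdf(pfa)
c = -STD.inv_cdf(pfa)
same_prop = [(counts[0] + counts[2]) / n_pairs[0],
             (counts[3] + counts[5]) / n_pairs[1]]
means = [0.0, d_h]                    # no signal, signal
ext_var = [1.0 - solve_si(p, c, m) ** 2 for p, m in zip(same_prop, means)]
print(ext_var)
```

The two entries of ext_var correspond to the first column of se202 in the listing (0.3904 for the no-signal trials and 0.5564 for the signal trials); small differences in the last digit are to be expected from the different search method.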
References
Abbey, C. K., & Eckstein, M. P. (2000). Estimates of human-observer templates for simple detection tasks in correlated noise. Proceedings of SPIE, 3981, 70–77.
Abbey, C. K., & Eckstein, M. P. (2001a). Maximum-likelihood and maximum a-posteriori estimates of human-observer templates. Proceedings of SPIE, 4324, 114–122.
Abbey, C. K., & Eckstein, M. P. (2001b). Theory for estimating human-observer templates in two-alternative forced-choice experiments. In M. F. Insana & R. Leahy (Eds.), Proceedings of the 17th International Conference on Inferential Processes in Medical Imaging. Berlin: Springer-Verlag.
Abbey, C. K., & Eckstein, M. P. (2002). Classification image analysis: Estimation and statistical inference for two-alternative forced-choice experiments. Journal of Vision, 2(1), 66–78, http://journalofvision.org/2/1/5/, DOI 10.1167/2.1.5.
Abbey, C. K., Eckstein, M. P., & Bochud, F. O. (1999). Estimation of human-observer templates for 2 alternative forced choice tasks. Proceedings of SPIE, 3663, 284–295.
Ahumada, A. J., Jr. (1967). Detection of tones masked by noise: A comparison of human observers with digital-computer-simulated energy detectors of varying bandwidths. Unpublished doctoral dissertation, University of California, Los Angeles. Available as Technical Report No. 29, Human Communications Laboratory, Department of Psychology, University of California, Los Angeles.
Ahumada, A. J., Jr. (1996). Perceptual classification images from vernier acuity masked by noise [Abstract]. Perception, 26(Suppl. 1), 18.
Ahumada, A. J., Jr., & Beard, B. L. (1998). Response classification images in vernier acuity [Abstract]. Investigative Ophthalmology and Visual Science, 39(Suppl. 4), S1109.
Ahumada, A. J., Jr., & Beard, B. L. (1999). Classification images for detection [Abstract]. Investigative Ophthalmology and Visual Science, 40(Suppl. 4), S572.
Ahumada, A. J., Jr., & Lovell, J. (1971). Stimulus features in signal detection. Journal of the Acoustical Society of America, 49, 1751–1756.
Ahumada, A. J., Jr., Marken, R., & Sandusky, A. (1975). Time and frequency analyses of auditory signal detection. Journal of the Acoustical Society of America, 57, 385–390.
Barth, E., Beard, B. L., & Ahumada, A. J., Jr. (1999). Nonlinear features in vernier acuity. Proceedings of SPIE, 3644, 88–96.
Beard, B. L., & Ahumada, A. J., Jr. (1997). Relevant image features for vernier acuity [Abstract]. Perception, 26, 38.
Beard, B. L., & Ahumada, A. J., Jr. (1998). Technique to extract relevant image features for visual tasks. Proceedings of SPIE, 3299, 79–85.
Beard, B. L., & Ahumada, A. J., Jr. (1999). Detection in fixed and random noise in foveal and parafoveal vision explained by template learning. Journal of the Optical Society of America A, 16, 755–763.
Beard, B. L., & Ahumada, A. J., Jr. (2000). Response classification images for parafoveal vernier acuity [Abstract]. Investigative Ophthalmology and Visual Science, 41(Suppl. 4).
Burgess, A. E., & Colborne, B. (1988). Visual signal detection. IV. Observer inconsistency. Journal of the Optical Society of America A, 5, 617–628.
Findlay, J. M. (1973). Feature detectors and vernier acuity. Nature, 241, 135–137.
Foley, J. M. (1994). Human luminance pattern-vision mechanisms: Masking experiments require a new model. Journal of the Optical Society of America A, 11, 1710–1719.
Green, D. M., & Swets, J. A. (1966). Signal detection theory. New York: Wiley.
Murray, R. F., Bennett, P. J., & Sekuler, A. B. (2002). Optimal methods for calculating classification images: Weighted sums. Journal of Vision, 2(1), 79–104, http://journalofvision.org/2/1/6/, DOI 10.1167/2.1.6.
Richards, V. M., & Zhu, S. (1994). Relative estimates of combination weights, decision criteria, and internal noise based on correlation coefficients. Journal of the Acoustical Society of America, 95, 423–434.
Solomon, J. A. (2002). Noise reveals visual mechanisms of detection and discrimination. Journal of Vision, 2(1), 105–120, http://journalofvision.org/2/1/7/, DOI 10.1167/2.1.7.
Waugh, S. J., Levi, D. M., & Carney, T. (1993). Orientation, masking, and vernier acuity for line targets. Vision Research, 33, 1619–1638.
Figure 1
 
A raw classification image (top) and the same image smoothed and quantized (bottom), so only weights significantly different from zero are colored differently from the gray background. The black squares on the sides show the heights and positions of the fixed line (left) and the variable line offset (right). The dark lines on the top and bottom show the lengths and positions of the lines. The observer was A.J.A., who ran 1,600 trials (Ahumada, 1996).
Figure 2
 
Top. Gabor signals with spatial frequencies of 2 cpd (left) and 16 cpd (right). Center. Classification images for the signal trials with the 2-cpd target (left) and the 16-cpd target trials (right). Bottom. Classification images for the no-signal trials with the 2-cpd target (left) and the 16-cpd target (right). The observer was C.S., who ran 5,900 trials at 2 cpd and 8,000 trials at 16 cpd (Ahumada & Beard, 1999).
Figure 3
 
Estimated proportions of variance due to external noise, α2, from observer self-agreement (circles) and from relative detection performance (crosses) based on the data of observer C.S. from Ahumada and Beard (1999).