Classification image and other similar noise-driven linear methods have found increasingly wider applications in revealing psychophysical receptive field structures or perceptual templates. These techniques are relatively easy to deploy, and the results are simple to interpret. However, being a linear technique, the utility of the classification-image method is believed to be limited. Uncertainty about the target stimuli on the part of an observer will result in a classification image that is the superposition of all possible templates for all the possible signals. In the context of a well-established uncertainty model, which pools the outputs of a large set of linear frontends with a max operator, we show analytically, in simulations, and with human experiments that the effect of intrinsic uncertainty can be limited or even eliminated by presenting a signal at a relatively high contrast in a classification-image experiment. We further argue that the subimages from different stimulus-response categories should not be combined, as is conventionally done. We show that when the signal contrast is high, the subimages from the error trials contain a clear high-contrast image that is negatively correlated with the perceptual template associated with the presented signal, relatively unaffected by uncertainty. The subimages also contain a “haze” that is of a much lower contrast and is positively correlated with the superposition of all the templates associated with the erroneous response. In the case of spatial uncertainty, we show that the spatial extent of the uncertainty can be estimated from the classification subimages. We link intrinsic uncertainty to invariance and suggest that this signal-clamped classification-image method will find general applications in uncovering the underlying representations of high-level neural and psychophysical mechanisms.

One type of nonlinearity that does pose a problem for the noisy cross-correlator^{1} model is stimulus uncertainty. Even when observers are told the exact shape and location of the signals that they are to discriminate between, they sometimes behave as if they are uncertain as to exactly where the stimulus will appear or what shape it will take (e.g., Manjeshwar & Wilson, 2001; Pelli, 1985). We can model spatial uncertainty by assuming that the observer has many identical templates that he applies over a range of spatial locations in the stimulus, but the effects of this operation are complex, and it is not obvious precisely how a classification image is related to the template of such an observer, or how the SNR of the classification image is related to quantities such as the observer's performance level or internal-to-external noise ratio. If an observer is very uncertain about some stimulus properties, such as the phase of a grating signal, a response classification experiment may produce no classification image at all (Ahumada & Beard, 1999).

*is* generally applicable even when the task, the visual system, or both possess a great deal of uncertainty (and invariance). This is achieved by understanding the role that a signal plays in a classification-image experiment in the context of a well-established uncertainty model (cf. Pelli, 1985), first proposed by Tanner (1961). Specifically, we will demonstrate the theoretical feasibility and empirical practicality of recovering the perceptual templates of an observer for tasks with a high degree of spatial uncertainty. We will also demonstrate how the degree of uncertainty may be estimated from the resulting classification images.

Let $T_{r,j}$ be the $j$th version of a noise-free contrast pattern with a response label $r$. (Unless the context suggests otherwise, we generally present a 2-D pattern as a column vector by concatenating all columns of an image into a single column.) Let $N_\sigma$ be a sample of Gaussian white noise (a multivariate normal distribution of zero mean and diagonal covariance $\sigma^2 I$, with $I$ here denoting the identity matrix). A noisy stimulus with a signal contrast of $c$ is

$$I = cT_{r,j} + N_\sigma. \qquad (1)$$

$I$ with maximum accuracy is to select the response label $r$ that maximizes the posterior probability (Duda & Hart, 1973; Green & Swets, 1974; Peterson, Birdsall, & Fox, 1954). That is,

$j$ (marginalization) in the second expression follows strictly from probability theory because the occurrence of the different versions of a pattern is mutually exclusive in a single presentation.

$r$ or $j$, we have:

$M$ is the number of distinct patterns with the same response label, the $k$s are constants, and the superscript $T$ denotes matrix transpose. We note that $I^T I$ does not vary with either $r$ or $j$ and has therefore been treated as a constant.

$r$ that maximizes a univariate decision variable $\lambda(r)$:

$M$) is large (in the tens of thousands).

$T_{r,j}^T T_{r,j}$ is a positive constant and can be removed from the decision rule:

$M = 1$), the optimal decision rule can be further reduced to that of a linear correlator by taking advantage of the fact that the exponential function is monotonically increasing and by removing all constant terms:

$c$) or the noise variance ($\sigma^2$). The assumption of a linear observer is the single most important assumption for the classification-image method (Ahumada, 2002; Murray et al., 2002). Our derivation also makes evident why uncertainty presents a significant challenge to the classification-image method: there is no apparent means of approximating the optimal decision variable of Equation 5 by something similar to Equation 6.
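As a concrete illustration of the no-uncertainty case ($M = 1$), the following minimal sketch builds a noisy stimulus as contrast times template plus white Gaussian noise and applies a linear correlator to it. The toy templates, the contrast value, and the function names are assumptions made for illustration only; they are not taken from the study.

```python
import numpy as np

def make_stimulus(template, contrast, sigma, rng):
    """Return (stimulus, noise): contrast * template + white Gaussian noise.

    The 2-D template is flattened column by column, as described in the text.
    """
    t = np.asarray(template, dtype=float).flatten(order="F")
    noise = rng.normal(0.0, sigma, size=t.shape)
    return contrast * t + noise, noise

def linear_correlator(stimulus, templates):
    """Choose the label whose template has the largest dot product with the stimulus.

    Assumes one equal-energy template per label (M = 1).
    """
    labels = list(templates)
    scores = [stimulus @ np.asarray(templates[r], dtype=float).flatten(order="F")
              for r in labels]
    return labels[int(np.argmax(scores))]

# Hypothetical 8 x 8 toy templates with equal energy and no overlap.
toy_templates = {"O": np.zeros((8, 8)), "X": np.zeros((8, 8))}
toy_templates["O"][0:4, 0:4] = 1.0
toy_templates["X"][4:8, 4:8] = 1.0

rng = np.random.default_rng(0)
stimulus, noise = make_stimulus(toy_templates["O"], contrast=5.0, sigma=0.25, rng=rng)
print(linear_correlator(stimulus, toy_templates))
```

Because the decision depends only on which dot product is larger, rescaling the contrast or the noise variance does not change the form of the rule, which is the property the linear-observer assumption relies on.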

*model*^{2} of the observer. When $M$ in Equation 4 or 5 is greater than 1, the decision rule would be suboptimal for the *task,* which has no uncertainty, but it is optimal for the *observer* with the explicit limitation that the observer had assumed that there was uncertainty in the task. Tanner (1961) pointed out that if an observer did not know the signal exactly and had to consider a number of possibilities, the observer, which could be otherwise ideal, would have a steeper psychometric function compared with that of an ideal observer. Early studies in audition (cf. Green, 1964) and vision (e.g., Foley & Legge, 1981; Nachmias & Sansbury, 1974; Stromeyer & Klein, 1974; Tanner & Swets, 1954) found that when a subject was asked to detect a faint but precisely defined signal, the resulting psychometric function had a slope consistent with the presence of a significant intrinsic uncertainty.

$T_r$ (cf. Ahumada, 2002). The same, however, could not be said when there is significant extrinsic or intrinsic uncertainty.

*while the signal is present,* we know with relative certainty that it was that channel that often responded maximally to the signal that was suppressed. The linear kernel associated with this channel can then be recovered using the conventional classification-image technique.

$T_{r,z}$, $z \in [1, M]$ denote the channel that has the highest response for signal $S$. The last line of approximation is justified because (1) for equal-energy signals, $N_\sigma^T T_{r,j}$ is statistically identical for all channels $j$, and (2) the term $S^T T_{r,z}$ leads one particular channel to have the highest response most of the time and thus to single-handedly drive the decision variable $\lambda(r)$. What is critical for this approximation is that the response $S^T T_{r,z}$ must be significantly larger than the responses from the other channels. We refer to this requirement as the “signal-clamping” requirement and the approximation in Equation 9 as the signal-clamping approximation.
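The signal-clamping requirement can be illustrated numerically. The sketch below, written under simplifying assumptions that are not part of the study (one-dimensional stimuli, non-overlapping boxcar channels, and hypothetical parameter values), counts how often the channel matched to the signal is the maximally responding channel as the signal contrast increases.

```python
import numpy as np

def clamped_fraction(contrast, n_channels=16, width=4, sigma=0.25,
                     n_trials=2000, seed=0):
    """Fraction of trials on which the channel matched to the signal responds most.

    Channels are non-overlapping 1-D boxcar templates (a simplifying assumption),
    so they are orthogonal and have equal energy.
    """
    rng = np.random.default_rng(seed)
    n_pixels = n_channels * width
    channels = np.zeros((n_channels, n_pixels))
    for j in range(n_channels):
        channels[j, j * width:(j + 1) * width] = 1.0
    z = n_channels // 2                      # index of the channel matched to the signal
    signal = channels[z]
    noise = rng.normal(0.0, sigma, size=(n_trials, n_pixels))
    stimuli = contrast * signal + noise      # one stimulus per row
    responses = stimuli @ channels.T         # outputs of the linear front ends
    return float(np.mean(np.argmax(responses, axis=1) == z))

for c in (0.0, 0.25, 1.0):
    print(c, clamped_fraction(c))
```

With no signal, every channel is equally likely to respond most, so the matched channel wins on roughly one trial in `n_channels`. As the contrast grows, the matched channel dominates the max operation, which is the regime in which the approximation is expected to hold.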

^{3} With this perspective, we can think of the two-bar method as using one bar to select a channel of a specific phase, and the other bar, with varying positions relative to the first bar, to map the receptive field of the selected channel.

$T_r$ is the position-normalized template for response $r$ and $p_j$ is a position on the display. Our goals are to recover $T_r$ and the range of $p_j$. Possible generalizations of the signal-clamping technique to other types of uncertainty beyond that of shift invariance will be addressed in the General discussions section.

$\mathrm{CI}_{AB}$ is the average of all the noise patterns $N_\sigma$ (Equation 1) from trials where the signal in the stimulus was $A$ and the observer's response was $B$. Consider the two-letter identification task (“O” vs. “X”). The subimage $\mathrm{CI}_{OX}$ is the average of the noise patterns $N_{OX}$ from trials where “O” was in the stimulus but the observer responded “X” (we refer to this as an OX trial). An “X” response implies that the internal decision variable for an “X” response was greater than that for an “O” response; that is, $\lambda(\text{“X”}) > \lambda(\text{“O”})$. Appealing to the uncertainty model (Equation 7) and the composition of a stimulus (Equation 1) and letting $X_j = T_{x,j}$ and $O_j = T_{o,j}$ to improve readability, we have

$O$ (without any subscript) is the “O” signal in the noisy stimulus presented to the observer. If there is no uncertainty ($M = 1$), Equation 11 becomes the familiar form that underlies the conventional classification image:

$O_1$) more than the “X” channel ($X_1$); that is, $O^T O_1 > O^T X_1$. For this inequality to hold, the average noise pattern on the left-hand side must have a positive correlation with the X template and a negative correlation with the O template. Ahumada (2002) showed analytically that

$E[\cdot]$ denotes a mathematical expectation (see also Abbey & Eckstein, 2002; Murray et al., 2002). The proportional constant is affected by the probability of an OX trial (stimulus “O,” response “X”) and by the internal-to-external noise ratio (ratio between the variances of the noise internal to an observer and that in the stimuli; e.g., see Equation A3 in Murray et al., 2002). $\mathrm{CI}_{OX}$ approaches $E[N_{OX}]$ as the number of OX trials ($n_{OX}$) approaches infinity. For a finite number of trials, the variance of $\mathrm{CI}_{OX}$ is rather cumbersome because the probability density of $\mathrm{CI}_{OX}$ is a truncated version of the multidimensional Gaussian ($N_\sigma$) used to form the stimuli. Ahumada (2002) pointed out that the variance of $\mathrm{CI}_{OX}$ is upper bounded by the variance of the nontruncated distribution. Murray et al. (2002, Appendices A and F) further argued that the difference between the upper bound and the actual variance is negligible for a typical classification-image experiment where (1) the amount of the stimulus noise is comparable to the level of the observer's internal noise, (2) the number of independent image pixels (and hence the dimensionality of the stimulus) is large, and (3) the accuracy level is above 75%. All of the experiments in the current study met these three conditions. Thus, $\mathrm{CI}_{OX}$ can be approximated as

$N_\sigma$ is a sample of white noise from the distribution used to form the stimuli (Equation 1).
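The construction of the classification subimages, that is, averaging the noise fields separately for each stimulus-response category, can be sketched as follows. The simulated observer is a toy linear correlator without uncertainty ($M = 1$); the templates, trial count, and contrast are hypothetical values chosen only to make the example run quickly.

```python
import numpy as np

def classification_subimages(noise, stimulus_labels, response_labels):
    """Average the noise fields separately for each (stimulus, response) pair."""
    stimulus_labels = np.asarray(stimulus_labels)
    response_labels = np.asarray(response_labels)
    subimages = {}
    for a in np.unique(stimulus_labels):
        for b in np.unique(response_labels):
            mask = (stimulus_labels == a) & (response_labels == b)
            if mask.any():
                subimages[(str(a), str(b))] = noise[mask].mean(axis=0)
    return subimages

# Toy simulation of a linear observer without uncertainty (M = 1).
rng = np.random.default_rng(1)
n_trials, n_pixels, sigma, contrast = 20000, 16, 0.25, 0.18
O = np.zeros(n_pixels)
O[0:4] = 1.0                                 # hypothetical "O" template
X = np.zeros(n_pixels)
X[4:8] = 1.0                                 # hypothetical "X" template
shown = rng.choice(["O", "X"], size=n_trials)
signal = np.where((shown == "O")[:, None], O, X)
noise = rng.normal(0.0, sigma, size=(n_trials, n_pixels))
stimuli = contrast * signal + noise
said = np.where(stimuli @ O > stimuli @ X, "O", "X")

ci = classification_subimages(noise, shown, said)
ci_ox = ci[("O", "X")]                       # "O" shown, "X" reported
print(ci_ox @ O, ci_ox @ X)
```

In this toy case the error-trial subimage correlates negatively with the template of the presented signal and positively with the template of the reported one, as the inequality above requires.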

$X$s and $O$s were always presented at the same position on the display) but with a significant amount of intrinsic uncertainty ($M \gg 1$). Applying the signal-clamping approximation (Equation 9) to the right-hand side of Equation 11, we have

$\lambda(\text{“O”})$ because the “O” signal in the stimulus consistently biases one particular “O” channel ($O_z$ in the equation). There is no such trial-to-trial consistency among the “X” channels because none of them are tuned to the “O” signal. Hence, the signal-clamping approximation does not apply to $\lambda(\text{“X”})$. Following the logic of Ahumada (2002), we can show that (1)

$E[N_{XO}]$) and the classification subimage ($\mathrm{CI}_{OX}$) remains the same as stated in Equation 14.

$O_z$ in Equations 15 and 16) that would otherwise be responding.

$E[X_j]$ of Equation 16) corresponds to the average of all the channels associated with the response (“X” in our example). As a result, there will not be any clear positive image in the classification subimages when there is significant intrinsic uncertainty. The clarity of the positive image provides a way to estimate the degree of uncertainty.

$S$, with each pixel corresponding to a location in the image and the pixel value representing the probability of a channel at the location responding erroneously to noise, then

$X_z$ is the position-normalized template for “X”. Combining Equations 16 and 17, we have

$S$ can be parameterized with a small number of parameters (e.g., $S$ being a square region with uniform distribution), then Equation 18 provides a way to estimate both the perceptual templates and the amount of spatial uncertainty. We can obtain these estimates in stages. The classification subimage $\mathrm{CI}_{OX}$, which, in the limit, approaches $E[N_{OX}]$, contains a negative image of the “O” template, unaffected by uncertainty. Likewise, the subimage $\mathrm{CI}_{XO}$ provides a direct estimate of the “X” template. Knowing both the “O” and “X” templates, Equation 17 and the corresponding equation for $E[N_{XO}]$ can be used to estimate the spatial uncertainty $S$.

$S$ appears robust, particularly if it can be parameterized with very few parameters.

$\mathrm{CI}_{OX}$) can be calculated by averaging the noise patterns in the conventional manner:

$i$ was created by shifting the stimulus $O_1$ by an amount $p_i$:

$N_{OX}$ in all of the preceding equations with a shifted version $^{S}N_{OX}$, where

$M = 1$), the templates were positioned to have the maximum overlap with the signal. In the case of high spatial uncertainty, the center position of a template was uniformly distributed within the center 64 × 64 pixels of the image. There were 1,000 spatially shifted templates for each response ($M$ = 1,000). The relative positions of the signals and the templates are shown in Figure 1a. For each trial, the observer model made a decision according to Equation 5. The external noise had a variance of 1/16 ($\sigma = 0.25$), identical to that used in the human experiments. The signal contrast was set to a level to obtain an accuracy of 55% correct (low contrast) or 75% correct (high contrast). The observer model was assumed to know the signal contrast (parameter $c$ in Equation 5).

$\mathrm{CI}_{OX}$ contains a clear negative image of the template for the “o” response, which in this case was the letter “p”, the template we built into the ideal-observer model. Remarkably, this image is sharp and unaffected by the high degree of intrinsic uncertainty. This is the main result of the signal-clamping technique. Also, as predicted by Equation 16, there is no clear positive template in $\mathrm{CI}_{OX}$, which is the single most important difference between the two uncertainty levels ($M = 1$ vs. $M$ = 1,000). It is important to reiterate the point that the negative image in $\mathrm{CI}_{OX}$ resembles the observer's template “p” and not the signal “o” that was presented. The “o” signal biased a “p” template at a particular location, allowing the effect of noise on that particular template to accumulate over all the error trials when the presented signal was “o.” The effect of the noise was on the nonzero regions of the biased template, although these regions may not overlap with the signal (e.g., the descender of the lowercase “p”).

$S$ to be a uniform square region centered in the image with $d$ pixels on a side. Thus,

$N_{OX}$ and $\sqrt{n_{XO}}$, respectively. If we knew the observer's signal-clamped templates ($O_z$ and $X_z$), then $k$, and most importantly the extent of the spatial uncertainty $d$, can be estimated from the classification subimages for the error trials by minimizing the squared error. The right-most column of Figure 1b plots the residual sum-of-squares error for different values of $d$ (with the value of $k$ chosen to minimize the residual at each level of $d$). The solid green curves were obtained using the veridical observer templates (lowercase “p” and “k”). The value of $d$ at which a global minimum is achieved provides the estimate of the extent of the spatial uncertainty. The estimated values for the two levels of uncertainty are 1 and 35 pixels, respectively, and are indicated by the first character of the template label “pk.” For the high-uncertainty condition, the residual landscape suggests that although the lower bound of $d$ is well defined, the upper bound is not. In the context of this limitation, the estimated values are in good agreement with the veridical values (1 for the no-uncertainty condition and 64 for the high-uncertainty condition).
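The search over $d$ described above can be sketched in code. This is a simplified stand-in rather than the fitting equation used in the study: it assumes that a single error-trial subimage is modeled as $k$ times the sum of a negative image of the clamped template and a copy of the alternative template blurred by a uniform $d \times d$ square, with $k$ solved in closed form for each candidate $d$. The function names, toy templates, and noise level are hypothetical.

```python
import numpy as np

def box_blur(image, d):
    """Circular convolution with a normalized d x d uniform kernel centred on the origin."""
    kernel = np.zeros(image.shape)
    kernel[:d, :d] = 1.0 / (d * d)
    kernel = np.roll(kernel, (-(d // 2), -(d // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))

def estimate_extent(ci_error, signal_template, other_template, candidates):
    """Return (best_d, residuals) for the assumed model k * (-signal + blur_d(other))."""
    residuals = []
    for d in candidates:
        pred = -signal_template + box_blur(other_template, d)
        k = np.sum(ci_error * pred) / np.sum(pred * pred)   # least-squares scale
        residuals.append(np.sum((ci_error - k * pred) ** 2))
    residuals = np.array(residuals)
    return candidates[int(np.argmin(residuals))], residuals

# Synthetic check with hypothetical templates and a known extent of 9 pixels.
size = 32
O = np.zeros((size, size))
O[12:20, 12] = 1.0
O[12:20, 19] = 1.0
O[12, 12:20] = 1.0
O[19, 12:20] = 1.0                           # square outline
X = np.zeros((size, size))
for i in range(8):
    X[12 + i, 12 + i] = 1.0                  # two diagonals
    X[12 + i, 19 - i] = 1.0
rng = np.random.default_rng(2)
fake_ci = 0.3 * (-O + box_blur(X, 9)) + rng.normal(0.0, 1e-3, size=(size, size))
best_d, residuals = estimate_extent(fake_ci, O, X, list(range(1, 22, 2)))
print(best_d)
```

Plotting `residuals` against the candidate values would give a residual landscape of the kind described for Figure 1b, under the assumptions stated above.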

$d$ computed using incorrect observer templates. Each of the three black curves was obtained with a pair of lowercase letters (except “p” and “k”) that resembled the classification subimages as the presumed observer templates. The red curve was obtained with the presented signals (“o” and “x”) as the presumed observer templates. Note that the values of $d$ at the global minimum of each of these residual curves are very similar. This result demonstrates the robustness of the estimate of the spatial extent $d$ of the underlying uncertainty, even when the observer template is not precisely known. In practice, this means that we can obtain a reasonable estimate of the spatial extent by assuming that the observer templates were identical to the presented signals.

$M$ for both levels of spatial extents ($M$ = 1,000) and another with a constant density ($M$ = 1,000 for high uncertainty, $M$ = 250 for medium uncertainty).

^{4} The telltale sign of uncertainty is evident in the classification subimages for all conditions (Figure 2b). In particular, the classification subimage from the miss trials ($\mathrm{CI}_{\text{miss}}$) shows a negative image of the observer's template (a lowercase letter “e”), whereas the subimage of the false-alarm trials ($\mathrm{CI}_{\text{FA}}$) shows only a positive haze (if there were no uncertainty, it would be a positive image of the observer's template).

$C_{250}/C_{1,000} = 1.1$) and classification images (Figure 2b, first row, left and middle columns). This is consistent with the finding of Tjan and Legge (1998) that there exists a task-dependent upper bound of the *effective* level of uncertainty, which can be substantially less than the highest possible level of *physical* uncertainty. With respect to our current letter detection task, this means that increasing $M$ beyond a density of 250 possible positions per 32 × 32 pixels has no consequence for performance.

$\mathrm{CI}_{\text{miss}}$ in both of the medium-uncertainty conditions. This dark haze was absent in the high-uncertainty condition.

$\mathrm{CI}_{\text{FA}}$ is noticeably broader and fainter in the high-uncertainty condition compared with the medium-uncertainty condition.

$d$) of the uncertainty. Note that for a detection task, one of the templates ($X$ in this case) is an image of zeros; that is,

$d$ is plotted in the second row of Figure 2b. As with the case of the letter identification simulation, the green curve represents using the veridical observer template (“e”) to perform the estimation, the red curve represents using the signal in the stimuli as the template, and the three black curves were obtained using other lowercase letters that resembled the classification subimages. Again, the values of $d$ that minimize these residual functions are relatively independent of the assumed observer templates. The averaged estimated value of $d$ was 14.6 pixels for the medium-uncertainty condition and 37.4 pixels for the high-uncertainty condition. Although showing the same ratio of difference as the veridical values (32 vs. 64 pixels, respectively), the estimated values are admittedly a factor of 2 less. This is probably because the simulation used only 1,000 positions within $S_d$, as opposed to a true uniform distribution of positions.

- Each of the classification subimages from the error trials contains a clear negative image of the observer's template for the presented signal, unaffected by spatial uncertainty intrinsic or extrinsic to the observer. However, in the presence of uncertainty, the clarity of the template image markedly deteriorates if the contrast of the presented signal is not sufficiently high. The need for a high-contrast signal runs counter to the conventional practice of using a low-contrast signal to increase the effect of noise on the observer's response.
- Any positive image of the alternative template in a classification subimage for the error trials is blurred by spatial uncertainty, often rendering it indiscernible.
- The extent to which these positive template images are blurred provides an estimate of the spatial extent of the uncertainty.
- Because of the presence of a relatively strong signal in the stimulus, the classification subimages from the correct trials contain very little contrast and are relatively uninformative. As a result, we do not advocate combining the subimages to form a single classification image as in the conventional approach.

μdeg². The mean luminance of the noisy background was 19.8 cd/m².

$C$ as:

$T'$ is an assumed template and $\sigma_C$ is the pixel-wise standard deviation of the image $C$. Murray et al. showed that the discrepancy between $T'$ and the observer's actual template only leads to a reduction in the amplitude of rSNR by a constant factor relative to the inherent variability of a classification image, thereby making the measurement less reliable. We modified this approach to measure only the classification subimages of the error trials (e.g., $\mathrm{CI}_{OX}$ and $\mathrm{CI}_{XO}$ for the letter identification experiment) and only the negative template images in these subimages.

$X$ and $O$ are the presented letter stimuli. Equation 26 is applicable to the letter detection task by setting $X$ to zero. In essence, Equation 26 measured the SNR of the pixels that overlap the negative O template in the subimage $\mathrm{CI}_{OX}$ and the negative X template in the subimage $\mathrm{CI}_{XO}$.

$\mathrm{CI}_{OX}$ and bottom left, $\mathrm{CI}_{XO}$). The most crucial finding is that across uncertainty conditions, there was little or no difference between the negative components of the error-trial classification subimages. This was true both within and between subjects, confirming the general validity of the signal-clamping approximation (Equations 9 and 15).

$d$ in terms of stimulus pixels, using the lowercase stimuli as the presumed templates. As demonstrated in the simulation, the choice of the presumed templates, which may be different from the actual observer templates, does not significantly affect the estimated value of $d$. The residual landscapes are plotted in the middle column of Figure 4. The standard error of the estimate was determined by bootstrapping (Efron & Tibshirani, 1994). The results are summarized in Table 1. As expected, the estimated spatial extent ($d$) of the combined uncertainty (extrinsic and intrinsic) was significantly higher in the high-uncertainty condition than the no-uncertainty condition. Moreover, these values are in reasonable agreement with the veridical values (1 for the no-uncertainty condition and 64 for the high-uncertainty condition).

Condition | Subject | d ± SE (stimulus pixels) |
---|---|---|
No uncertainty | B.B. | 5 ± 1.0 |
No uncertainty | A.O. | 9 ± 10.4 |
High uncertainty | A.O. | 35 ± 4.6 |
High uncertainty | A.S.N. | 51 ± 8.5 |
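The bootstrap procedure used for the standard errors can be sketched generically. The source does not state which units were resampled; the sketch below assumes that whole trials are resampled with replacement and that the statistic of interest is recomputed on each resample. The function name and the toy data are hypothetical.

```python
import numpy as np

def bootstrap_se(trials, estimator, n_boot=500, seed=0):
    """Standard error of estimator(trials), from resampling trials with replacement."""
    rng = np.random.default_rng(seed)
    trials = np.asarray(trials)
    n = len(trials)
    estimates = [estimator(trials[rng.integers(0, n, size=n)])
                 for _ in range(n_boot)]
    return float(np.std(estimates, ddof=1))

# Toy check with a statistic whose standard error is known: the mean of
# 400 unit-variance samples has a standard error close to 1 / sqrt(400) = 0.05.
data = np.random.default_rng(42).normal(0.0, 1.0, size=400)
print(bootstrap_se(data, np.mean))
```

To obtain a standard error for $d$ in this way, `estimator` would be a function that recomputes the error-trial subimages from the resampled trials and refits $d$.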

$d$ as opposed to $M$) can be estimated from the classification images. This prediction was tested in Experiment 2.

$\mathrm{CI}_{\text{miss}}$ (the subimage from the miss trials) in the high-uncertainty condition. Also, as predicted, there was no clear image of the target in $\mathrm{CI}_{\text{FA}}$ (the subimage from the false-alarm trials). The positive haze in $\mathrm{CI}_{\text{FA}}$ is not as pronounced as that in the simulation, probably due to the presence of internal noise and intrinsic spatial uncertainty. The presence of a significant amount of intrinsic uncertainty in the observers may also explain the absence of any blurring of the negative template image in $\mathrm{CI}_{\text{miss}}$ in the medium-uncertainty condition, which was observed in the simulation.

$\mathrm{CI}_{\text{FA}}$ is more visible for the medium-uncertainty condition if we blur the subimages (using a Gaussian kernel with a space constant of 14.1 stimulus pixels; right column of Figure 6). Such a positive haze around the center of the image appears to be absent from $\mathrm{CI}_{\text{FA}}$ in the high-uncertainty condition.

$d$ (middle column of Figure 6) and summarized in Table 2. These results were obtained by fitting Equation 24 to the classification subimages, using the target letter “o” as the presumed observer template. The spatial extent of the uncertainty ($d$) was significantly higher in the high-uncertainty condition as compared with the medium-uncertainty condition, both within and between subjects. The standard errors were estimated with bootstrap.

Condition | Subject | d ± SE (stimulus pixels) |
---|---|---|
Medium uncertainty | M.J. | 31 ± 22 |
Medium uncertainty | J.H. | 31 ± 13 |
High uncertainty | J.H. | 127 ± 45.3 |
High uncertainty | B.B. | 65 ± 28 |

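Condition | Subject | d ± SE (stimulus pixels) |
---|---|---|
Fovea, no stimulus uncertainty (Experiment 1) | A.O. | 9 ± 10.4 |
Fovea, no stimulus uncertainty (Experiment 1) | B.B. | 5 ± 1.0 |
Periphery, no stimulus uncertainty | B.B. | 67 ± 31 |
Periphery, no stimulus uncertainty | A.S.N. | 29 ± 9.4 |
Fovea, high stimulus uncertainty (Experiment 1) | A.S.N. | 51 ± 8.5 |
Fovea, high stimulus uncertainty (Experiment 1) | A.O. | 35 ± 4.6 |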

$d$.

*or* feature “b” will lead to an error response, and averaging such noise samples will reveal both features “a” and “b.”

$a$ AND $b$) = (NOT($a$) OR NOT($b$))], revealing the complete signal that the mechanism is tuned to as a negative image in the classification subimage from the error trials.

*suppressed* a spike.

$d'$ vs. signal contrast) of such a mechanism is nonlinear, as opposed to the linear psychometric function of a mechanism that represents “x” as a single template. Unfortunately, linearity of a psychometric function is nondiagnostic in practice because other factors, such as other types of uncertainty, can also lead to a nonlinear psychometric function. In fact, the psychometric function of a human observer is rarely linear for just about any task tested.


$M$, the number of orthogonal channels possessed by the observer, by measuring the extent to which the psychometric function ($d'$ vs. signal contrast) of the observer deviates from linearity or, equivalently, its log–log slope deviates from unity (e.g., Foley & Legge, 1981; Green, 1964; Nachmias & Sansbury, 1974; Stromeyer & Klein, 1974; Tanner & Swets, 1954). Pelli (1985) used a Weibull approximation to the psychometric function and established via numerical simulations the relationship between the parameters of the Weibull function and $M$. Later work (e.g., Eckstein, Ahumada, & Watson, 1997; Tyler & Chen, 2000; Verghese & McKee, 2002) departed from the Weibull approximation and/or derived analytically the relationship between $M$ and the parameters of a psychometric function. All these approaches assumed the Max-rule model of uncertainty (observer's response is determined by the maximally responding channel) and that the $M$ channels are orthogonal. Most critically, these approaches treat uncertainty in a generic sense and make no distinction regarding the feature dimension of the uncertainty. For example, uncertainty about a signal's position is not distinguished in these formulations from uncertainty about its orientation. All types of uncertainty are characterized in terms of $M$, the equivalent number of orthogonal channels that the observer possesses.

$M$ = 250) to 64 × 64 pixels ($M$ = 1,000), the rSNR of the model's classification images increased from 627 to 751, whereas the model's log threshold contrast increased from −1.29 to −1.14 (a factor of 1.4 in contrast). When there was no uncertainty, the model rSNR was 2,020 (not shown in Figure 2) at a log threshold contrast of −1.48. This U-shaped function of rSNR in terms of uncertainty was also evident for the letter discrimination task (Figure 1). The rSNRs of the model's classification images were 1,180, 856, and 973 for spatial extents of 1 × 1, 32 × 32 ($M$ = 250, not shown in Figure 1), and 64 × 64, respectively.

$M$ = 1,000). This pattern of results can be reconciled with data from the ideal-observer models by noting that intrinsic spatial uncertainty was always present in the human observers, even when there was no uncertainty in the stimulus (Table 1). Such intrinsic uncertainty might place human data on the increasing portion of the U-shaped function. In addition, internal noise in human observers may also play a role. A more thorough analysis of rSNR versus intrinsic uncertainty in human observers awaits future studies.

$N_{OX}$ from the OX error trials, where the signal was “O” but the response was “X,” has the mathematical expectation as described in Equation 16, where $O_z$ is the channel that is tuned to the presented “O” signal, $X_j$, $j \in [1, M]$ are the channels that are tuned to the possible signals for the “X” response, and $E[X_j]$ denotes the average across all $X_j$s. Our starting points are (1) the result from Ahumada (2002) for $M = 1$ (Equation 13) and (2) the internal decision variable of the observer during these trials, with the signal-clamping approximation applied (Equation 15).

$M$. The case of $M = 1$ is true from Ahumada (2002) (i.e., Equation 13). Assuming $M = k$ is true, we consider the case of $M = k + 1$. Let $v$ be the number of trials where $X_j$, $j \le k$, were the maximum-responding X channels on the left-hand side of Equation 15. For these trials only, it was as if $M = k$, and Equation 16 is true by assumption. The sum of the noise samples from these trials is

Let $w$ be the number of trials where $X_{k+1}$ is the maximum-responding X channel. For these trials, it was as if $M = 1$, for which the result of Ahumada (2002) (Equation 13) applies. The sum of the noise samples from these trials is

$v + w$), we have

$M = k + 1$, if it is true for $M = k$. Because it is true for $M = 1$, by mathematical induction, Equation 16 is true for all $M \ge 1$.

$s(x,y) \cdot t(x,y) = \iint s(x,y)\,t(x,y)\,dx\,dy$. Hence, the result of a cross-correlation is a scalar. Yet, in mathematics, cross-correlation is defined as a convolution with a flipped and conjugated kernel. For a real function of two dimensions, this is $s(x,y) \otimes t(x,y) = \iint s(x-u, y-v)\,t(-u,-v)\,du\,dv$. Hence, the result is not a scalar but a function of $(x,y)$. This confusion is particularly unfortunate when we try to describe a mechanism that is shift invariant (i.e., the mechanism is a cross-correlation in the second but not the first sense of the term). Murray et al. (2002) used the term in the first sense (dot product). In this paper, we will avoid the use of the term cross-correlation altogether. We will use “correlation” when referring to a dot product and describe cross-correlation in terms of convolution.
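The two senses of the term can be contrasted in a short sketch. It assumes discrete images and circular (wrap-around) shifts, and it uses a random array as a hypothetical template; the function names are illustrative. The first function returns a single number, and the second returns one value per relative shift.

```python
import numpy as np

def correlation(s, t):
    """Dot product of two images: a single scalar."""
    return float(np.sum(s * t))

def cross_correlation_map(s, t):
    """Circular cross-correlation of real images.

    out[dy, dx] is the sum over (y, x) of s[y + dy, x + dx] * t[y, x].
    """
    return np.real(np.fft.ifft2(np.fft.fft2(s) * np.conj(np.fft.fft2(t))))

rng = np.random.default_rng(3)
t = rng.normal(size=(16, 16))                 # hypothetical template
s = np.roll(t, (3, 5), axis=(0, 1))           # the same pattern shifted by (3, 5)
cc = cross_correlation_map(s, t)
peak = np.unravel_index(np.argmax(cc), cc.shape)
print(correlation(s, t), cc[0, 0], peak)
```

The value of the map at zero shift equals the dot product, and the location of its peak recovers the shift between the two images, which is why the second sense of the term is the one relevant to a shift-invariant mechanism.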