Using a linear cue combination framework, we develop a measure of selective attention that describes the relative weight that an observer assigns to attended and unattended parts of a stimulus when making perceptual judgments. We call this measure *attentional weight*. We present two methods for measuring attentional weight by calculating the trial-by-trial correlation between the strength of attended and unattended parts of a stimulus and the observer’s responses. We illustrate these methods in three experiments that investigate whether observers can direct selective attention according to contrast polarity when judging global direction of motion or global orientation. We find that when observers try to judge the global direction or orientation of the parts of a stimulus with a given contrast polarity (white or black), their responses are nevertheless strongly influenced by parts of the stimulus that have the opposite contrast polarity. Our measure of selective attention indicates that the influence of the opposite-polarity distractors on observers’ responses is typically 65% as strong as the influence of the targets in the motion task, and typically 25% as strong as the targets in the orientation task, demonstrating that observers have only a limited ability to direct attention according to contrast polarity. We discuss some of the advantages of using a linear cue combination framework to study selective attention.

*selective visual attention*. We can direct visual attention according to simple stimulus properties, such as spatial location (Posner, Snyder, & Davidson, 1980), color (Brawn & Snowden, 1999), direction of motion (Ball & Sekuler, 1981), and spatial frequency (Davis & Graham, 1981), and perhaps also according to more complex criteria, such as the perceptual segmentation of a scene (Baylis & Driver, 1992; Duncan, 1984; Egly, Driver, & Rafal, 1994). However, selective attention is sometimes imperfect: if targets and distractors differ along certain dimensions, we find that even when we try to attend only to the targets, our judgments are nevertheless influenced by the distractors. This raises the question of how targets and distractors together determine an observer’s responses, and the closely related question of how we should measure intermediate degrees of selective attention.

*T* is an internal response to targets and *D* is an internal response to wholly or partly unattended distractors, then the observer bases his responses on a decision variable of the form

$$s = T + kD. \qquad (1)$$

*k* measures the influence of the distractors on the observer’s responses, and we will call it the *attentional weight* that the observer assigns to the distractors.

*k* assigned to distractors in a wide range of tasks, and we show that these methods work even when we do not know how the observer computes the internal responses *T* and *D* to the targets and distractors. We illustrate these methods in three experiments that investigate whether observers can direct selective attention according to contrast polarity when judging global direction of motion, or when judging global orientation. Several recent studies have investigated the first question concerning global motion and have given conflicting results (Croner & Albright, 1997; Edwards & Badcock, 1994; Li & Kingdom, 2001; Snowden & Edmunds, 1999; van der Smagt & van de Grind, 1999). The methods that we introduce avoid some of the problems of these earlier studies, and so we hope to give a more convincing answer to the question of whether observers can direct attention according to contrast polarity.

*T* and *D*, without changing the internal responses themselves. This issue is crucial for the problem of how to measure selective attention. If selective attention affects only the relative weight assigned to targets and distractors, then it can be described by a scalar, such as attentional weight. On the other hand, if selective attention qualitatively changes how an observer computes the internal responses *T* and *D*, then a more complex description may be necessary. We show how methods developed by Chubb and colleagues (Chubb, 1999; Chubb, Econopouly, & Landy, 1994) can be used to investigate how observers process attended and unattended stimuli, and we illustrate these methods by measuring directional selectivity for attended and partly unattended motion signals in a global direction discrimination task.

*A* and *B*. A Bayesian decision-maker performs this task by viewing the stimulus *U* on each trial, and evaluating the probability that the stimulus was drawn from class *A* or class *B*, given that the observed stimulus was *U*. Bayes’ theorem shows that these probabilities are

$$P(A \mid U) = \frac{P(U \mid A)\,P(A)}{P(U)}, \qquad P(B \mid U) = \frac{P(U \mid B)\,P(B)}{P(U)}.$$

Equivalently, the observer can base his responses on the likelihood ratio *L*:

$$L = \frac{P(U \mid A)}{P(U \mid B)}.$$

If stimulus types *A* and *B* appear equally often, and if the observer’s goal is to maximize the number of correct responses, then the optimal strategy is to respond ‘*A*’ if *L* ≥ 1, and ‘*B*’ otherwise (Green & Swets, 1974).

*U* is composed of many independently varying elements *U*_{i} (e.g., a noisy *N* pixel stimulus, or a random dot cinematogram with *N* independent dot displacements), then the likelihood ratio *L* is the product of many subsidiary likelihood ratios *u*_{i} computed from the stimulus elements *U*_{i}:

$$L = \prod_{i=1}^{N} u_i.$$

Equivalently, the observer can calculate the logarithm of this likelihood ratio, which is the sum of the subsidiary log likelihood ratios:

$$\log L = \sum_{i=1}^{N} \log u_i.$$

A likelihood ratio *u*_{i} > 1 makes it more likely that *U* belongs to *A*, and a likelihood ratio *u*_{i} < 1 makes it more likely that *U* belongs to *B*. A likelihood ratio *u*_{i} = 1 does not shift the overall likelihood ratio *L* either way.
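The decision rule above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: each element *U*_{i} is taken to be Gaussian with unit variance, with mean +0.2 under class *A* and −0.2 under class *B*, and all function names are hypothetical.

```python
import math

# Assumed (illustrative) class distributions for each stimulus element.
MU_A, MU_B, SIGMA = 0.2, -0.2, 1.0


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def element_likelihood_ratio(x):
    """Subsidiary likelihood ratio u_i = P(U_i | A) / P(U_i | B)."""
    return gaussian_pdf(x, MU_A, SIGMA) / gaussian_pdf(x, MU_B, SIGMA)


def likelihood_ratio(elements):
    """Overall likelihood ratio L: the product of the subsidiary ratios."""
    return math.prod(element_likelihood_ratio(x) for x in elements)


def log_likelihood_ratio(elements):
    """log L: the sum of the subsidiary log likelihood ratios."""
    return sum(math.log(element_likelihood_ratio(x)) for x in elements)


def respond(elements):
    """Respond 'A' if L >= 1 (equivalently log L >= 0), and 'B' otherwise."""
    return "A" if log_likelihood_ratio(elements) >= 0.0 else "B"


stimulus = [0.5, -0.1, 0.3, 1.2, -0.4]
print(likelihood_ratio(stimulus), log_likelihood_ratio(stimulus), respond(stimulus))
```

Working with the sum of logs rather than the product is numerically safer when the stimulus has many elements, because a long product of ratios can overflow or underflow.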

*L* may be correct or incorrect. Often we use a Bayesian framework to derive the *ideal* observer for a particular task, and certainly the ideal observer must compute the relevant likelihood ratios correctly. More generally, though, a Bayesian framework allows us to model an observer’s *beliefs* about what can be inferred from an observation, and these beliefs may be correct or incorrect. In other words, just because we describe an observer in a Bayesian framework, we need not assume that the observer follows an ideal strategy.

*U*_{i} and *V*_{j}. When the observer selectively attends to *U*_{i}, he takes these elements as being more relevant to the task than *V*_{j}, and he reduces the influence of *V*_{j} on his responses. Another way of saying this is that the observer discounts the evidence provided by *V*_{j}, and assigns it a smaller weight in his decision. If we regard the observer as basing his responses on a likelihood ratio as in Equation 5, this amounts to his adjusting the likelihood ratios *u*_{i} and *ν*_{j} that are computed from the two classes of stimulus elements, *U*_{i} and *V*_{j}. For instance, if on a particular trial an element *V*_{1} would contribute a likelihood ratio of *ν*_{1} = 1.2 if attended to, hence biasing the observer’s response toward ‘*A*’, an observer who selectively attends away from *V*_{1} can be thought of as adjusting the likelihood ratio *ν*_{1} toward 1.0, so that *V*_{1} has less influence on his response. That is, when the observer selectively attends to *U*_{i}, he adjusts the likelihood ratios *ν*_{j} by some function *f*:

$$L = \prod_i u_i \prod_j f(\nu_j). \qquad (7)$$

We will assume that selective attention affects only the likelihood ratios *ν*_{j} corresponding to the elements *V*_{j} that the observer selectively attends away from. Later in this section we show that this makes our model only slightly less general than if we allow selective attention to affect both sets of likelihood ratios, *u*_{i} and *ν*_{j}.

*f* must satisfy a simple constraint: the likelihood ratio *L* computed in Equation 7 should not depend on how we conceptually divide the stimulus into independently varying elements *U*_{i} and *V*_{j}. In particular, our predictions concerning the effects of selective attention should not change if we reformulate our model so that two elements *V*_{1} and *V*_{2} with likelihood ratios *ν*_{1} and *ν*_{2} are now regarded as a single element *V*_{1}*V*_{2} with likelihood ratio *ν*_{1}*ν*_{2}. It follows that

$$f(\nu_1 \nu_2) = f(\nu_1)\, f(\nu_2). \qquad (8)$$

The theory of functional equations (Falmagne, 1985) shows that Equation 8 implies that *f* is a power function,

$$f(\nu) = \nu^{k}.$$

Hence, a reasonable guess for the form of selective attention is

$$L = \prod_i u_i \prod_j \nu_j^{k}. \qquad (10)$$

The corresponding log likelihood ratio is

$$\log L = \sum_i \log u_i + k \sum_j \log \nu_j. \qquad (12)$$

If *k* = 0, all likelihood ratios *ν*_{j} are mapped to 1, and the distractor elements *V*_{j} have no effect on the observer’s responses. If *k* = 1, the likelihood ratios *ν*_{j} are unaffected, and *V*_{j} have their full effect. Note the similarity of Equation 12 to Equation 1, where we defined *k* as the attentional weight assigned to the distractors.^{1}
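A minimal sketch of the power-function form of selective attention, assuming the discounting function is *f*(*ν*) = *ν*^{k} as argued above. The function names and the example likelihood ratios are hypothetical.

```python
import math


def discount(nu, k):
    """Attentional discounting of a distractor likelihood ratio: f(nu) = nu**k."""
    return nu ** k


def attended_log_likelihood_ratio(target_lrs, distractor_lrs, k):
    """log L = sum_i log(u_i) + k * sum_j log(nu_j)."""
    target_term = sum(math.log(u) for u in target_lrs)
    distractor_term = sum(math.log(nu) for nu in distractor_lrs)
    return target_term + k * distractor_term


target_lrs = [1.3, 0.9, 1.1]   # u_i, from attended elements (made-up values)
distractor_lrs = [1.2, 0.8]    # nu_j, from unattended elements (made-up values)

for k in (0.0, 0.5, 1.0):
    print(k, attended_log_likelihood_ratio(target_lrs, distractor_lrs, k))

# Grouping two distractor elements into one leaves the result unchanged,
# which is the constraint that singles out power functions.
print(discount(1.2 * 0.8, 0.5), discount(1.2, 0.5) * discount(0.8, 0.5))
```

With *k* = 0 the distractor term vanishes, and with *k* = 1 the result is the ordinary log likelihood ratio over all elements.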

*k* = *k*_{2}/*k*_{1}, then the likelihood ratio in Equation 10 exceeds 1 if and only if the likelihood ratio in Equation 14 exceeds 1, so an unbiased observer would give the same response regardless of which expression he used. Hence, for an unbiased observer, we can assume that selective attention affects only the likelihood ratios corresponding to unattended stimuli. If an observer is biased (i.e., adopts a likelihood ratio criterion different from 1), then models (10) and (14) are not equivalent, and we might be able to compare these models experimentally by persuading the observer to use an extreme criterion. Here we do not consider the case of a biased observer.^{2}

*s*, is equal to the target displacement, which we will call *T*. An unbiased observer of this type responds “right” if *s* is greater than zero, and “left” if *s* is less than zero. This strategy can be represented as a vertical decision line that divides the decision space in two (e.g., the red line in Figure 1). On the other hand, if the observer cannot selectively attend to the target dots, his responses will be based on some combination of the total horizontal target displacement *T* and the total horizontal distractor displacement, which we will call *D*. As in Equation 12, we will model the observer’s decision variable *s* as a weighted sum of the internal responses to the target and distractor dots:

$$s = T + kD. \qquad (15)$$

The attentional weight *k* assigned to the distractor dots determines the influence of the distractors on the observer’s responses. For an observer for whom *k* ≠ 0, the decision line is not vertical, but rather has slope −1/*k* (e.g., the blue line in Figure 1).

*k* in Equation 15 is not just an arbitrary free parameter, but that it is actually the attentional weight that our hypothetical observer assigns to the distractors. Hence Equation 15 results from a direct application of our account of attentional weight to the task of judging the direction of target dots mixed with distractor dots in a random dot cinematogram.

*M* shows the mean total horizontal displacement of the target and distractor dots over all trials where the target signal dots move right, indicating that on average the target dots move to the right and the distractor dots have zero displacement. The green dot *M*_{R} shows the mean target and distractor displacements over all trials where the target moves right and the observer responds “right.” As indicated by the dashed line in Figure 2, this conditional mean is shifted from the unconditional mean along a line that is perpendicular to the decision line. This follows from the fact that the distribution of target and distractor displacements is radially symmetric: the part of the distribution that falls on one side of the decision line is mirror-symmetric about a line that is perpendicular to the decision line and passes through *M*, so the mean over all trials where the observer responds “right” must lie along this line. Similarly, the mean displacement *M*_{L} over all trials where the target moves right and the observer responds “left” is shifted from the overall mean along the same line in the opposite direction, as indicated by the large red dot. The slope of the decision line is −1/*k*, so the slope of the perpendicular line connecting the two conditional means is *k*, the attentional weight that the observer assigns to the distractor dots.

*C* represent the correct response on a given trial, taking the value +1 or −1 on trials where the correct response is “right” or “left,” respectively. Similarly, let the random variable *R* represent the observer’s responses, taking the value +1 or −1 on trials where the observer responds “right” or “left,” respectively. With this notation, the coordinates of the conditional mean displacements *M*_{R} and *M*_{L} are

$$M_R = \big(E[T \mid C = 1, R = 1],\; E[D \mid C = 1, R = 1]\big), \qquad M_L = \big(E[T \mid C = 1, R = -1],\; E[D \mid C = 1, R = -1]\big),$$

and we have just shown that the slope of the line connecting these points is *k*:

$$k = \frac{E[D \mid C = 1, R = 1] - E[D \mid C = 1, R = -1]}{E[T \mid C = 1, R = 1] - E[T \mid C = 1, R = -1]}. \qquad (17)$$

We can obtain a second, independent estimate of *k* by using Equation 17 with *C* = −1 (i.e., finding the slope of the line connecting the conditional means over all trials where the correct answer is “left”).

*k* from measurable quantities, but a reformulation makes the meaning of this expression much clearer. First, note that the coordinates of *M*_{R} and *M*_{L} with respect to an origin *M* = (*μ*_{T}, *μ*_{D}) at the mean of the distribution of target and distractor displacements are

$$M_R' = \big(E[T \mid C = 1, R = 1] - \mu_T,\; E[D \mid C = 1, R = 1] - \mu_D\big), \qquad M_L' = \big(E[T \mid C = 1, R = -1] - \mu_T,\; E[D \mid C = 1, R = -1] - \mu_D\big),$$

and, of course, we obtain the same value of *k* if we calculate the slope of the connecting line in this coordinate frame. Second, because *M*′_{R}, *M*′_{L}, and *M* are collinear, we obtain the same value for *k* if we multiply *M*′_{R} by *P*(*R* = 1)(1 − *μ*_{R}) and multiply *M*′_{L} by −*P*(*R* = −1)(−1 − *μ*_{R}), where *μ*_{R} = *E*[*R*]. These transformations convert Equation 17 into a ratio of covariances:

$$k = \frac{\operatorname{Cov}(D, R \mid C = 1)}{\operatorname{Cov}(T, R \mid C = 1)}.$$

Hence to find the attentional weight that the observer assigns to the distractor dots, we can measure the covariance between the total horizontal target and distractor dot displacements and the observer’s responses, over all trials where the correct answer is “right,” and take the ratio of these two covariances. That is, the attentional weight is equal to the influence of the distractor dots on the observer’s responses, as a proportion of the influence of the target dots on the observer’s responses.
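The covariance-ratio estimate can be checked on a simulated model observer. The sketch below assumes a noise-free observer with decision variable *s* = *T* + *kD*, independent unit-variance Gaussian *T* and *D*, and only trials on which the correct response is “right”; the mean target displacement of 0.5, the trial count, and all names are illustrative choices rather than values from the experiments.

```python
import random


def simulate_trials(k, n_trials, target_mean=0.5, seed=0):
    """Simulate (T, D, R) on trials where the target signal moves right."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        t = rng.gauss(target_mean, 1.0)   # total target displacement
        d = rng.gauss(0.0, 1.0)           # total distractor displacement
        r = 1 if t + k * d > 0 else -1    # unbiased observer: criterion at 0
        trials.append((t, d, r))
    return trials


def covariance(xs, ys):
    """Sample covariance (normalized by n)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def estimate_k_covariance(trials):
    """Attentional weight as Cov(D, R) / Cov(T, R)."""
    ts, ds, rs = zip(*trials)
    return covariance(ds, rs) / covariance(ts, rs)


def estimate_k_slope(trials):
    """Attentional weight as the slope of the line joining the conditional means."""
    right = [(t, d) for t, d, r in trials if r == 1]
    left = [(t, d) for t, d, r in trials if r == -1]
    mean = lambda values: sum(values) / len(values)
    delta_t = mean([t for t, _ in right]) - mean([t for t, _ in left])
    delta_d = mean([d for _, d in right]) - mean([d for _, d in left])
    return delta_d / delta_t


trials = simulate_trials(k=0.5, n_trials=100_000)
k_cov = estimate_k_covariance(trials)
k_slope = estimate_k_slope(trials)
print(k_cov, k_slope)
```

The two estimators agree up to rounding error on any data set, because the sample covariance with a ±1 response is proportional to the difference between the two conditional means, with the same constant for *T* and *D*.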

(*T*, *D*) is radially symmetric over all trials where the target dots move in a given direction. The random variables *T* and *D* are independent, so this is true only if they are Gaussian and their variances are equal. Both *T* and *D* are the sum of many horizontal dot displacements, so the central limit theorem ensures that they will be approximately Gaussian. However, in the random dot cinematograms in the experiments we report below, there are an equal number of target and distractor dots, and a small number of target dots always move in a given direction, so there are slightly fewer randomly moving target dots than randomly moving distractor dots. Consequently, the variance of *T* is actually slightly less than the variance of *D*. In , we show that we can correct for this difference by adjusting *k* by a factor (*N* − *n*_{T})/*N*, where *N* is the total number of target dots, and *n*_{T} is the number of target dots that move directly left or right. The corrected expressions are

$$k = \frac{N - n_T}{N} \cdot \frac{E[D \mid C = 1, R = 1] - E[D \mid C = 1, R = -1]}{E[T \mid C = 1, R = 1] - E[T \mid C = 1, R = -1]}, \qquad k = \frac{N - n_T}{N} \cdot \frac{\operatorname{Cov}(D, R \mid C = 1)}{\operatorname{Cov}(T, R \mid C = 1)}.$$

When the coherently moving target dots make up only a small proportion of the dots in the cinematogram, as is usual, this correction is negligible compared to experimental error.

*s* = *T* + *kD*. This allowed us to calculate the exact value of the random variables *T* and *D* on each trial, directly from the stimulus. With this information, we were able to locate each trial in the observer’s decision space, as in Figure 2, and recover the attentional weight *k* by finding the slope of the line connecting the mean internal responses over all trials where the observer responded “left” or “right.” However, real observers’ decision variables are certainly not *s* = *T* + *kD*. First of all, real observers have internal noise, and, second, observers might compute some quantity other than the horizontal displacement of the target and distractor dots (e.g., an observer might count the number of dots that move directly to the left or right, or monitor the activation of 30°-wide motion channels). This seems to pose a problem for our method of measuring attentional weight, as this method apparently relies on our knowing the observer’s internal responses to the target and distractor dots on every trial.

*T** computed from the target dots and a quantity *D** computed from the distractor dots:

$$s = T^{*} + kD^{*}. \qquad (24)$$

Second, we assume that *T** and *D** are computed by summing responses to individual target and distractor dot displacements, and that the observer has the same selectivity *f* for target and distractor dot displacements. We also assume that *T** and *D** are contaminated by independent, equal-variance internal noise sources *Z*_{T} and *Z*_{D}. Thus we can write the internal responses *T** and *D** as

$$T^{*} = \sum_i f(t_i) + Z_T, \qquad (25)$$

$$D^{*} = \sum_i f(d_i) + Z_D. \qquad (26)$$

Here *t*_{i} and *d*_{i} are random variables, perhaps multidimensional, that describe the relevant properties of individual target and distractor dot displacements, respectively. For instance, to describe an observer who performs the direction discrimination task using 30°-wide motion channels, but is less affected by dots at greater eccentricities, the random variables *t*_{i} and *d*_{i} would report the direction and eccentricity of each dot displacement, and the function *f* would describe the observer’s selectivity to dots in each direction, at each eccentricity. Such noisy linear-filter models have been found to give a good account of global motion perception under a wide range of conditions (Watamaniuk, 1993; Zohary et al., 1996).

*d*′ versus the number of signal dots. Let *f*_{R} and *f*_{L} be the mean value of *f*(*t*_{i}) when *t*_{i} is a dot that steps directly to the right or to the left, respectively. If an observer can perform the direction discrimination task at all, then *f*_{R} ≠ *f*_{L}, and in a task with *n*_{T} target signal dots, the difference in the mean of *T** when the dots move to the right or to the left is *n*_{T}(*f*_{R} − *f*_{L}). Furthermore, if *n*_{T} is much smaller than the total number of dots in the cinematogram, then the variance *σ*_{s}^{2} of the observer’s decision variable is largely independent of *n*_{T}. Consequently, the observer’s sensitivity is *d*′ = *n*_{T}(*f*_{R} − *f*_{L})/*σ*_{s}, indicating that the psychometric function is linear when plotted as *d*′ versus the number of signal dots. In Experiment 1, we measured psychometric functions in a global direction discrimination task to test the linearity assumption implicit in this model.

*D** from the distractors, regardless of whether the distractors are fully attended (*k* = 1) or partially or completely unattended (*k* < 1); selective attention merely modulates the influence of this internal response on the decision variable. In other words, this account implies that selective attention does not qualitatively change how the observer processes the distractors, but only attenuates the influence that the distractors have on the observer’s responses. Of course, we cannot know a priori whether this is true of human observers, and it may be that in some tasks, processing of attended and unattended stimuli is qualitatively different. For instance, it may be that when observers judge global direction of motion in random dot cinematograms, the directional selectivity of motion channels is different for attended and partly unattended dots. Accordingly, we cannot be certain that attentional weight is an appropriate measure of selective attention until we compare how observers process attended and unattended stimuli.

*f*(*θ*), then we can estimate the directional selectivity function *f*(*θ*) by measuring the influence that each dot moving in direction *θ* has on the observer’s responses. Specifically, we show that the conditional probability that an observer responds “right” when an arbitrarily chosen dot moves in direction *θ* is related to the directional selectivity function *f*(*θ*) as follows: where *u* and *ν* are constants. In Experiment 1, we used the HCA method to compare direction selectivity for attended and unattended dots in a global direction discrimination task.

*k* = 1. Finally, we used the HCA method developed by Chubb et al. (1999) to measure directional selectivity for target and distractor dots, to see whether selective attention led to qualitative differences in processing of targets and distractors, or merely reduced the influence of distractors on observers’ responses.

*c*_{W} = (*L* − *L*_{bg})/*L*_{bg}, where *L* is the luminance of the point of interest, and *L*_{bg} is background luminance. The stimuli were shown on a gray background of luminance 40 cd/m^{2}.

| Condition | Observer | Mean target displacement (deg): Target R, Response R | Mean target displacement (deg): Target R, Response L | Mean target displacement (deg): Target L, Response R | Mean target displacement (deg): Target L, Response L | Mean distractor displacement (deg): Target R, Response R | Mean distractor displacement (deg): Target R, Response L | Mean distractor displacement (deg): Target L, Response R | Mean distractor displacement (deg): Target L, Response L |
|---|---|---|---|---|---|---|---|---|---|
| 50L50L | A.N.C. | 2.364 | 2.263 | −2.277 | −2.389 | 0.022 | −0.087 | 0.083 | −0.047 |
| 50L50L | C.P.T. | 1.786 | 1.679 | −1.658 | −1.795 | 0.033 | −0.081 | 0.093 | −0.038 |
| 50L50L | R.F.M. | 0.688 | 0.453 | −0.417 | −0.682 | 0.104 | −0.142 | 0.182 | −0.089 |
| 50L50D | A.N.C. | 2.364 | 2.277 | −2.284 | −2.376 | 0.024 | −0.074 | 0.077 | −0.025 |
| 50L50D | C.P.T. | 1.794 | 1.640 | −1.680 | −1.799 | 0.017 | −0.094 | 0.038 | −0.011 |
| 50L50D | R.F.M. | 0.691 | 0.441 | −0.420 | −0.672 | 0.084 | −0.124 | 0.179 | −0.054 |

This table shows the mean total rightward displacement of the target and distractor dots, conditional on the target signal dots moving left or right and the observer responding “left” or “right.” For example, the top left entry shows that for observer A.N.C. in the 50L50L condition, the average total target dot displacement was 2.364 deg to the right on trials where the target signal dots moved right and the observer responded “right.” The values in this table are the coordinates of the conditional mean displacements shown in Figures 5 and 6 as red and green dots. Note that both the target and the distractor displacements were correlated with observers’ responses: all mean displacements were further to the right when observers responded “right” than when observers responded “left.” This was true even in the 50L50D condition, where observers tried to ignore the distractor dots.
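As a rough worked example, the conditional means in this table can be turned into slope estimates of attentional weight: the difference in mean distractor displacement between “right” and “left” responses, divided by the corresponding difference in mean target displacement. The sketch below uses the tabulated values only; the dictionary layout and function name are hypothetical. These uncorrected slopes, computed from rounded means, are not expected to reproduce the reported estimates of *k* exactly, since those also include the variance correction and are computed from the full trial data.

```python
# Conditional means transcribed from the table above, in degrees.
# Each tuple is ordered:
# (target R & response R, target R & response L,
#  target L & response R, target L & response L).
TABLE_1 = {
    ("50L50L", "A.N.C."): ((2.364, 2.263, -2.277, -2.389), (0.022, -0.087, 0.083, -0.047)),
    ("50L50L", "C.P.T."): ((1.786, 1.679, -1.658, -1.795), (0.033, -0.081, 0.093, -0.038)),
    ("50L50L", "R.F.M."): ((0.688, 0.453, -0.417, -0.682), (0.104, -0.142, 0.182, -0.089)),
    ("50L50D", "A.N.C."): ((2.364, 2.277, -2.284, -2.376), (0.024, -0.074, 0.077, -0.025)),
    ("50L50D", "C.P.T."): ((1.794, 1.640, -1.680, -1.799), (0.017, -0.094, 0.038, -0.011)),
    ("50L50D", "R.F.M."): ((0.691, 0.441, -0.420, -0.672), (0.084, -0.124, 0.179, -0.054)),
}


def slope_estimates(target_means, distractor_means):
    """Slope of the line joining the conditional means, for each target direction."""
    t_rr, t_rl, t_lr, t_ll = target_means
    d_rr, d_rl, d_lr, d_ll = distractor_means
    k_target_right = (d_rr - d_rl) / (t_rr - t_rl)
    k_target_left = (d_lr - d_ll) / (t_lr - t_ll)
    return k_target_right, k_target_left


for (condition, observer), (t_means, d_means) in TABLE_1.items():
    k_right, k_left = slope_estimates(t_means, d_means)
    print(f"{condition} {observer}: k = {k_right:.2f} (target R), {k_left:.2f} (target L)")
```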

*k* = 0.95 ± 0.16, for C.P.T., *k* = 0.87 ± 0.19, and for R.F.M., *k* = 0.99 ± 0.09. The error values are SEs. None of these estimates of *k* is significantly different from the anticipated value of 1 (*p* > .40 for all comparisons in a two-tailed test). The slanted blue lines in Figure 5 show the decision lines, *T* + *kD* = 0, corresponding to these values of *k*.

*k* = 0.93 ± 0.20, for C.P.T., *k* = 0.52 ± 0.15, and for R.F.M., *k* = 0.84 ± 0.09. All these estimates of *k* are significantly greater than zero (*p* < .001 for all comparisons), none is significantly less than the observer’s corresponding value in the 50L50L condition (*p* > .10 for all comparisons), and only C.P.T.’s is significantly less than 1 (*p* < .01).

*misperceive* distractors as targets. At a Weber contrast of ±40%, the targets and distractors are highly discriminable. Rather, these results show that observers cannot make global direction judgments based solely on the directions of white target dots, in the presence of black distractor dots.

*θ*, and we averaged these direction selectivity functions separately over all target dots and over all distractor dots. Figure 7 shows the influence of target dots and distractor dots on the observers’ responses, as a function of dot direction, averaged across all three observers. The directional tuning was approximately sinusoidal for both attended and unattended dots, varying as cos *θ*, indicating that observers based their responses on the horizontal displacements of both target and distractor dots. Evidently observers processed attended and unattended stimuli in the same way, at least in terms of their directional selectivity. Furthermore, the best-fitting sinusoids had slightly different amplitudes, reflecting the fact that unattended dots had less overall influence on the observer’s responses. These results support the notion that observers have the same selectivity for attended and unattended stimulus elements, and that selective attention operates by uniformly reducing the influence of distractor elements on observers’ responses.

*c*_{M} = (*L*_{max} − *L*_{min})/(*L*_{max} + *L*_{min}), where *L*_{max} is the maximum luminance in the region of interest and *L*_{min} is the minimum luminance.) A mismatch like this might lead us to underestimate observers’ abilities to direct attention according to contrast polarity, as the black dots might evoke a stronger response in motion channels than the white dots, and therefore be more difficult to ignore than black dots with properly equated contrast magnitudes.
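To make the two contrast definitions concrete, the sketch below evaluates both for dots at Weber contrasts of +40% and −40% on a 40 cd/m^{2} background, the values mentioned earlier in the text. It assumes, purely for illustration, that the region of interest for Michelson contrast contains a single dot and the background; the function names are hypothetical.

```python
def weber_contrast(luminance, background):
    """c_W = (L - L_bg) / L_bg."""
    return (luminance - background) / background


def michelson_contrast(l_max, l_min):
    """c_M = (L_max - L_min) / (L_max + L_min)."""
    return (l_max - l_min) / (l_max + l_min)


BACKGROUND = 40.0   # cd/m^2
WHITE_DOT = 56.0    # cd/m^2, Weber contrast +0.4 against the background
BLACK_DOT = 24.0    # cd/m^2, Weber contrast -0.4 against the background

print(weber_contrast(WHITE_DOT, BACKGROUND), weber_contrast(BLACK_DOT, BACKGROUND))

# Michelson contrast of each dot against the background (assumed region of interest).
white_michelson = michelson_contrast(WHITE_DOT, BACKGROUND)
black_michelson = michelson_contrast(BACKGROUND, BLACK_DOT)
print(white_michelson, black_michelson)
```

Under this assumption, dots with equal Weber contrast magnitudes have unequal Michelson contrasts, with the black dot having the larger value, which is the kind of mismatch discussed above.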

*s* = *T** + *kD**, described by Equations 24 through 26. We will derive expressions that show how attentional weight is related to the probability that the observer responds “right,” depending on whether the target and distractor signal dots move left or right.

*T**. Let *μ*_{T*} be the expected value of *T** over all trials, and let Δ*μ*_{T*} be the difference in the expected value of *T** between trials where the target signal dots move right and trials where they move left. There are an equal number of signal-left and signal-right trials, so the overall mean *μ*_{T*} lies midway between the means over signal-left trials and signal-right trials, and we can write the mean of *T** as *μ*_{T*} ± 0.5Δ*μ*_{T*}, where the sign depends on whether the target signal dots move right or left. Furthermore, the variance of *T** is the same over trials where the target signal dots move right and trials where they move left, because the signal dot displacements are constant within the signal-left and signal-right classes of trials, and do not contribute to the variance. We will denote the variance of *T** on signal-left or signal-right trials as *σ*^{2}_{T*}. For later convenience, we define *d*′_{T*} = Δ*μ*_{T*}/*σ*_{T*}, which is the sensitivity of *T** to the difference between signal-left and signal-right trials.

*D**. Just as with *T**, we can write the mean of *D** as *μ*_{D*} ± 0.5Δ*μ*_{D*}, where the sign depends on whether the distractor signal dots move left or right. Also, the variance of *D** is the same regardless of whether the distractor signal dots move left or right, and we will denote this variance by *σ*^{2}_{D*}.

*T** and *D** are related. Both *T** and *D** are calculated by summing internal responses to individual dot displacements, so Δ*μ*_{T*} and Δ*μ*_{D*} are proportional to the number of target and distractor signal dots, respectively. In the task we are considering (Figure 8), there are twice as many target signal dots as distractor signal dots, so Δ*μ*_{T*} = 2Δ*μ*_{D*}. Furthermore, there are approximately the same number of target noise dots as distractor noise dots, so the variances of *T** and *D** are approximately equal: *σ*^{2}_{T*} ≈ *σ*^{2}_{D*}. (We will return to this approximation shortly.)

*s* = *T** + *kD**. The mean of *s* is *μ*_{T*} + *kμ*_{D*} ± 0.5Δ*μ*_{T*} ± 0.5*k*Δ*μ*_{D*}, where the signs depend on whether the target and distractor signal dots move left or right. In this task, Δ*μ*_{D*} = 0.5Δ*μ*_{T*}, so the mean of *s* is *μ*_{T*} + *kμ*_{D*} ± 0.5Δ*μ*_{T*} ± 0.25*k*Δ*μ*_{T*}. The variance of *s* is *σ*^{2}_{T*} + *k*^{2}*σ*^{2}_{D*}, and to a close approximation *σ*^{2}_{D*} ≈ *σ*^{2}_{T*}, so we can rewrite the variance as (1 + *k*^{2})*σ*^{2}_{T*}. The midpoint of the distribution of *s* is *μ*_{T*} + *kμ*_{D*}, which is therefore the response criterion of an unbiased observer.

(1 + *k*^{2})*σ*^{2}_{T*}. Hence on an RR trial, the probability that the decision variable exceeds the observer’s criterion, and the observer responds “right,” is

$$p_{RR} = G\!\left(\frac{(0.5 + 0.25k)\, d'_{T}}{\sqrt{1 + k^{2}}}\right). \qquad (30)$$

Here *G*(*x*, *μ*, *σ*) is the normal cumulative distribution function, and when we omit arguments *μ* and *σ*, they default to 0 and 1, respectively.

*RR* trials. Hence the probability of the observer responding “right” on an *RL* trial is

$$p_{RL} = G\!\left(\frac{(0.5 - 0.25k)\, d'_{T}}{\sqrt{1 + k^{2}}}\right). \qquad (31)$$

Similarly, the probabilities of the observer responding “right” when the target moves to the left and the distractor moves to the left (*p*_{LL}) or to the right (*p*_{LR}) are

$$p_{LL} = G\!\left(\frac{(-0.5 - 0.25k)\, d'_{T}}{\sqrt{1 + k^{2}}}\right), \qquad (32)$$

$$p_{LR} = G\!\left(\frac{(-0.5 + 0.25k)\, d'_{T}}{\sqrt{1 + k^{2}}}\right). \qquad (33)$$

*k* and *d*′_{T} as a function of the conditional response probabilities, and solve Equations 32 and 33 to give another independent estimate. However, when analyzing data from simulated model observers with known values of attentional weight, we have found the estimates of *k* and *d*′_{T} to be less variable when we solve all four equations simultaneously, using a simplex search to find the values of *k* and *d*′_{T} that minimize the sum-of-squares error between the left- and right-hand sides of Equations 30 through 33. This is the method that we recommend, so we will not derive explicit expressions for *k* and *d*′_{T} as a function of the conditional response probabilities.
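The forward model behind this method can be sketched as follows. The code assumes that each conditional probability of a “right” response has the form *G*((±0.5 ± 0.25*k*)*d*′_{T}/√(1 + *k*^{2})), which is what the means and variance of *s* derived above imply when the criterion is at the midpoint and there are twice as many target as distractor signal dots. The function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function, G(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def response_probabilities(k, d_prime):
    """P("right") for each trial type, keyed by (target, distractor) direction.

    "RL" means the target signal moves right and the distractor signal moves left.
    """
    scale = d_prime / math.sqrt(1.0 + k * k)
    return {
        "RR": norm_cdf((0.5 + 0.25 * k) * scale),
        "RL": norm_cdf((0.5 - 0.25 * k) * scale),
        "LR": norm_cdf((-0.5 + 0.25 * k) * scale),
        "LL": norm_cdf((-0.5 - 0.25 * k) * scale),
    }


def sum_of_squares_error(k, d_prime, observed):
    """Squared error between predicted and observed response probabilities."""
    predicted = response_probabilities(k, d_prime)
    return sum((predicted[key] - observed[key]) ** 2 for key in observed)


print(response_probabilities(k=1.0, d_prime=2.0))
print(response_probabilities(k=0.0, d_prime=2.0))
```

With *k* = 0 the distractor direction has no effect, so the RR and RL probabilities coincide; as *k* grows, the gap between them widens. An error function like `sum_of_squares_error` is the quantity that a simplex search would minimize over *k* and *d*′_{T}.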

*T** and *D** are equal, *σ*^{2}_{T*} = *σ*^{2}_{D*}. In fact, the variance of *T** is slightly less than the variance of *D** in our experiments, because there are more target signal dots than distractor signal dots, and hence fewer target noise dots than distractor noise dots. We could derive exact expressions for *k* and *d*′_{T} that do not use this assumption, but we will not do so for two reasons. First, the approximation is very accurate. In the following experiment, the target distribution had on average only three or four more signal dots per frame than the distractor distribution, so the numbers of target and distractor noise dots were approximately equal, and the bias introduced by this approximation is small compared to experimental error. Second, we used stimuli with different numbers of target and distractor signal dots only in order to make our stimuli as similar as possible to those in earlier studies (e.g., Edwards & Badcock, 1994), and it would be easy to do away with the approximation simply by using an equal number of target and distractor noise dots. In any case, in a task where this approximation is inadequate, it should be clear how Equations 30 through 33 could be rederived without the approximation.

*k* and *d*′_{T} that we calculated from these conditional response probabilities, using a simplex search to find the values of *k* and *d*′_{T} that minimized the sum-of-squares error between the left- and right-hand sides of Equations 30 through 33. The estimates of *k* ranged from 0.86 to 1.12, and the mean estimate across observers was 0.99 ± 0.04. Neither the individual estimates nor the mean estimate was significantly different from the anticipated value of 1 (*p* > .20 for all comparisons).

| Observer | Target direction | Distractor R | Distractor L | Attentional weight *k* | Sensitivity *d*′_{T} |
|---|---|---|---|---|---|
| A.J.R. | Target R | 0.75 | 0.55 | 1.12 ± 0.14 | 1.44 ± 0.11 |
| | Target L | 0.38 | 0.21 | | |
| K.E.H. | Target R | 0.83 | 0.67 | 0.86 ± 0.11 | 1.59 ± 0.10 |
| | Target L | 0.41 | 0.22 | | |
| R.F.M. | Target R | 0.86 | 0.65 | 0.98 ± 0.08 | 2.02 ± 0.10 |
| | Target L | 0.36 | 0.14 | | |
| S.A.K. | Target R | 0.60 | 0.54 | 0.95 ± 0.27 | 0.66 ± 0.09 |
| | Target L | 0.44 | 0.32 | | |
| T.F.S. | Target R | 0.71 | 0.59 | 1.02 ± 0.16 | 1.18 ± 0.10 |
| | Target L | 0.43 | 0.24 | | |

The first two columns of numbers show the proportion of trials on which the observer responded “right,” conditional on the target and distractor signal dots moving left or right. For example, the top left cell shows that observer A.J.R. responded “right” on 75% of the trials on which both the target and the distractor signal dots moved to the right. The third and fourth columns show the attentional weight *k* and the target sensitivity *d*′_{T} calculated from these conditional response probabilities using the methods described in the text. The error values are SEs.
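As a rough check on how such estimates arise, the sketch below fits *k* and *d*′_{T} to one observer’s four response proportions by minimizing the sum-of-squares error. Two things are assumptions rather than the authors’ procedure: the predicted probabilities are taken to be *G*((±0.5 ± 0.25*k*)*d*′_{T}/√(1 + *k*^{2})), and a simple coarse-to-fine grid search stands in for the simplex search so that the example needs only the standard library. Because the tabulated proportions are rounded, the fitted values should only approximate the reported ones.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def predicted(k, d_prime):
    """Predicted P("right") keyed by (target, distractor) signal direction."""
    scale = d_prime / math.sqrt(1.0 + k * k)
    return {
        "RR": norm_cdf((0.5 + 0.25 * k) * scale),
        "RL": norm_cdf((0.5 - 0.25 * k) * scale),
        "LR": norm_cdf((-0.5 + 0.25 * k) * scale),
        "LL": norm_cdf((-0.5 - 0.25 * k) * scale),
    }


def sse(k, d_prime, observed):
    """Sum-of-squares error between predicted and observed proportions."""
    pred = predicted(k, d_prime)
    return sum((pred[key] - observed[key]) ** 2 for key in observed)


def fit_probe(observed, rounds=5, steps=40):
    """Coarse-to-fine grid search for (k, d') minimizing the squared error."""
    k_lo, k_hi, d_lo, d_hi = 0.0, 2.0, 0.0, 4.0   # assumed search ranges
    best = (float("inf"), None, None)
    for _ in range(rounds):
        k_step = (k_hi - k_lo) / steps
        d_step = (d_hi - d_lo) / steps
        for i in range(steps + 1):
            for j in range(steps + 1):
                k, d = k_lo + i * k_step, d_lo + j * d_step
                err = sse(k, d, observed)
                if err < best[0]:
                    best = (err, k, d)
        # Shrink the search window around the best point found so far.
        _, k_best, d_best = best
        k_lo, k_hi = k_best - 2 * k_step, k_best + 2 * k_step
        d_lo, d_hi = d_best - 2 * d_step, d_best + 2 * d_step
    return best[1], best[2]


# Response proportions for two observers, transcribed from the table above.
rfm = {"RR": 0.86, "RL": 0.65, "LR": 0.36, "LL": 0.14}
ajr = {"RR": 0.75, "RL": 0.55, "LR": 0.38, "LL": 0.21}

for name, data in (("R.F.M.", rfm), ("A.J.R.", ajr)):
    k_hat, d_hat = fit_probe(data)
    print(f"{name}: k = {k_hat:.2f}, d' = {d_hat:.2f}")
```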

*d*′_{T} ranged from 0.66 to 2.02. Although we chose the number of signal dots to maintain 70% correct performance based on a pilot session, some observers performed markedly better or worse than this in the main experiment. This is reflected in the wide range of values of *d*′_{T}.

*k* and target sensitivity *d*′_{T}, in the 50L50D condition. The estimates of *k* ranged from 0.22 to 0.73, and the mean estimate across observers was 0.52 ± 0.09. Each individual estimate of *k*, as well as the mean estimate, was significantly greater than zero and significantly less than 1 (*p* < .01 for all comparisons), and significantly less than the corresponding value of *k* in the 50L50L condition (*p* < .05 for all comparisons). Despite individual differences, all observers had a limited ability to direct attention according to contrast polarity: all were appreciably influenced by opposite-polarity distractors, but not as much as by same-polarity distractors.
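The across-observer summaries quoted in the text can be checked directly from the per-observer estimates of *k* in the two Experiment 2 tables. The sketch below assumes that the "±" value attached to each mean is the standard error of the mean across the five observers (sample SD divided by √*n*); the text does not spell this out, so treat it as an assumption. The function and variable names are illustrative only.

```python
import math
import statistics


def mean_and_se(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Per-observer attentional weights k, copied from the two Experiment 2 tables
# (observers A.J.R., K.E.H., R.F.M., S.A.K., T.F.S.).
k_same_polarity = [1.12, 0.86, 0.98, 0.95, 1.02]      # 50L50L condition
k_opposite_polarity = [0.62, 0.48, 0.73, 0.22, 0.54]  # 50L50D condition

for label, ks in [("50L50L", k_same_polarity), ("50L50D", k_opposite_polarity)]:
    mean, se = mean_and_se(ks)
    print(f"{label}: mean k = {mean:.2f} ± {se:.2f}")
```

Under that assumption the sketch reproduces the reported 0.99 ± 0.04 and 0.52 ± 0.09.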

| Observer | Target | Distractor R | Distractor L | Attentional weight *k* | Sensitivity *d*′_{T} |
|---|---|---|---|---|---|
| A.J.R. | Target R | 0.78 | 0.65 | 0.62 ± 0.10 | 1.60 ± 0.09 |
| | Target L | 0.29 | 0.15 | | |
| K.E.H. | Target R | 0.78 | 0.70 | 0.48 ± 0.10 | 1.48 ± 0.08 |
| | Target L | 0.31 | 0.19 | | |
| R.F.M. | Target R | 0.87 | 0.68 | 0.73 ± 0.08 | 1.96 ± 0.10 |
| | Target L | 0.30 | 0.15 | | |
| S.A.K. | Target R | 0.87 | 0.82 | 0.22 ± 0.08 | 1.80 ± 0.08 |
| | Target L | 0.26 | 0.20 | | |
| T.F.S. | Target R | 0.80 | 0.66 | 0.54 ± 0.10 | 1.42 ± 0.08 |
| | Target L | 0.31 | 0.23 | | |

See notes for Table 2.

*k* that we obtained in this experiment were at least as small as the SEs obtained in Experiment 1, even though we collected fewer than one third as many trials per observer in this experiment. When we applied the sampling noise method used in Experiment 1 to this experiment’s data, calculating the covariance ratio within each of the four clusters of trials (RR, RL, LR, and LL) and averaging the resulting four estimates of *k*, we found that the SEs were at least twice as large as the SEs obtained with the probe method, and in one half the cases were greater than 1, rendering the estimates of *k* practically useless. Clearly, the probe method is a more efficient way of measuring the attentional weight that observers assign to distractors.

*k* and the target sensitivity *d*′_{T} are the same on all trials, regardless of whether the distractor signal is in the same direction or the opposite direction as the target signal. This assumption is reasonable, but it could be false. It could be that when the target and distractor dots move in opposite directions, the observer perceives two transparent sheets of dots sliding over one another, and that this perceptual segregation helps the observer ignore the distractors. If this were so, the attentional weight would be lower on opposite-direction trials. In contrast, the sampling noise method used in Experiment 1 relies on small statistical variations in the stimulus, making it unlikely that the observer’s decision rule changes systematically from one stimulus to the next. To state the problem more generally, the probe method introduces larger variations into the stimulus, and this makes the method faster but also relies on the assumption that these variations do not change the observer’s decision rule. This is the reason we present both methods here, even though the more efficient probe method is to be preferred in tasks where its assumptions are met. In our experiments, we obtained similar results with both methods, indicating that the slightly stronger assumptions of the probe method were at least approximately satisfied.

*μ*_{W}, and the orientations of the black blobs were normally distributed with mean *μ*_{B}. The SDs of the white and black orientation distributions were the same. On each trial, the mean orientations *μ*_{W} and *μ*_{B} were randomly and independently set to a small, fixed angle *μ* clockwise or counterclockwise of vertical. In the White condition, observers judged whether the mean orientation of the white blobs was clockwise or counterclockwise of vertical, and in the Black condition, observers judged whether the mean orientation of the black blobs was clockwise or counterclockwise of vertical.

*s* = *T** + *kD**, and in this task […] and […]. Proceeding exactly as in the introduction to Experiment 2 (“The Probe Method of Measuring Attentional Weight”), we can write the probability of the observer responding clockwise when both the target and the distractor are clockwise of vertical as […]. Similarly, on trials where the target is clockwise and the distractor is counterclockwise, the probability of a clockwise response is […]. The same analysis applies to trials where the target distribution is counterclockwise of vertical: […]. As in Experiment 2, we used a simplex search to find the best-fitting values of *k* and *d*′_{T}, given the measured response probabilities.
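To make the logic of this fit concrete, here is a minimal sketch of how *k* and *d*′_{T} can be recovered from the four conditional response probabilities. It is not the simplex search described in the text. It is a closed-form shortcut under assumptions introduced here for illustration: *T** and *D** are independent normal variables with means ±*μ* and a common standard deviation *σ*, *d*′_{T} = 2*μ*/*σ* is the separation of the two target means in SD units, and the observer applies a fixed criterion to *s* = *T** + *kD**. Under those assumptions the z-transformed probabilities satisfy z(CW,CW) − z(CCW,CCW) = (1 + *k*)*d*′_{T}/√(1 + *k*²) and z(CW,CCW) − z(CCW,CW) = (1 − *k*)*d*′_{T}/√(1 + *k*²), and the criterion cancels. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def estimate_k_and_dprime(p_tcw_dcw, p_tcw_dccw, p_tccw_dcw, p_tccw_dccw):
    """Closed-form estimate of (k, d'_T) from P("clockwise") in the four
    target/distractor sign combinations, under the assumptions stated above."""
    z = NormalDist().inv_cdf
    # Congruent trials: target and distractor point the same way.
    a = z(p_tcw_dcw) - z(p_tccw_dccw)   # (1 + k) d' / sqrt(1 + k^2)
    # Incongruent trials: target and distractor point opposite ways.
    b = z(p_tcw_dccw) - z(p_tccw_dcw)   # (1 - k) d' / sqrt(1 + k^2)
    k = (a - b) / (a + b)
    d_prime = 0.5 * (a + b) * math.sqrt(1.0 + k * k)
    return k, d_prime


# Response probabilities for observer J.A.P. from the first Experiment 3 table.
k, d_prime = estimate_k_and_dprime(0.81, 0.64, 0.36, 0.17)
print(f"k = {k:.2f}, d'_T = {d_prime:.2f}")
```

For this observer the shortcut lands close to the tabulated values (0.43 and 1.39). A least-squares fit over all four probabilities, as in the text, can differ slightly because four probabilities overdetermine the three unknowns (*k*, *d*′_{T}, and the criterion).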

*g* is the normal probability density function, and the scale constants were *σ*_{W} deg and *σ*_{L} = 0.12 deg. Fifty of the blobs were white, and 50 were black (peak Weber contrast ±0.40). The orientations of the white blobs were normally distributed with a mean *μ*_{W}° clockwise or counterclockwise of vertical, and a SD of 5°. The orientations of the black blobs were also normally distributed with a mean *μ*_{B}° clockwise or counterclockwise of vertical, and a SD of 5°. The mean orientations of the white and black subsets were randomly and independently set to *μ*° clockwise or counterclockwise of vertical on each trial. The mean angle *μ*° was chosen individually for each observer, as explained in “Procedure.” The stimuli were shown on a gray background of luminance 40 cd/m^{2}. The stimulus duration was 200 ms.

*μ* from vertical was fixed at a value found during a pilot session to give approximately 70% correct performance. For observers J.A.P. and L.C.S., this was 2.5°, and for S.U.M., it was 2.0°. Over the course of three sessions, each observer ran approximately 1,100 trials in each condition (White and Black).

*k* and target sensitivity *d*′_{T}, in both White and Black conditions. Estimates of *k* ranged from 0.13 to 0.43, and the average estimate across observers and conditions was 0.27. Each estimate of *k* was significantly greater than zero and significantly less than 1 (*p* < .05 in all comparisons), and the estimates were not significantly different across the White and Black conditions, although the difference across conditions approached significance for observer J.A.P.

| Observer | Target | Distractor CW | Distractor CCW | Attentional weight *k* | Sensitivity *d*′_{T} |
|---|---|---|---|---|---|
| J.A.P. | Target CW | 0.81 | 0.64 | 0.43 ± 0.07 | 1.39 ± 0.09 |
| | Target CCW | 0.36 | 0.17 | | |
| L.C.S. | Target CW | 0.64 | 0.58 | 0.18 ± 0.07 | 1.25 ± 0.08 |
| | Target CCW | 0.21 | 0.14 | | |
| S.U.M. | Target CW | 0.79 | 0.69 | 0.33 ± 0.09 | 0.99 ± 0.08 |
| | Target CCW | 0.45 | 0.33 | | |

CW = clockwise; CCW = counterclockwise. See caption of Table 2 for details.

| Observer | Target | Distractor CW | Distractor CCW | Attentional weight *k* | Sensitivity *d*′_{T} |
|---|---|---|---|---|---|
| J.A.P. | Target CW | 0.74 | 0.67 | 0.24 ± 0.07 | 1.28 ± 0.09 |
| | Target CCW | 0.31 | 0.18 | | |
| L.C.S. | Target CW | 0.65 | 0.60 | 0.13 ± 0.06 | 1.19 ± 0.08 |
| | Target CCW | 0.22 | 0.17 | | |
| S.U.M. | Target CW | 0.81 | 0.67 | 0.28 ± 0.09 | 0.98 ± 0.09 |
| | Target CCW | 0.41 | 0.37 | | |

CW = clockwise; CCW = counterclockwise. See caption of Table 2 for details.

*k* suited to this new task. So long as our model of the observer’s decision variable is correct, the measured value of *k* will be the same in both experiments. This is not true of many other measures of selective attention, such as the difference in reaction times between trials where the response suggested by distractors is consistent or inconsistent with the response suggested by targets (e.g., Garner & Felfoldy, 1970), or the difference between direction discrimination thresholds when distractors have the same polarity or the opposite polarity as the targets (e.g., Edwards & Badcock, 1994). In this sense, attentional weight is like the signal detection theory measure of sensitivity, *d*′: it attempts to measure a property that is invariant across experimental designs and perceptual tasks, and hence is truly a characteristic of the observer. Consequently, we need to use different expressions for measuring attentional weight in different tasks, just as we use different expressions for *d*′ (Macmillan & Creelman, 1991). The advantage, however, is that we can meaningfully compare the efficacy of selective attention across different tasks.

*selective* attention (i.e., an observer’s ability to judge selected stimuli in a visual scene and to ignore others). As Kinchla (1974; Kinchla & Collyer, 1974) has shown, attentional weight is also a useful measure in tasks involving distributed attention, such as cued detection tasks and visual search. On the other hand, we can see no obvious way of using attentional weight, as we have defined it, to study some other tasks normally thought to involve attention, such as dual tasks. Furthermore, as we have emphasized, attentional weight is an appropriate measure only if attending to or away from a stimulus does not qualitatively change how an observer processes the stimulus. For instance, if observers had very different directional selectivities for attended and unattended dots in random dot cinematograms, as well as weighting attended and unattended stimuli differently, then a scalar measure would be inadequate. We would need a more flexible approach (e.g., we could measure the directional selectivity for attended and unattended stimuli using the HCA method demonstrated in Experiment 1). In this respect, our account of selective attention is similar to recent accounts of visual search (Eckstein, 1998; Palmer, 1994; Shaw, 1984), in that it implies that selective attention does not change the observer’s representation of a stimulus.

*T** and *D**, would have the same sensitivity *d*′ to the target and distractor discrimination tasks. In a Stroop task where both word identity and ink color are randomly chosen on each trial, and the observer reports, say, the word identity, the weighted sum hypothesis holds that the observer’s decision variable is *s* = *T** + *kD**. The problem of measuring attentional weight in this Stroop task is exactly the same as the problem we faced in Experiment 3: the targets and distractors are equally discriminable, and the observer uses a decision variable of the form given in Equation 24. Hence, we could simply use Equations 36 through 39 that we used in Experiment 3 to measure the attentional weight assigned to distractors in this appropriately constructed Stroop task.^{6}

*s* = *T** in the no-distractor condition and a decision variable *s* = *T** + *kD** in the distractor condition, as in Equations 24 through 26, then the only ways that selective attention can fail are for the observer to assign a nonzero attentional weight to the distractors, or for the target component *T** of the decision variable to be computed less efficiently in the distractor condition than in the no-distractor condition. Furthermore, the only ways that *T** may be computed less efficiently in the distractor condition are either for the selectivity function *f* to be less efficient or for the internal noise *Z*_{T} to be higher. Thus we obtain a simple taxonomy of the ways selective attention may fail: attentional weight may be assigned to distractors, target selectivity may be impaired by distractors, and internal noise may be increased by distractors.

*complete* effect that distractors have on performance, including both the correlation they have with the observer’s responses, and also any performance decrements they cause by other means.

*attentional weight*, that measures the relative influence of targets and distractors on an observer’s responses. We have presented two methods for estimating attentional weight by measuring the influence that targets and distractors have on an observer’s responses. In three experiments, we showed that these methods give a description of observers’ abilities to direct attention according to contrast polarity that is consistent both with our prior expectations, as in the same-polarity condition (50L50L) where we found an attentional weight of *k* = 1, and with previous empirical results, as in the opposite-polarity conditions (50L50D, White, and Black) where we found that observers had only a limited ability to direct attention according to contrast polarity. Furthermore, we found that selectivity was the same for attended and unattended stimuli in the direction discrimination tasks, as it must be if attentional weight is to be a valid measure of selective attention. Finally, we have shown that the weighted sum framework can be used to study selective attention in a broad range of tasks, and we have suggested how it could be extended to give a more thorough characterization of how selective attention affects performance.

*Journal of Vision* section editor and two anonymous reviewers for their suggestions. This research was supported by National Science and Engineering Research Council Grants OGP0105494 and OGP0042133. Commercial Relationships: None.

^{1}We introduced Equation 8 as a constraint, on the grounds that the effect of the modulating function *f* should not depend on how we conceptually divide a stimulus into independently varying elements. Alternatively, we can view Equation 8 as claiming that selective attention *uniformly* attenuates the influence of distractors. An example of a nonuniform transformation of likelihoods is a robust statistical calculation, which limits the influence of highly unlikely events (e.g., in a typical robust statistical calculation a single event of probability 10^{−5} has less influence than five events of probability 10^{−1}: *f*(10^{−5}) < *f*(10^{−1})^{5}). Equation 8 states that selective attention does not incorporate a robustness transformation, or a transformation of the opposite type that emphasizes highly unlikely events, but instead uniformly reduces the influence of all distractors. The only functions that satisfy this constraint are the power functions, suggesting that selective attention takes the form of a single weighting factor, as in Equation 12.

^{2}The question of whether selective attention affects the likelihood ratios corresponding to targets, distractors, or both, superficially recalls the question of whether selective attention operates by amplifying attended stimuli, inhibiting unattended stimuli, or both (e.g., James, 1890/1950; Tipper & Driver, 1988). However, any mechanism that affects the influence of a stimulus element on an observer’s responses can be seen as adjusting the likelihood ratios of a Bayesian decision-maker, so our account is agnostic as to whether evidence is reweighted by amplification, by inhibition, or by some other mechanism.

^{3}For Edwards and Badcock’s (1994) cinematograms, and for the cinematograms in our experiments, total horizontal displacement is *not* the ideal decision variable: only the number of dots that move directly to the left or right is informative as to the correct response (see “Methods” section of Experiment 1 for details), whereas a decision variable based on total horizontal displacement allows dots that move at oblique angles to influence the observer’s responses. However, as explained earlier, just because we use a Bayesian framework, we need not assume that observers use an ideal decision rule. Furthermore, channels for perception of global motion are quite broadly tuned (see results of Experiment 1, as well as Williams et al., 1991), so we will use total horizontal displacement as a plausible decision variable. As we have said, though, we make this assumption only to make the exposition more concrete, and we will show that our results do not depend on this assumption.

^{4}In principle, observers could partly distinguish between target dots and distractor dots by counting the number of steps each dot took directly to the left or to the right; dots with more steps directly to the left or to the right would be more likely to be target dots, and so their horizontal displacements could be weighted more heavily into the decision variable. However, this seems an implausibly complex strategy, and in any case, our results indicate that observers do weight the distractor dots as heavily as the target dots when they have the same contrast polarity.

^{5}Instead of considering these three types of interference as confounds, we could say that observers subject to such interference are simply unable to attend to positive-contrast target dots. However, this would obscure an important distinction: in all three types of interference, the *directions* of distractors do not influence observers’ left-right responses, and so it is only in a very weak sense that such observers could be said to be attending to the distractors. Later (“A More General Model”) we discuss the question of how to determine whether the distractors interfere with performance, without actually being correlated with observers’ responses.

^{6}In this analysis, we have glossed over a problem concerning the relative scale of *T** and *D**. To derive Equations 36 through 39 in Experiment 3, we used not only the fact that *T** and *D** had equal sensitivity *d*′, but also that *T** and *D** had equal means and standard deviations. This was justified in Experiment 3, because the target and distractor stimuli were identical except for their contrast polarity, and we explicitly assumed that the observer performed the same computation on targets and distractors. In general, though, it is difficult to assign an absolute scale to decision variables, and for convenience, we often assume that a decision variable has standard deviation 1 (e.g., Green & Swets, 1974). For instance, if the color words and ink colors in our Stroop task are equally discriminable, we know that the corresponding decision variables have equal sensitivity *d*′ = *μ*/*σ*, but it is difficult to rule out the possibility that the color word decision variable has mean *μ* and standard deviation *σ*, and that the ink color decision variable has mean 2*μ* and standard deviation 2*σ*. For this reason, when describing the Stroop task decision variables, we implicitly used the common modeling assumption that *σ* = 1. With this assumption, the equal discriminability of targets and distractors implies that *T** and *D** have equal means and standard deviations.

*any* observer whose decision variable has these properties. In particular, it should be clear that this method is not restricted to tasks involving random dot cinematograms, but can be used to study how an observer combines information from any two sources using a decision rule with the stated properties.

*N*_{T} target dots and *N*_{D} distractor dots, that *n*_{T} target signal dots move directly left or right, that *n*_{D} distractor signal dots move directly left or right, and that the remainder of the dots move in random directions. (The stimuli in Experiment 1 had an equal number of target and distractor dots, *N*_{T} = *N*_{D}, and had no distractor signal dots, *n*_{D} = 0.) We will represent the cinematogram by a collection of multivariate random variables *t*_{i} and *d*_{i} that represent individual target and distractor dot displacements, respectively. Each *t*_{i} and *d*_{i} encodes any properties of the dot displacements that are relevant to the observer’s responses. For instance, we could represent each dot displacement by a triplet (*θ*, *x*, *y*) that reports its direction *θ* and its position (*x*, *y*). For convenience, we assume that the indices are ordered so that *t*_{1}, …, *t*_{nT} represent the *n*_{T} target signal dots, and *d*_{1}, …, *d*_{nD} represent the *n*_{D} distractor signal dots. We assume that the remaining variables *t*_{nT+1}, …, *t*_{NT} and *d*_{nD+1}, …, *d*_{ND}, corresponding to the noise dots, are identically distributed. Properties of the target dots may be correlated (cov[*t*_{i}, *t*_{j}] ≠ 0), as may properties of the distractor dots (cov[*d*_{i}, *d*_{j}] ≠ 0), but we assume that the target and distractor dot distributions are independent (cov[*t*_{i}, *d*_{j}] = 0).

Let *g*(*t*_{i}) and *g*(*d*_{i}) be the horizontal component of dot displacements *t*_{i} and *d*_{i}, respectively. Then, the total horizontal target and distractor displacements are *T* = Σ_{i} *g*(*t*_{i}) and *D* = Σ_{i} *g*(*d*_{i}), where the sums run over all *N*_{T} target dots and all *N*_{D} distractor dots, respectively. The expected value of the total horizontal target displacement is ±*n*_{T}Δ*d*, and the expected value of the total horizontal distractor displacement is ±*n*_{D}Δ*d*, where Δ*d* is the size of a single dot step and the signs depend on whether the signal dots move left or right.

*T** and *D**: […] Second, because *T** is calculated from the target dots and *D** is calculated from the distractor dots, *T** is uncorrelated with *D*, and *D** is uncorrelated with *T*: cov[*T**, *D*] = cov[*D**, *T*] = 0. Third, because *T** and *D** are obtained from the target and distractor dots using the same calculation, their covariances are approximately related by […]. For later convenience, we will rewrite A5 as […], where *c* is a constant. To prove Equations A6 and A7, we simply evaluate the covariances: […]. We can drop the unvarying signal dots from the sum, and write the covariances as […]. Here we have defined […] and […], where *i* ≠ *j*.

The *c*_{1} term is the dominant term in Equations A13 and A15, as *c*_{1} measures how strongly the effect of a single randomly chosen noise dot displacement on *T* is correlated with the effect of the same dot displacement on *T**, and this correlation may be large. On the other hand, *c*_{0} measures how strongly the effect of a single randomly chosen noise dot displacement on *T* is correlated with the effect of a *different* randomly chosen noise dot displacement on *T**, and for any reasonably large cinematogram, this correlation is negligible. The correlation need not be zero, because some properties of different noise dot displacements may be correlated (e.g., in our cinematograms, the lifetime of each dot was eight frames, so the positions of two randomly chosen dots on different frames were weakly correlated). (In fact, because the directions of individual dots are chosen independently, any of a number of reasonable assumptions about direction selectivity imply that the correlation is zero, for example, that the selectivity function *f* has equal and opposite responses to dots moving in opposite directions. However, we do not wish to introduce ad hoc assumptions at this point.) Consequently, we will neglect the *c*_{0} term, and approximate the covariances as in A6 and A7. Alternatively, in a task where we suspect that the correlation *c*_{0} is appreciable, we can construct our stimuli so that *N*_{T} − *n*_{T} = *N*_{D} − *n*_{D}, which according to A13 and A15 implies that cov[*T*, *T**] = cov[*D*, *D**] for any values of *c*_{0} and *c*_{1}, and A6 and A7 follow trivially.

*T* and *T** are approximately jointly normal, and also that *D* and *D** are approximately jointly normal.

*T** and *D**, such that *T* and *T** are jointly normal, and *D* and *D** are jointly normal.

Let *C* be a random variable that equals +1 or −1 on trials where the target signal dots move right or left, respectively, and let *R* be a random variable that equals +1 or −1 on trials where the observer responds “right” or “left,” respectively. Consider the trials on which the target and distractor signal dots move to the right. The expected value of *T* over all such trials where the observer responds “right” is *E*[*T* | *T** + *kD** > *a*], and the expected value of *T* over all such trials where the observer responds “left” is *E*[*T* | *T** + *kD** < *a*], where *a* is the observer’s response criterion. These expressions denote the expected value of the normal random variable *T*, conditional on the correlated normal random variable *T** + *kD** falling above or below a criterion. In […], we show that if (*X*, *Y*) are jointly normal random variables with covariance *c*_{XY}, then *E*[*X* | *Y* > *a*] = *μ*_{X} + (*c*_{XY}/*σ*_{Y}) *g*(*z*)/(1 − *G*(*z*)) and *E*[*X* | *Y* < *a*] = *μ*_{X} − (*c*_{XY}/*σ*_{Y}) *g*(*z*)/*G*(*z*), where *μ*_{X} = *E*[*X*], *μ*_{Y} = *E*[*Y*], *σ*^{2}_{Y} = var[*Y*], and *z* = (*a* − *μ*_{Y})/*σ*_{Y}. Here *g* is the standard normal probability density function, and *G* is the standard normal cumulative distribution function. Hence if we define *μ*_{s} = *E*[*s*], *σ*^{2}_{s} = var[*s*], and *z* = (*a* − *μ*_{s})/*σ*_{s}, then the conditional expected values in question are […]. Here *c* is the constant introduced in Equations A6 and A7. Similarly, the conditional mean distractor displacements are […]. The following more general form of Equation 22 can be confirmed by direct substitution of A16 through A19: […]. If we set *N*_{T} = *N*_{D} = *N* and *n*_{D} = 0, we obtain Equation 22 as a special case. This is the equation that we used to calculate attentional weight in Experiment 1, using the target and distractor dot displacements in Table 1.

*G*(−*z*), this covariance evaluates to […]. Similarly, the covariance of the distractor displacement with the observer’s responses is […]. Taking the ratio of A23 and A24, we find […]. With *N*_{T} = *N*_{D} = *N* and *n*_{D} = 0, we obtain Equation 23 as a special case. This establishes that if an observer performs the same computation on target and distractor dots, and if selective attention uniformly reduces the influences of the distractor dots, then we can measure the attentional weight using Equation 23 even if the computation yielding the decision variable is unknown and possibly stochastic.

*T* and *D* are the total horizontal displacements of the targets and distractors. As we noted at the beginning of the proof, all that matters is that (a) *T* is uncorrelated with *D**, and *D* is uncorrelated with *T**, as in A4, (b) *T* − *μ*_{T} and *D* − *μ*_{D} are noisy estimates of *T** − *μ*_{T*} and *D** − *μ*_{D*} to within a scale factor, as in A5, and (c) the pairs *T* and *T**, and *D* and *D**, are jointly normal. In effect, we have chosen two measurable stimulus properties *T* and *D* as estimates of the unobservable internal responses *T** and *D**, and in this appendix, we have shown that we can use the relative influence of the observable variables on the subject’s responses to measure the relative influence of the unobservable variables on the subject’s responses, so long as *T* and *D* mirror *T** and *D** in these two respects.

*T* and *D* give of *T** and *D**, the more reliable our measurements of *k* will be. As we have pointed out, the sampling noise method can be seen as measuring the slope of the line connecting points *M*_{L} and *M*_{R} in Figure 2, which are the mean target and distractor displacements over trials where the observer responds “left” and “right,” respectively. Our estimates of these points are noisy, simply because we can collect only a finite number of trials. This sampling error matters less when the distance between *M*_{L} and *M*_{R} is large. Equations A16 through A19 show that the distance between *M*_{L} and *M*_{R} grows with cov[*T*, *T**] and cov[*D*, *D**], so if we choose properties *T* and *D* that give good estimates of *T** and *D** (i.e., if *T* and *D* are strongly correlated with *T** and *D**), then the distance between the two points *M*_{R} and *M*_{L} will be large, and our estimates of *k* will be less variable.
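The sampling noise method can be illustrated with a small simulation. The sketch below is a toy model, not the experimental analysis: it assumes the simplest case, in which the measured stimulus properties *T* and *D* are independent standard normal variables with equal variance, the observer's decision variable is *s* = *T* + *kD* + *Z* with independent normal internal noise *Z*, and the observer responds "right" when *s* exceeds a criterion. Under those assumptions both the covariance ratio cov[*D*, *R*]/cov[*T*, *R*] and the slope of the line joining the mean (*T*, *D*) points for "left" and "right" responses should approximate *k*. All names and parameter values are illustrative.

```python
import random
from statistics import fmean


def simulate_trials(k_true=0.5, n_trials=100_000, noise_sd=1.0, criterion=0.0, seed=1):
    """Simulate a toy observer with decision variable s = T + k*D + Z."""
    rng = random.Random(seed)
    ts, ds, rs = [], [], []
    for _ in range(n_trials):
        t = rng.gauss(0.0, 1.0)            # target stimulus fluctuation
        d = rng.gauss(0.0, 1.0)            # distractor stimulus fluctuation
        s = t + k_true * d + rng.gauss(0.0, noise_sd)
        ts.append(t)
        ds.append(d)
        rs.append(1 if s > criterion else -1)   # +1 = "right", -1 = "left"
    return ts, ds, rs


def covariance(xs, ys):
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def covariance_ratio(ts, ds, rs):
    """Estimate k as cov[D, R] / cov[T, R]."""
    return covariance(ds, rs) / covariance(ts, rs)


def mean_point_slope(ts, ds, rs):
    """Estimate k as the slope of the line joining the mean (T, D) points
    for "left" and "right" responses."""
    t_right = fmean([t for t, r in zip(ts, rs) if r == 1])
    t_left = fmean([t for t, r in zip(ts, rs) if r == -1])
    d_right = fmean([d for d, r in zip(ds, rs) if r == 1])
    d_left = fmean([d for d, r in zip(ds, rs) if r == -1])
    return (d_right - d_left) / (t_right - t_left)


ts, ds, rs = simulate_trials(k_true=0.5)
print(f"covariance ratio: {covariance_ratio(ts, ds, rs):.3f}")
print(f"mean-point slope: {mean_point_slope(ts, ds, rs):.3f}")
```

Raising `noise_sd` in this sketch shrinks the distance between the two mean points and makes both estimates more variable, which is the dependence on estimate quality described above.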

*n* dot displacements as a collection of *n* random variables *d*_{i}, each assuming a value between −π and π to indicate the direction of the corresponding dot displacement, with an angle of 0 indicating a dot moving directly to the right. In a noisy linear model of direction discrimination, we represent the observer’s decision variable as the sum of the responses that the dot displacements *d*_{i} evoke in a filter, with an internal additive noise *Z* added as well. The observer responds “right” when the decision variable exceeds a criterion *a*. If we describe the directional selectivity of the filter with a function *f*(*θ*), we can write the decision variable as *s* = Σ_{i} *f*(*d*_{i}) + *Z*. To measure the directional selectivity *f*(*θ*), we will examine how the direction of a single dot affects the observer’s responses. We define *p*_{θR} as the probability of the observer responding “right” when a particular dot *d*_{k} moves in direction *θ*, […]. Then, […]. Here *μ* and *σ* are the mean and standard deviation, respectively, of […], and *G* is the standard normal cumulative distribution function. We can solve B3 for *f*(*θ*): […]. If the range of *p*_{θR} is small, which is to say that the single dot *d*_{k} has only a small effect on the observer’s responses, then we can approximate the inverse cumulative normal *G*^{−1} with the first two terms of a Taylor series. We define *p*_{R} as the unconditional probability of the observer responding “right.” Then, Equation B4 becomes […]. That is, if we plot *p*_{θR} as a function of the dot direction *d*_{k}, we recover an affine transformation of the directional selectivity function, *uf*(*θ*) + *ν*.

*p*_{θR} varies only slightly as a function of *θ* (e.g., between 0.49 and 0.51), and even with a large number of trials, the Bernoulli variability in probability estimates makes it difficult to measure *f*(*θ*) accurately. However, we can perform this analysis for each dot displacement *d*_{k}, and average the resulting conditional probabilities: an average of functions of the form *uf*(*θ*) + *ν* is itself a function of this form, so we can recover the directional selectivity function *f*(*θ*) equally well from the much less noisy average of all the conditional probabilities.

*Y* that is imperfectly correlated with a stimulus property *X*. It is sometimes useful to know the statistics of *X*, conditional on *Y* falling above or below a criterion *a*.

*Theorem*. Let *X* and *Y* be two normal random variables with covariance *c*_{XY}. Let *a* be the observer’s criterion, and let *z* be the normal deviate of *a* with respect to *Y*, i.e., *z* = (*a* − *μ*_{Y})/*σ*_{Y}. Then, (a) *E*[*X* | *Y* > *a*] = *μ*_{X} + (*c*_{XY}/*σ*_{Y}) *g*(*z*)/(1 − *G*(*z*)), and (b) *E*[*X* | *Y* < *a*] = *μ*_{X} − (*c*_{XY}/*σ*_{Y}) *g*(*z*)/*G*(*z*). Here *g*(*x*, *μ*, *σ*) is the normal probability density function, and *G*(*x*, *μ*, *σ*) is the normal cumulative distribution function. When we omit *μ* and *σ*, they default to zero and one, respectively.

*Proof*. (a) We can consider *Y* to be the sum of a term *kX* that is proportional to *X*, and a term *W* that is independent of *X*, i.e., *Y* = *kX* + *W*, where cov[*X*, *W*] = 0. Then, […], […], and […].

*μ*_{X} = 0. Integrating by parts, this becomes […]. Here we have used the fact that the pointwise product of two normal density functions is a scaled normal density function (specifically, […], where […] and […]), and we have defined […] and […]. The density function with parameters *μ** and *σ** integrates to 1, and we are left with […].
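As a numerical sanity check on this kind of result, the sketch below compares a Monte Carlo estimate of the conditional means of *X* with the standard truncated-normal expressions *E*[*X* | *Y* > *a*] = *μ*_{X} + (*c*_{XY}/*σ*_{Y}) *g*(*z*)/(1 − *G*(*z*)) and *E*[*X* | *Y* < *a*] = *μ*_{X} − (*c*_{XY}/*σ*_{Y}) *g*(*z*)/*G*(*z*). These are taken here to be what the theorem states; that identification is an assumption, although the expressions themselves are standard results for jointly normal variables. The simulation follows the construction used in the proof, *Y* = *kX* + *W* with *W* independent of *X*. The parameter values and function names are arbitrary illustrations.

```python
import math
import random
from statistics import NormalDist, fmean


def predicted_conditional_means(mu_x, c_xy, mu_y, sigma_y, a):
    """Closed-form E[X | Y > a] and E[X | Y < a] for jointly normal (X, Y)."""
    std = NormalDist()                      # standard normal: g = pdf, G = cdf
    z = (a - mu_y) / sigma_y
    above = mu_x + (c_xy / sigma_y) * std.pdf(z) / (1.0 - std.cdf(z))
    below = mu_x - (c_xy / sigma_y) * std.pdf(z) / std.cdf(z)
    return above, below


def simulated_conditional_means(k, mu_x, sigma_x, mu_w, sigma_w, a,
                                n=200_000, seed=7):
    """Monte Carlo estimate using Y = k*X + W, with W independent of X."""
    rng = random.Random(seed)
    above, below = [], []
    for _ in range(n):
        x = rng.gauss(mu_x, sigma_x)
        y = k * x + rng.gauss(mu_w, sigma_w)
        (above if y > a else below).append(x)
    return fmean(above), fmean(below)


# Arbitrary illustrative parameters.
k, mu_x, sigma_x, mu_w, sigma_w, a = 0.8, 1.0, 2.0, 0.5, 1.5, 2.0
c_xy = k * sigma_x ** 2                                   # cov[X, kX + W]
mu_y = k * mu_x + mu_w
sigma_y = math.sqrt(k ** 2 * sigma_x ** 2 + sigma_w ** 2)

print("predicted:", predicted_conditional_means(mu_x, c_xy, mu_y, sigma_y, a))
print("simulated:", simulated_conditional_means(k, mu_x, sigma_x, mu_w, sigma_w, a))
```

The two pairs of values should agree to within Monte Carlo error, which shrinks as `n` grows.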