M. Carrasco and B. McElree (2001) presented a speed–accuracy trade-off (SAT) experiment investigating covert attention in visual search. One of their conclusions was that adding distractors to a single feature search does not decrease the speed with which information about target identity is accumulated. We present a reanalysis of the relevant data from Carrasco and McElree showing that this conclusion was incomplete: single feature search displays with no distractors show a processing speed advantage over displays with distractors. This finding is confirmed in a new speed–accuracy trade-off experiment presented here. Further, we demonstrate that increasing the display duration increases the processing speed of displays with distractors but not of displays without distractors. We discuss these results in relation to theories of visual attention and the debate between graded and fixed architecture accounts of attentional allocation.

^{1}Tasks with few distractors have been thought not to benefit from focal attention and so to be unaffected by the addition of distractors (e.g., Carrasco, McElree, Denisova, & Giordano, 2003; Henderickx, Maetens, Geerinck, & Soetens, 2009; Lee, Koch, & Braun, 1997; Treisman & Gelade, 1980).

Observers are *a priori* unaware of how much time they will be allowed to process the stimulus and respond. On a trial, some time after stimulus onset (usually between tens of milliseconds and several seconds), a signal is given (e.g., a loud tone), at which point the (trained) observer is required to respond within a small time frame, usually around 300 ms after the signal. A number of signal intervals are used such that performance can be mapped from chance responding (at very short signal intervals) to asymptotic performance (at longer intervals). Task performance (typically *d*′ for a two-choice task) is plotted as a function of processing time *t* (signal interval plus response lag) for each condition of the experiment. A simple shifted exponential-rise-to-asymptote function is then fit to the SAT data:

*d*′(*t*) = *λ*(1 − e^{−(*t* − *δ*)/*β*}) for *t* > *δ*, and 0 otherwise,

where *λ* is the asymptotic level of discriminability, *β* is the rate at which discriminability rises from guessing (*d*′ = 0) to asymptote (expressed as a time constant in milliseconds), and *δ* is the intercept—the time at which responding is no longer at chance (the “takeoff” point).
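The shifted exponential-rise-to-asymptote function can be sketched in a few lines (a minimal sketch, not the authors' fitting code; the parameter values used below are purely illustrative, with *β* treated as a time constant in milliseconds to match the units reported in the tables):

```python
import math

def sat_dprime(t, lam, beta, delta):
    """Shifted exponential rise to asymptote for SAT data.

    lam:   asymptotic discriminability (lambda, in d' units)
    beta:  rate of rise, expressed as a time constant in ms
    delta: intercept (takeoff time) in ms
    """
    if t <= delta:
        return 0.0  # chance responding before the takeoff point
    return lam * (1.0 - math.exp(-(t - delta) / beta))

# Illustrative values only: d' is 0 at takeoff and approaches lambda
# as processing time grows.
print(sat_dprime(300.0, 1.7, 74.0, 300.0))   # 0.0 at the takeoff point
print(sat_dprime(2000.0, 1.7, 74.0, 300.0))  # close to the asymptote 1.7
```

A smaller time constant makes the curve rise more steeply, which is what a "faster rate" means in the model comparisons that follow.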

^{2}Typically, a hierarchical model testing approach is used to find the optimal model (in terms of goodness of fit and the number of free parameters), starting with the fully restricted null model (a single set of three free parameters shared by all conditions) and moving toward the fully saturated model (three free parameters: *λ*, *β*, and *δ*, for each condition of the experiment). Figure 1 plots idealized SAT curves for a two-condition experiment. The top panel shows the case where the two conditions differ in asymptote, whereas the bottom panel shows the case where the two conditions differ in rate (a difference in takeoff is not shown but would simply involve shifting one of the SAT curves to the left or right).

The fully saturated model for the three set sizes allowed separate parameters (*λ*, *β*, and *δ*) for each set size, denoted as 3*λ*–3*β*–3*δ*. Carrasco and McElree reported that they used three criteria for determining which model was most appropriate: (i) the value of an *R*^{2} statistic adjusted for the number of free parameters:

adjusted *R*^{2} = 1 − [Σ_{i}(*d*_{i} − *d̂*_{i})^{2}/(*n* − *k*)] / [Σ_{i}(*d*_{i} − *d̄*)^{2}/(*n* − 1)],

where *d*_{i} is the observed data, *d̂*_{i} is the predicted value, *d̄* is the mean of the observed data, *n* is the number of data points, and *k* is the number of free parameters; (ii) the consistency of the parameter estimates across all three observers; and (iii) whether any fit left systematic residuals that could be captured by additional free parameters. Carrasco and McElree ultimately declared a 3*λ*–1*β*–1*δ* model the best description of their data (*R*^{2} = 0.970). This model allows only the asymptotic levels of performance to vary across set sizes, with shared takeoff and rate parameters; Carrasco and McElree state: “For the *neutral* feature search, processing time was unaffected by set size. Model fits that varied intercept or rate as a function of set size reduced the adjusted-*R*^{2} for each observer and for the average data, indicating that the additional parameters were not accounting for systematic variance in the data” (p. 5365). However, it is unclear whether the authors attempted to fit models that allowed rates to vary between set sizes other than in an all-or-none fashion, and whether the reduced fit was due to attempting to fit the neutral cue condition and the peripherally cued condition simultaneously.
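The adjusted statistic can be computed directly; the sketch below assumes the standard degrees-of-freedom form, in which the residual and total sums of squares are scaled by *n* − *k* and *n* − 1, respectively:

```python
def adjusted_r2(observed, predicted, k):
    """Adjusted R^2: penalizes fits with more free parameters k."""
    n = len(observed)
    mean = sum(observed) / n
    ss_res = sum((d - p) ** 2 for d, p in zip(observed, predicted))
    ss_tot = sum((d - mean) ** 2 for d in observed)
    return 1.0 - (ss_res / (n - k)) / (ss_tot / (n - 1))

# A perfect fit gives 1.0 regardless of k (for k < n - 1).
print(adjusted_r2([0.5, 1.2, 1.6, 1.7], [0.5, 1.2, 1.6, 1.7], 2))  # 1.0
```

Because the residual term is divided by *n* − *k*, adding a free parameter that does not reduce the residuals sufficiently lowers the statistic, which is how this criterion penalizes overly flexible models.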

Parameter | Original | New |
---|---|---|
Discriminability (λ in d′ units) | | |
Set size 1 | 1.78 | 1.70 |
Set size 4 | 1.46 | 1.51 |
Set size 8 | 1.45 | 1.50 |
Rate (β in ms units) | | |
Set size 1 | 114 | 74 |
Set sizes 4 and 8 | | 137 |
Intercept (δ in ms units) | | |
All set sizes | 293 | 299 |

We refit the data using the same models as Carrasco and McElree (including their preferred 3*λ*–1*β*–1*δ* model) but also allowed the individual rate parameters to vary by set size. In addition, we fit the data using maximum likelihood criteria (maximizing the log-likelihood, Ln(*L*), alongside computing the adjusted-*R*^{2} value), which allows us to statistically test differences in goodness of fit between nested models. Both statistics gave the same results in terms of model selection. Like Carrasco and McElree, we fit the models to individual participants and to the averaged data (presented here), which gave largely consistent patterns of results. The 3*λ*–1*β*–1*δ* model gives an *R*^{2} value of 0.970, Ln(*L*) = 16.712. However, a close inspection of Figure 2 (top panel) reveals that the model fit for the set size 1 condition slightly overestimates the asymptote and underestimates several of the data points critical for estimating the rate parameter. Thus, we fit a model that included a separate free rate parameter for set size 1, 3*λ*–2*β*–1*δ*, which resulted in a significantly better fit, *R*^{2} = 0.982, Ln(*L*) = 19.09 (a likelihood ratio test comparing the 3*λ*–1*β*–1*δ* and 3*λ*–2*β*–1*δ* models confirmed the improved fit, *χ*^{2}(1) = 4.756, *p* = 0.029). All other models (varying both *β* and *δ*) failed to improve upon this fit. The 3*λ*–2*β*–1*δ* model fits can be seen in Figure 2 (bottom panel); the model now more closely captures the data for set size 1. The new parameter values are given in Table 1. As well as better capturing the rise to asymptote, allowing the rate parameter for set size 1 to vary also reduced the asymptote parameter, enabling the model to capture the asymptote that was previously overestimated. The set size 1 rate, at 74 ms (instead of the 114 ms estimated originally), is now faster than the rate for set sizes 4 and 8, at 137 ms. Thus, as well as leading to greater asymptotic accuracy, having only one object displayed leads to significantly faster visual information accumulation, indeed much closer to the estimated 69 ms for the peripherally cued condition (Carrasco & McElree, 2001, Table 1, p. 5366), suggesting that cuing and removing distractors reduce the demand on attention to a similar degree. This finding is consistent with the idea that attention bestows not only greater discriminability but also a faster rate of information accumulation when no distractors are present, similar to the effect of cuing.
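The likelihood ratio test reported here can be verified with a short calculation (a sketch for nested models differing by a single free parameter; for larger differences in parameter count, a chi-square survival function with the appropriate degrees of freedom, e.g. from scipy.stats, would be needed):

```python
import math

def lr_test_1df(loglik_restricted, loglik_general):
    """Likelihood ratio test for nested models differing by one parameter.

    The statistic 2 * (lnL_general - lnL_restricted) is compared against
    a chi-square distribution with 1 df; for 1 df the survival function
    reduces to erfc(sqrt(statistic / 2)).
    """
    stat = 2.0 * (loglik_general - loglik_restricted)
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Reproduces the comparison in the text: Ln(L) = 16.712 vs. Ln(L) = 19.09.
stat, p = lr_test_1df(16.712, 19.09)
print(round(stat, 3), round(p, 3))  # 4.756 0.029
```

The doubled log-likelihood difference is the test statistic, so improvements in fit must be large enough relative to the added parameter to reach significance.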

*per se* (or, for example, increased perceived stimulus contrast) is effecting a change in performance. In order to maintain comparability with Carrasco and McElree's experiment, we do not use masks in our experiment. Regardless of the underlying process, increasing exposure duration in our task (with other factors held constant) should result in a more reliable stimulus representation (cuing appears to be more effective for masked than for non-masked stimuli, see Kerzel, Gauch, & Buetti, 2010, and since masking may shorten the information accrual period, there is clearly no simple relationship between accrual time and performance).

Our model testing strategy was first to look at parametric differences due to stimulus exposure duration (which could have a maximum of six free parameters associated with it: 2*λ* + 2*β* + 2*δ*) and then to look at parametric differences due to set size (which could have a maximum of 12 free parameters associated with it: 4*λ* + 4*β* + 4*δ*). Thus, the saturated model (allowing parametric variance between all stimulus exposure durations and set sizes) would be 8*λ*–8*β*–8*δ*. We started with the null model, 1*λ*–1*β*–1*δ*, which assumes no parametric differences between any of the set sizes or stimulus exposure durations and which provided a baseline goodness of fit to judge other models against, *R*^{2} = 0.966, Ln(*L*) = 14.955. Next, we allowed either the asymptotes (2*λ*–1*β*–1*δ*), rates (1*λ*–2*β*–1*δ*), or takeoffs (1*λ*–1*β*–2*δ*) to vary by stimulus duration. The largest improvement in fit was seen when allowing the rates to vary (1*λ*–2*β*–1*δ*), *R*^{2} = 0.972, Ln(*L*) = 26.41 (*χ*^{2}(1) = 22.91, *p* < 0.001), compared with *R*^{2} = 0.971, Ln(*L*) = 23.99 (*χ*^{2}(1) = 18.07, *p* < 0.001) and *R*^{2} = 0.969, Ln(*L*) = 20.49 (*χ*^{2}(1) = 11.07, *p* < 0.001) when the asymptotes (2*λ*–1*β*–1*δ*) or takeoffs (1*λ*–1*β*–2*δ*), respectively, were allowed to vary. The relative likelihood for allowing asymptotes to vary over rates is 0.09, and for takeoffs over rates, 0.003. Allowing rates and asymptotes and/or takeoffs to vary with stimulus duration resulted in little improvement in fit (indeed, the most general version, 2*λ*–2*β*–2*δ*, *R*^{2} = 0.971, Ln(*L*) = 27.161, resulted in no significant improvement over the 1*λ*–2*β*–1*δ* model, *χ*^{2}(2) = 1.502, *p* = 0.47). As the likelihood of a rate difference between stimulus durations is greatest, we chose this model as the basis for further model comparisons.
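The relative likelihoods quoted here are consistent with the simple form for models with equal numbers of free parameters (as is the case for the three single-parameter extensions of the null model): exp(ΔLn(*L*)) relative to the best-fitting model. A sketch:

```python
import math

def relative_likelihood(loglik_model, loglik_best):
    """Relative likelihood of a model against the best-fitting model.

    Valid as a direct comparison when both models have the same number
    of free parameters; otherwise an AIC-based version,
    exp((AIC_best - AIC_model) / 2), is the usual choice.
    """
    return math.exp(loglik_model - loglik_best)

# Asymptote model (Ln(L) = 23.99) and takeoff model (Ln(L) = 20.49)
# against the rate model (Ln(L) = 26.41), as reported in the text.
print(round(relative_likelihood(23.99, 26.41), 2))  # 0.09
print(round(relative_likelihood(20.49, 26.41), 3))  # 0.003
```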

Allowing the rates to vary by set size as well, the most general rate model, 1*λ*–8*β*–1*δ*, fit our data very well, *R*^{2} = 0.984, Ln(*L*) = 53.90, suggesting differences in rates due to set size (*χ*^{2}(6) = 54.98, *p* < 0.001). The 1*λ*–8*β*–1*δ* rate model fit our data better than models allowing asymptotes (4*λ*–2*β*–1*δ*, *R*^{2} = 0.975, Ln(*L*) = 34.912, a relative likelihood of <0.001 compared with allowing rates to vary) or takeoffs (1*λ*–2*β*–4*δ*, *R*^{2} = 0.974, Ln(*L*) = 33.370, a relative likelihood of <0.001 compared with allowing rates to vary) to vary by set size. The model comparable to that supported in the reanalysis of Carrasco and McElree's data, the 1*λ*–4*β*–1*δ* model (*R*^{2} = 0.985, Ln(*L*) = 52.06), which allows variability in processing rates between the two exposure durations and between set size 1 and set sizes 2, 3, and 4, did not fit the data significantly less well than the more general 1*λ*–8*β*–1*δ* model (*χ*^{2}(4) = 3.68, *p* = 0.45), which allowed all the set sizes to vary from each other and between stimulus exposure durations. Comparing the 1*λ*–4*β*–1*δ* model to one with no parametric variability in rates due to set size, the 1*λ*–2*β*–1*δ* model, demonstrates the difference between set size 1 and set sizes 2, 3, and 4, replicating the effect demonstrated in Carrasco and McElree's data (*χ*^{2}(2) = 51.28, *p* < 0.001). Finally, we tested whether all set sizes resulted in different processing rates and whether all set sizes differed between stimulus durations. For brevity, we report only the optimal model out of all possible model combinations: 1*λ*–3*β*–1*δ*, *R*^{2} = 0.985, Ln(*L*) = 52.052, which did not result in a significantly poorer fit than the most general 1*λ*–8*β*–1*δ* model, in which all set sizes could vary from each other and between stimulus exposure durations, despite having five fewer free parameters (*χ*^{2}(5) = 3.69, *p* = 0.59). This model only had parametric variability between the 40- and 140-ms conditions for the rate parameters for set sizes 2, 3, and 4, which all had a shared slower rate than set size 1.

^{3}The fits of this model can be seen in Figure 4, and the parameters are given in Table 2. In summary, we found no evidence of either the takeoffs or asymptotes varying with stimulus duration or set size. The processing rate was fastest when only one item was presented, and stimulus duration did not affect this rate. However, when more than one item was present, processing was slower than when there were no distractors, but processing speed increased with increased stimulus duration.

Parameter | 40 ms | 140 ms |
---|---|---|
Discriminability (λ in d′ units) | | |
All set sizes | 2.77 | |
Rate (β in ms units) | | |
Set size 1 | 111 | |
Set sizes 2, 3, and 4 | 172 | 131 |
Intercept (δ in ms units) | | |
All set sizes | 270 | |

*number* of distractors (and not a change in the takeoff time). Since we used homogeneous distractors, it is possible that the distractors were grouped perceptually (Humphreys, Quinlan, & Riddoch, 1989). Using a set of distractors that does not lead to perceptual grouping in a simple detection task might produce a gradual decline in processing speed between one and four items. Indeed, evidence from Bricolo, Gianesini, Fanini, Bundesen, and Chelazzi (2002) suggests that with heterogeneous distractors (in a very inefficient search task) there are processing rate differences between set sizes of two, four, six, and eight items. However, Bricolo et al. did not use a SAT procedure; instead, they estimated cumulative distribution curves from free RTs (see their Figure 3, p. 985) and used a traditional detection search task. Nonetheless, the pattern of data from Bricolo et al. suggests that a heterogeneous set of distractors may lead to rate differences between set sizes of two, three, and four items. Indeed, in Carrasco and McElree's (2001) conjunction search, with heterogeneous distractors, there is a clear difference in processing rates between set sizes of 1, 4, and 8.