**Abstract**

The brain encodes visual information with limited precision. Contradictory evidence exists as to whether the precision with which an item is encoded depends on the number of stimuli in a display (set size). Some studies have found evidence that precision decreases with set size, but others have reported constant precision. These groups of studies differed in two ways. The studies that reported a decrease used displays with heterogeneous stimuli and tasks with a short-term memory component, while the ones that reported constancy used homogeneous stimuli and tasks that did not require short-term memory. To disentangle the effects of heterogeneity and short-term memory involvement, we conducted two main experiments. In Experiment 1, stimuli were heterogeneous, and we compared a condition in which target identity was revealed before the stimulus display with one in which it was revealed afterward. In Experiment 2, target identity was fixed, and we compared heterogeneous and homogeneous distractor conditions. In both experiments, we compared an optimal-observer model in which precision is constant with set size with one in which it depends on set size. We found that precision decreases with set size when the distractors are heterogeneous, regardless of whether short-term memory is involved, but not when they are homogeneous. This suggests that heterogeneity, not short-term memory, is the critical factor. In addition, we found that precision exhibits variability across items and trials, which may partly be caused by attentional fluctuations.

$M$ measurements over $N$ items ($M$ typically being a very large number), and each individual measurement came with variance $\sigma^2$, then the variance by which an item is encoded would be $N\sigma^2/M$, and thus encoding precision or inverse variance would be inversely proportional to set size (Shaw, 1980). (A more continuous version of this argument relies on Fisher information; see Models.) An understanding of whether and how encoding precision depends on the number of relevant items in a scene would have direct implications for many areas of vision research, including multiple-object tracking, visual working memory, peripheral vision, and selective attention, and might be applied to real-world attentionally demanding tasks.
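The sample-size argument can be made concrete in a few lines. The following is a minimal sketch, assuming the $M$ measurements are divided evenly among the $N$ items and averaged per item; the function names are illustrative and do not come from the paper.

```python
def sample_size_variance(n_items: int, n_measurements: int, sigma2: float) -> float:
    """Encoding variance of one item when M measurements are shared evenly by N items."""
    measurements_per_item = n_measurements / n_items
    # Averaging k independent measurements of variance sigma2 gives variance sigma2 / k.
    return sigma2 / measurements_per_item


def encoding_precision(n_items: int, n_measurements: int, sigma2: float) -> float:
    """Precision is the inverse of the encoding variance."""
    return 1.0 / sample_size_variance(n_items, n_measurements, sigma2)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):  # illustrative set sizes
        print(n, encoding_precision(n, n_measurements=1000, sigma2=2.0))
```

Under these assumptions, doubling the set size halves the precision, which is the inverse proportionality stated in the text.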

| Study | Task | Distractor distribution | Memory task | Effect of set size on precision |
|---|---|---|---|---|
| Palmer, 1990 | Discrimination | Heterogeneous | Yes | Yes |
| Palmer et al., 1993 | Visual search | Homogeneous | No | No |
| Palmer, 1994 | Visual search | Homogeneous | No | No |
| Baldassi & Burr, 2000 | Classification | Homogeneous | No | No (*) |
| | Localization | Homogeneous | No | No |
| Wilken & Ma, 2004 | Change detection | Heterogeneous | Yes | Yes |
| | Delayed estimation | Heterogeneous | Yes | Yes |
| Baldassi & Burr, 2006 | Estimation | Homogeneous | No | No |
| Busey & Palmer, 2008 | Visual search | Homogeneous | No | No |
| | Localization | Homogeneous | No | No (**) |
| Ma & Huang, 2009 | Change discrimination | Heterogeneous | Yes (***) | Yes |

*precue* condition, the target identity was revealed to the subject 1 second prior to the onset of the search display; in the *postcue* condition, the target identity was revealed 1 second after viewing the display, requiring subjects to memorize the entire display. If memory is the determining factor, we expect to find that precision decreases with set size in the postcue but not in the precue condition. If, on the other hand, display heterogeneity is crucial, we expect that precision decreases with set size in both conditions. In Experiment 2, we compared heterogeneous and homogeneous distractor conditions in a search task that did not require memorization of items. If memory is the determining factor, we expect to find that precision is constant with set size in both conditions. If display heterogeneity is crucial, we expect that precision decreases with set size in the heterogeneous but not in the homogeneous condition. Experiment 3 was a control experiment using homogeneous distractors.

*mean* precision is constant with set size or dependent on set size, but the actual precision for each stimulus is drawn from a probability distribution around that mean. To anticipate our results, we find that search under heterogeneous distractors is best described by the model in which precision fluctuates, and mean precision decreases with set size.

$s_T$, and the probability that the target is present equals 0.5. The target orientation is specified through a precue (Figure 1a, left) or a postcue (Figure 1a, right). When the target is present, its location is chosen randomly. Set size varies from trial to trial.

$C$ and takes values 0 and 1. Target presence at the $i$th location is denoted $T_i$ and also takes values 0 and 1. The observer has access to noisy measurements, $\mathbf{x} = (x_1, \ldots, x_N)$, of the stimuli, $\mathbf{s} = (s_1, \ldots, s_N)$, and infers whether or not a target was present. For convenience, we remap all orientations from $(-\pi/2, \pi/2)$ to $(-\pi, \pi)$ in our models and analyses. The measurement of the $i$th stimulus, $x_i$, follows a Von Mises (circular normal) distribution centered at the true stimulus orientation, $s_i$:

$$p(x_i \mid s_i) = \frac{1}{2\pi I_0(\kappa_i)}\, e^{\kappa_i \cos(x_i - s_i)}, \tag{1}$$

where the concentration parameter, $\kappa_i$, is related to precision (see following), and $I_0$ is the modified Bessel function of the first kind of order 0. For large $\kappa$, a Von Mises distribution is accurately approximated by a Gaussian distribution with $\sigma^2 = 1/\kappa$. We will use Gaussian distributions in the case of homogeneous distractors because, there, stimuli and measurements are all concentrated in such a small part of the circular space that the space can be treated as a line.
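The large-$\kappa$ Gaussian approximation can be checked numerically. The sketch below is illustrative only: it draws Von Mises noise with Python's standard-library `random.vonmisesvariate`, wraps the samples to $(-\pi, \pi)$, and compares the sample variance with $1/\kappa$. The value $\kappa = 50$, the sample count, and the helper names are assumptions chosen for the example.

```python
import math
import random


def wrap_to_pi(angle: float) -> float:
    """Map an angle in radians onto the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def von_mises_sample_variance(kappa: float, n_samples: int = 20000, seed: int = 0) -> float:
    """Sample variance of zero-centered Von Mises noise with concentration kappa."""
    rng = random.Random(seed)
    # vonmisesvariate returns angles in [0, 2*pi); wrap them around the mean of 0.
    errors = [wrap_to_pi(rng.vonmisesvariate(0.0, kappa)) for _ in range(n_samples)]
    mean = sum(errors) / n_samples
    return sum((e - mean) ** 2 for e in errors) / (n_samples - 1)


if __name__ == "__main__":
    kappa = 50.0  # illustrative "large" concentration
    print("sample variance:", von_mises_sample_variance(kappa))
    print("1 / kappa      :", 1.0 / kappa)
```

For small $\kappa$ the two quantities diverge, which is why the Gaussian form is reserved here for the narrow range of orientations in the homogeneous case.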

$p(C = 1 \mid \mathbf{x})$, and reports “target present” if this probability is greater than 0.5. This is equivalent to reporting “target present” when the log posterior ratio, denoted $d$, is positive. To compute the log posterior ratio, we first apply Bayes' rule:

$$d = \log\frac{p(C = 1 \mid \mathbf{x})}{p(C = 0 \mid \mathbf{x})} = \log\frac{p(\mathbf{x} \mid C = 1)}{p(\mathbf{x} \mid C = 0)} + \log\frac{p_{\text{present}}}{1 - p_{\text{present}}},$$

where $p_{\text{present}}$ is the observer's prior probability that the target is present. (This does not have to be equal to 0.5, the true frequency of target presence.) The likelihood function of $C$, $p(\mathbf{x} \mid C)$, is computed by marginalizing over both $\mathbf{s}$ and $\mathbf{T} = (T_1, \ldots, T_N)$. After some basic math, we find

$$d = \log\left(\frac{1}{N}\sum_{i=1}^{N} e^{d_i}\right) + \log\frac{p_{\text{present}}}{1 - p_{\text{present}}}, \tag{2}$$

where $d_i$ is defined as

$$d_i = \log\frac{p(x_i \mid T_i = 1)}{p(x_i \mid T_i = 0)}. \tag{3}$$

The relationship between $d$ and $d_i$ in Equation 2 would be different if distractor orientations were not drawn independently or if more than a single target could be present. When distractors are heterogeneous and drawn from a uniform distribution, the distractor distribution is $p(s_i \mid T_i = 0) = 1/(2\pi)$. Using this expression as well as Equation 1, Equation 3 becomes (Ma et al., 2011)

$$d_i = \kappa_i \cos(x_i - s_T) - \log I_0(\kappa_i).$$

When distractors are homogeneous with an orientation equal to $s_D$, we use Gaussian distributions, and Equation 3 becomes (Peterson et al., 1954)

$$d_i = \frac{s_T - s_D}{\sigma_i^2}\left(x_i - \frac{s_T + s_D}{2}\right).$$

We obtained the predictions of the model for an individual trial by drawing 10,000 sets of $N$ measurements each from Von Mises (or Gaussian) distributions centered on the respective stimuli on that trial and applying the decision rule to each set of measurements. This results in a predicted probability that the subject will report “target present” on that trial, $p(\hat{C} \mid \mathbf{s}, \boldsymbol{\theta})$, where $\boldsymbol{\theta}$ denotes the model parameters.
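The Monte Carlo procedure described above can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes heterogeneous distractors, a single concentration $\kappa$ shared by all items (equal precision), and the local log likelihood ratio $d_i = \kappa\cos(x_i - s_T) - \log I_0(\kappa)$ combined as $d = \log\big(\tfrac{1}{N}\sum_i e^{d_i}\big) + \log\tfrac{p_{\text{present}}}{1 - p_{\text{present}}}$. The Bessel function is computed with a plain power series, and all function names and example values are hypothetical.

```python
import math
import random


def bessel_i0(x: float) -> float:
    """Modified Bessel function of the first kind, order 0, via its power series."""
    q = (x / 2.0) ** 2
    term, total, k = 1.0, 1.0, 1
    while term > 1e-17 * total:
        term *= q / (k * k)
        total += term
        k += 1
    return total


def log_posterior_ratio(x, s_target, kappa, p_present=0.5):
    """Global decision variable d for one set of measurements x (radians)."""
    log_i0 = math.log(bessel_i0(kappa))
    d_local = [kappa * math.cos(xi - s_target) - log_i0 for xi in x]
    d_max = max(d_local)  # subtract the maximum for a numerically stable log-mean-exp
    log_mean = d_max + math.log(sum(math.exp(d - d_max) for d in d_local) / len(d_local))
    return log_mean + math.log(p_present / (1.0 - p_present))


def prob_report_present(stimuli, s_target, kappa, p_present=0.5, n_samples=10000, seed=0):
    """Estimate the probability of a 'target present' response on one trial."""
    rng = random.Random(seed)
    n_yes = 0
    for _ in range(n_samples):
        # One noisy Von Mises measurement per stimulus, centered on that stimulus.
        x = [rng.vonmisesvariate(s % (2.0 * math.pi), kappa) for s in stimuli]
        if log_posterior_ratio(x, s_target, kappa, p_present) > 0.0:
            n_yes += 1
    return n_yes / n_samples


if __name__ == "__main__":
    target = 0.0
    present_trial = [0.0, 1.5, -2.0, 2.8]  # first item is the target
    absent_trial = [1.0, 1.5, -2.0, 2.8]   # distractors only
    print("p(yes | target present):", prob_report_present(present_trial, target, kappa=10.0))
    print("p(yes | target absent): ", prob_report_present(absent_trial, target, kappa=10.0))
```

A variable-precision variant would draw a separate $\kappa_i$ for every item and sample before computing each $d_i$; that step is omitted here to keep the sketch short.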

$\kappa$, we identify encoding precision with Fisher information, $J(s)$, which measures the best possible decoder performance based on the neural activity encoding the stimulus (Paradiso, 1988; Seung & Sompolinsky, 1993). Under the general condition of Poisson-like variability, Fisher information is proportional to the amplitude of the population activity encoding $s$ (Ma, Beck, Latham, & Pouget, 2006). Fisher information for a noise distribution $p(x \mid s)$ is defined as

$$J(s) = -\left\langle \frac{\partial^2}{\partial s^2} \log p(x \mid s) \right\rangle,$$

where the expected value $\langle \cdot \rangle$ is over the noise distribution $p(x \mid s)$. Substituting Equation 1, we find that $J(s)$ is independent of $s$ and equal to

$$J = \kappa\, \frac{I_1(\kappa)}{I_0(\kappa)}, \tag{5}$$

where $I_1$ is the modified Bessel function of the first kind of order 1. Equation 5 states a general relationship between the precision with which a stimulus is encoded, $J$, and the concentration parameter, $\kappa$. Note that $J$ is a monotonically increasing function of $\kappa$ and therefore invertible. The equivalent relationship for Gaussian noise is $J = 1/\sigma^2$ (Palmer, 1990; Shaw, 1980).
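Because $J$ is monotonic in $\kappa$, the mapping can be inverted numerically. The sketch below assumes the standard Von Mises relationship $J = \kappa\, I_1(\kappa)/I_0(\kappa)$ and inverts it by bisection; the series-based Bessel routine, the search bounds, and the function names are illustrative choices rather than anything specified in the paper.

```python
import math


def bessel_i(order: int, x: float) -> float:
    """Modified Bessel function of the first kind (integer order) via its power series."""
    half = x / 2.0
    term = half ** order / math.factorial(order)
    total, k = term, 1
    while term > 1e-17 * total:
        term *= half * half / (k * (k + order))
        total += term
        k += 1
    return total


def fisher_information(kappa: float) -> float:
    """Precision J of a Von Mises noise distribution with concentration kappa."""
    return kappa * bessel_i(1, kappa) / bessel_i(0, kappa)


def kappa_from_fisher(j: float, lo: float = 0.0, hi: float = 500.0, tol: float = 1e-9) -> float:
    """Invert J(kappa) by bisection, relying on J being monotonically increasing."""
    if not 0.0 <= j <= fisher_information(hi):
        raise ValueError("J is outside the range covered by the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fisher_information(mid) < j:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for kappa in (0.5, 3.0, 20.0, 100.0):
        j = fisher_information(kappa)
        print(f"kappa={kappa:7.2f}  J={j:9.4f}  recovered kappa={kappa_from_fisher(j):9.4f}")
```

For large $\kappa$ the output approaches $J \approx \kappa$, in line with the Gaussian limit $\sigma^2 = 1/\kappa$ and $J = 1/\sigma^2$.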

*τ*.

- flatEP (flat, equal precision): Precision does not fluctuate or depend on set size. This model has two free parameters ($p_{\text{present}}$ and $J$).
- flatVP (flat, variable precision): Precision fluctuates but does not depend on set size. This model has three free parameters ($p_{\text{present}}$, $J$, and $\tau$).
- npEP (nonparametric, equal precision): Precision does not fluctuate but may depend on set size. This model has five free parameters ($p_{\text{present}}$ and one value of $J$ for each set size).
- npVP (nonparametric, variable precision): Precision fluctuates and may depend on set size. This model has six free parameters ($p_{\text{present}}$, $\tau$, and one value of $J$ for each set size).
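The four models form a 2 × 2 design (set-size dependence × precision variability), and the parameter counts follow from that structure. The sketch below encodes it in a small data class; the class and attribute names are hypothetical, and the default of four set sizes is inferred from the stated counts (one $J$ per set size plus $p_{\text{present}}$ gives five parameters for npEP).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    set_size_dependent: bool  # "np" models: one precision value per set size
    variable_precision: bool  # "VP" models: adds the parameter tau

    def n_free_parameters(self, n_set_sizes: int = 4) -> int:
        n_precision = n_set_sizes if self.set_size_dependent else 1
        n_tau = 1 if self.variable_precision else 0
        return 1 + n_precision + n_tau  # the leading 1 is p_present


MODELS = [
    ModelSpec("flatEP", set_size_dependent=False, variable_precision=False),
    ModelSpec("flatVP", set_size_dependent=False, variable_precision=True),
    ModelSpec("npEP", set_size_dependent=True, variable_precision=False),
    ModelSpec("npVP", set_size_dependent=True, variable_precision=True),
]

if __name__ == "__main__":
    for model in MODELS:
        print(model.name, model.n_free_parameters())
```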

^{2}. Each stimulus was a Gabor patch with a spatial frequency of 1.05 cycles/degree, a standard deviation of 0.52°, and a peak luminance of 132 cd/m$^2$. The relevant stimulus feature was orientation.

$F(3, 33) = 24.0$, $p < 0.001$; postcue: $F(3, 33) = 56.0$, $p < 0.001$) and false-alarm rates increase (precue: $F(3, 33) = 189.1$, $p < 0.001$; postcue: $F(3, 33) = 79.4$, $p < 0.001$).

*RMSE*) was 0.037 for precue and 0.049 for postcue.

*RMSE* was 0.11 for precue and 0.12 for postcue).

$t(11) = -6.07$, $p < 0.001$; postcue: $t(11) = -10.20$, $p < 0.001$), confirming that mean encoding precision decreases with set size, both in the precue and in the postcue conditions. Put differently, the standard deviation of the noise increases with set size (Figure 7b). Standard deviation was computed as

$F(3, 24) = 29.1$, $p < 0.001$) and false-alarm rates increase ($F(3, 24) = 56.8$, $p < 0.001$) as a function of set size (Figure 8a). The npVP model provides a slightly better fit than the other models.

*RMSE* values are 0.055, 0.057, 0.046, and 0.040 for the flatEP, flatVP, npEP, and npVP models, respectively.

*RMSE* values are 0.14, 0.12, 0.13, and 0.11 for the flatEP, flatVP, npEP, and npVP models, respectively). Bayesian model comparison shows that the data are most likely under the npVP model (Figure 8c); the log likelihood differences between the npVP model and the flatEP, flatVP, and npEP models were 50 ± 15, 13.8 ± 5.1, and 48 ± 14, respectively. The relationship between mean precision and set size in the npVP model is captured well by a power law function (Figure 9a), with a power of −0.73 ± 0.14. This is consistent with the precue condition of Experiment 1. The scale is different, though; this might be due to the fact that the target orientation is fixed across trials.

$F(3, 24) = 20.58$, $p < 0.001$) but there is no significant effect of set size on the false-alarm rate ($F(3, 24) = 1.54$, $p = 0.23$; Figure 10). Unlike Experiment 1 and the heterogeneous condition of Experiment 2, where the random drawing of the target and/or distractor orientations produced a unique set of stimuli on each trial, the homogeneous condition of Experiment 2 had only eight different trial types (target present or absent at four set sizes). Because the npEP and npVP models have five and six free parameters, respectively, fitting these models to the data from the homogeneous condition would likely result in overfitting. Therefore, we replaced these models with variants in which (mean) encoding precision was related to set size by a power law function, $J(N) = J_1 N^{\alpha}$, where $J$ denotes precision in the plEP model and mean precision in the plVP model. For each subject, the value of $\alpha$ was taken from the analysis in Figure 9a (heterogeneous condition). Also, we fixed the prior to 0.5.
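A power law relationship $J(N) = J_1 N^{\alpha}$ is linear in log-log coordinates, so the exponent can be recovered with an ordinary least-squares line. The sketch below illustrates this on synthetic, noise-free values; the set sizes (1, 2, 4, 8), the value $J_1 = 100$, and the regression approach itself are assumptions made for the example and are not taken from the paper's fitting procedure.

```python
import math


def power_law_precision(n: float, j1: float, alpha: float) -> float:
    """Precision at set size n under J(N) = J1 * N**alpha."""
    return j1 * n ** alpha


def fit_power_law(set_sizes, precisions):
    """Least-squares fit of log J = log J1 + alpha * log N; returns (J1, alpha)."""
    xs = [math.log(n) for n in set_sizes]
    ys = [math.log(j) for j in precisions]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    alpha = sxy / sxx
    j1 = math.exp(y_mean - alpha * x_mean)
    return j1, alpha


if __name__ == "__main__":
    sizes = [1, 2, 4, 8]  # hypothetical set sizes
    js = [power_law_precision(n, j1=100.0, alpha=-0.73) for n in sizes]
    print(fit_power_law(sizes, js))
```

An exponent near zero corresponds to precision that is constant with set size, and an exponent of −1 corresponds to the inverse proportionality of the sample-size argument.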

$\alpha$ being a free parameter instead of being determined by the heterogeneous condition. In the plEP model, we found a power of −0.08 ± 0.09 and in the plVP model a power of −0.10 ± 0.08. Neither is significantly different from zero ($t(8) = -0.87$, $p = 0.41$; $t(8) = -1.28$, $p = 0.24$, respectively). Our results indicate that the mean encoding precision decreases with set size in the heterogeneous condition but is constant with set size in the homogeneous condition (Figure 9a).

$\alpha$ being a free parameter instead of being determined by the precue condition of Experiment 1. In the plEP model, we found a power of −0.16 ± 0.10 and in the plVP model a power of −0.20 ± 0.07. Although not significantly different from zero ($t(4) = -1.57$, $p = 0.19$; $t(4) = -2.83$, $p = 0.048$, respectively), these powers seem slightly more negative than in the homogeneous condition of Experiment 2.

$t(11) > 2.7$, $p < 0.02$ and $t(11) > 2.3$, $p < 0.05$ for precue and postcue conditions of Experiment 1, respectively, and $t(11) > 2.33$, $p < 0.05$ for the heterogeneous condition of Experiment 2) and increases with set size ($F(3, 33) = 11.33$, $p < 0.001$, $F(3, 33) = 34.42$, $p < 0.001$, and $F(3, 24) = 7.22$, $p < 0.001$, respectively). This increase is reminiscent of that found in visual short-term memory studies (Bays, Catalao, & Husain, 2009; Zhang & Luck, 2008; Van den Berg et al., 2012). Here, we argue that this guessing is only apparent and accounted for by the npVP model.

$N = 8$ in the heterogeneous condition of Experiment 2; Figure 8b). None of the models was able to account for this dip. A speculative account of this effect could be that search proceeds in two stages, again somewhat in the vein of Reverse Hierarchy Theory. In the first stage, orientations close to the target orientation are identified, while in the second stage, attentional resources are deployed to those selected orientations. Then, as the distractors collectively become more dissimilar to the target, the number of items that need to be scrutinized decreases, thereby increasing performance.

*PLoS Biology*, 4 (3), e56.

*Vision Research*, 40 (10–12), 1293–1300.

*Journal of Vision*, 2 (8): 3, 559–570, http://www.journalofvision.org/content/2/8/3, doi:10.1167/2.8.3.

*Journal of Vision*, 9 (10): 7, 1–11, http://www.journalofvision.org/content/9/10/7, doi:10.1167/9.10.7.

*Proceedings of the 32nd Annual Conference of the Cognitive Science Society* (pp. 411–416). Austin, TX: Cognitive Science.

*Journal of Experimental Psychology: Human Perception and Performance*, 34 (4), 790–810.

*Behavioral and Brain Sciences*, 24 (1), 87–114.

*Psychological Review*, 96 (3), 433–458.

*Nature Neuroscience*, 14, 926–932.

*Signal detection theory and psychophysics*. Los Altos, CA: John Wiley & Sons.

*Neuron*, 36 (5), 791–804.

*Nature Neuroscience*, 9 (11), 1432–1438.

*Journal of Vision*, 9 (11): 3, 1–30, http://www.journalofvision.org/content/9/11/3, doi:10.1167/9.11.3.

*Nature Neuroscience*, 14, 783–790.

*Information theory, inference, and learning algorithms*. Cambridge, UK: Cambridge University Press.

*Neuropsychologia*, 47 (14), 3255–3264.

*Journal of Experimental Psychology: Human Perception and Performance*, 16 (2), 332–350.

*Vision Research*, 34 (13), 1703–1721.

*Journal of Experimental Psychology: Human Perception and Performance*, 19 (1), 108–130.

*Vision Research*, 40 (10–12), 1227–1268.

*Biological Cybernetics*, 58 (1), 35–49.

*Vision Research*, 45 (14), 1867–1875.

*Transactions IRE Profession Group on Information Theory, PGIT-4*, 171–212.

*Quarterly Journal of Experimental Psychology*, 32 (1), 3–25.

*Journal of Experimental Psychology: Human Perception and Performance*, 27 (4), 985–999.

*Proceedings of National Academy of Sciences, USA*, 90 (22), 10749–10753.

*Attention and performance* (Vol. VIII, pp. 277–296). Hillsdale, NJ: Erlbaum.

*Proceedings of National Academy of Sciences, USA*, published online May 11, 2012, doi: 10.1073/pnas.1117465109.

*Journal of Vision*, 9 (5): 15, 11–11, http://www.journalofvision.org/content/9/5/15, doi:10.1167/9.5.15.

*Journal of Mathematical Psychology*, 44 (1), 92–107.

*Journal of Vision*, 4 (12): 11, 1120–1135, http://www.journalofvision.org/content/4/12/11, doi:10.1167/4.12.11.

*Nature*, 453 (7192), 233–235.

$m$ produces a prediction about the response on each trial, $p(\hat{C}_k \mid \mathbf{s}_k, \boldsymbol{\theta})$, where $\hat{C}_k$ indicates the observer's response on trial $k$, $\mathbf{s}_k$ the presented set of stimuli, and $\boldsymbol{\theta}$ the model parameters. The parameter likelihood function is the probability of finding a subject's actual responses under the model as a function of $\boldsymbol{\theta}$,

$$L(\boldsymbol{\theta}) = \prod_{k} p(\hat{C}_k \mid \mathbf{s}_k, \boldsymbol{\theta}),$$

where we assume that responses are conditionally independent across trials. Maximum-likelihood estimation consists of finding the parameters $\boldsymbol{\theta}$ that maximize $L$.
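In practice the product over trials is evaluated as a sum of logs. The sketch below assumes each trial supplies a predicted probability of a “target present” response (for example, from a Monte Carlo simulation of the model) and a binary response; the function name and the clipping constant `eps` are illustrative assumptions.

```python
import math


def log_likelihood(p_yes_predicted, responses, eps: float = 1e-9) -> float:
    """Sum over trials of log p(response_k | stimuli_k, theta).

    p_yes_predicted[k] is the model's probability of a 'target present' response
    on trial k; responses[k] is 1 for 'present' and 0 for 'absent'.
    """
    total = 0.0
    for p_yes, response in zip(p_yes_predicted, responses):
        # Clip so that a prediction of exactly 0 or 1 cannot produce log(0).
        p_yes = min(max(p_yes, eps), 1.0 - eps)
        total += math.log(p_yes if response == 1 else 1.0 - p_yes)
    return total


if __name__ == "__main__":
    predictions = [0.9, 0.2, 0.6]  # hypothetical model predictions
    observed = [1, 0, 1]           # hypothetical subject responses
    print(log_likelihood(predictions, observed))
```

Maximum-likelihood estimation then amounts to searching the parameter space for the $\boldsymbol{\theta}$ whose predictions give the largest value of this sum.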

$\boldsymbol{\theta}$. For the $j$th parameter, we assume a uniform distribution on an interval whose size we denote $R_j$. Then Equation 6 becomes

$$L(m) = \left(\prod_{j=1}^{\dim \boldsymbol{\theta}} \frac{1}{R_j}\right) \int L(\boldsymbol{\theta})\, d\boldsymbol{\theta},$$

where $\dim \boldsymbol{\theta}$ is the number of parameters. Intervals were (0.3, 0.7) for $p_{\text{present}}$, (0.5, 100) for $J$ in Experiment 1, (10, 200) for $\tau$ in Experiment 1, (5, 400) for $J$ in Experiment 2, and (25, 500) for $\tau$ in Experiment 2. We approximated the integrals over parameters numerically by using the trapezoidal rule with 25 steps for $p_{\text{present}}$ and 30 steps for all other parameters. Finally, $\log L(m)$ is compared between different models $m$.
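For a single parameter, the model likelihood under a uniform prior reduces to $\log L(m) = \log \int L(\theta)\, d\theta - \log R$. The sketch below approximates the integral with the trapezoidal rule and works in log space to avoid numerical underflow. Treating "steps" as the number of grid points, restricting the example to one dimension, and the function name are all assumptions made for illustration.

```python
import math


def log_model_likelihood_1d(log_lik_fn, lo: float, hi: float, n_steps: int = 30) -> float:
    """log of (1/R) * integral of L(theta) over [lo, hi], by the trapezoidal rule.

    log_lik_fn(theta) must return the log parameter likelihood at theta.
    """
    grid = [lo + (hi - lo) * i / (n_steps - 1) for i in range(n_steps)]
    log_l = [log_lik_fn(theta) for theta in grid]
    peak = max(log_l)  # factor out the peak so that exp() stays in range
    scaled = [math.exp(v - peak) for v in log_l]
    step = (hi - lo) / (n_steps - 1)
    integral = step * (sum(scaled) - 0.5 * (scaled[0] + scaled[-1]))
    return peak + math.log(integral) - math.log(hi - lo)


if __name__ == "__main__":
    # Hypothetical log likelihood that peaks at theta = 0.5.
    def toy_log_lik(theta: float) -> float:
        return -200.0 * (theta - 0.5) ** 2

    print(log_model_likelihood_1d(toy_log_lik, 0.3, 0.7, n_steps=25))
```

With several parameters the same computation is nested over each dimension, and the $\log R_j$ terms of all parameters are subtracted. Models with more parameters are therefore penalized unless the added flexibility raises the likelihood enough to compensate.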

| Experiment | $J$ | $p_{\text{present}}$ |
|---|---|---|
| 1, Precue | 6.52 ± 0.96 | 0.48 ± 0.01 |
| 1, Postcue | 6.5 ± 1.1 | 0.44 ± 0.01 |
| 2, Heterogeneous | 47 ± 11 | 0.42 ± 0.02 |
| 2, Homogeneous | 235 ± 36 | N/A (*) |

| Experiment | $J$ | $p_{\text{present}}$ | $\tau$ |
|---|---|---|---|
| 1, Precue | 13.0 ± 1.7 | 0.53 ± 0.01 | 13.08 ± 0.82 |
| 1, Postcue | 14.8 ± 2.8 | 0.49 ± 0.01 | 26.21 ± 5.0 |
| 2, Heterogeneous | 135 ± 14 | 0.50 ± 0.03 | 277.58 ± 59 |
| 2, Homogeneous | 256 ± 33 | N/A (*) | 40 ± 11 |

| Experiment | $J_1$ | $J_2$ | $J_3$ | $J_4$ | $p_{\text{present}}$ |
|---|---|---|---|---|---|
| 1, Precue | 13.03 ± 2.37 | 6.84 ± 1.54 | 5.56 ± 0.94 | 5.64 ± 0.90 | 0.49 ± 0.01 |
| 1, Postcue | 13.6 ± 2.6 | 6.2 ± 2.0 | 5.0 ± 1.2 | 4.9 ± 1.4 | 0.45 ± 0.01 |
| 2, Heterogeneous | 88 ± 15 | 42 ± 12 | 44 ± 11 | 42 ± 10 | 0.42 ± 0.02 |

| Experiment | $J_1$ | $J_2$ | $J_3$ | $J_4$ | $p_{\text{present}}$ | $\tau$ |
|---|---|---|---|---|---|---|
| 1, Precue | 25.9 ± 3.4 | 16.0 ± 2.7 | 11.8 ± 2.0 | 10.6 ± 2.0 | 0.54 ± 0.01 | 18.7 ± 2.9 |
| 1, Postcue | 40.4 ± 7.1 | 21.9 ± 5.2 | 17.9 ± 4.2 | 12.2 ± 3.5 | 0.51 ± 0.01 | 51 ± 12 |
| 2, Heterogeneous | 246 ± 27 | 138 ± 12 | 112 ± 11 | 92 ± 15 | 0.51 ± 0.03 | 296 ± 62 |