**A central question in the study of visual short-term memory (VSTM) has been whether its basic units are objects or features. Most studies addressing this question have used change detection tasks in which the feature value before the change is highly discriminable from the feature value after the change. This approach assumes that memory noise is negligible, which recent work has shown not to be the case. Here, we investigate VSTM for orientation and color within a noisy-memory framework, using change localization with a variable magnitude of change. A specific consequence of the noise is that it is necessary to model the inference (decision) stage. We find that (a) orientation and color have independent pools of memory resource (consistent with classic results); (b) an irrelevant feature dimension is either encoded but ignored during decision-making, or encoded with low precision and taken into account during decision-making; and (c) total resource available in a given feature dimension is lower in the presence of task-relevant stimuli that are neutral in that feature dimension. We propose a framework in which feature resource comes both in packaged and in targeted form.**

*K* objects regardless of their number of features (Luck & Vogel, 1997; Vogel et al., 2001). However, adding a second color feature to a color-defined object does decrease performance (Wheeler & Treisman, 2002). These findings can be reconciled by a model in which each feature dimension has a separate capacity (Olson & Jiang, 2002; Wheeler & Treisman, 2002). More recently, this model has been challenged by effects of number of features in objects with up to six features (Hardman & Cowan, 2015; Oberauer & Eichenberger, 2013).

*N* objects each have two features to one in which 2*N* objects each have one (Lee & Chun, 2001; Olson & Jiang, 2002; Xu, 2002). Performance was found to be lower in the latter condition, suggesting that VSTM is weakly object-based (Olson & Jiang, 2002). However, the fact that spatial attention is divided over more objects in the second condition complicates this conclusion.

- Question 1 (Q1): Resource allocation among feature dimensions. In a noisy-memory view, the analog of the question “Does each feature dimension have a separate capacity?” (Olson & Jiang, 2002; Wheeler & Treisman, 2002) would be “Is memory resource shared among feature dimensions of an object, or does each feature dimension have its own pool of resource?”
- Question 2 (Q2): Encoding and role in decision-making of irrelevant features. Does an irrelevant feature receive resource? If the answer is yes, there is a follow-up question: In the process of deciding which location contains the change based on noisy memories (Keshvari et al., 2012, 2013; Ma & Huang, 2009; Wilken & Ma, 2004), does the irrelevant feature get ignored? Distinguishing “not encoded” from “encoded but ignored during decision-making” addresses a confound already present in the noiseless view of VSTM (see H2 above).
- Question 3 (Q3): Spatial allocation of resource. H3 predicts that dividing the features of *N* two-feature objects over 2*N* one-feature objects decreases performance. Here, we ask by analogy whether dividing the features of *N* two-feature objects over 2*N* one-feature objects decreases the amount of resource in a given feature dimension that is available to encode the *N* feature values in that dimension. In addressing this question, we will account in the analysis for the main effect that localizing a target (here: a change) among 2*N* items is intrinsically harder than among *N* items.

- We use the relatively rare paradigm of *change localization*. In change detection, chance level is 0.5. In change localization with *N* items, chance level is 1/*N*. This affords a larger performance range, potentially allowing the predictions of different models to separate more. We have previously used change localization to distinguish one-feature VSTM encoding models (Van den Berg et al., 2012).
- We vary the magnitude of change. Doing so allows for a precise description of the role of memory noise: more noise means a shallower psychometric curve over change magnitude (Bays & Husain, 2008; Keshvari et al., 2012, 2013; Lara & Wallis, 2012; Pearson et al., 2014; Van den Berg et al., 2012). Using highly discriminable stimuli across a change would amount to measuring only one point on this psychometric curve, and it is even unclear which point. Concluding that performance at that point is the same in two conditions leaves open the possibility that it is different at other points. Therefore, measuring full psychometric curves over change magnitude allows for somewhat stronger conclusions on how performance is affected than using highly discriminable stimuli.
- In most analyses, we will use quantitative process models to support our qualitative conclusions. We use the term *process model* to indicate a model that obtains predictions for psychometric curves by concatenating generic and/or principled assumptions about encoding (storage) and decision (retrieval), instead of postulating a functional form for the psychometric curves without any justification other than goodness of fit. Quantitative process models are useful because they allow us to separate encoding from decision components, because they predict the entire shape of the psychometric curve in each condition rather than only the presence of a difference between conditions, and because they allow for within-subject model comparison.

*every* object changed in its irrelevant feature dimension. All changes in the irrelevant feature dimension as well as the change in the relevant feature dimension were independently drawn from a uniform distribution. We instructed subjects that their task was to localize the change in the relevant feature dimension, that the change could be small or large, and that all objects would change in their irrelevant feature dimension. All-change Condition D was designed to magnify any potential effect of the irrelevant feature compared to one-change Condition D.

*process* models (a) predict entire psychometric curves across conditions, rather than only the presence or absence of an effect of condition on accuracy, and (b) allow us to disentangle differences between conditions in encoding precision (of primary interest) from differences between conditions in the decision process (less of interest). The price to pay for these gains is that additional assumptions must be made. Our main assumptions are as follows.

- For our single-feature encoding model, we choose the variable-precision model (Keshvari et al., 2012; Van den Berg et al., 2012; Van den Berg et al., 2014), in which encoding precision is itself a random variable. Although this model is relatively simple and has a good track record (Ma et al., 2014), many other models have been proposed to describe how short-term memories are encoded in a noisy fashion (Bays, 2014; Cappiello & Zhang, 2016; Oberauer & Lin, 2017; Sims, Jacobs, & Knill, 2012; Van den Berg et al., 2014; Zhang & Luck, 2008). The present study cannot rule those models out, and at least qualitatively, the differences between conditions predicted by the variable-precision model can also be accounted for by the other models.
- For the observer's decision, we choose a Bayes-optimal (ideal) observer. This decision model described human data best in a similar change detection task (Keshvari et al., 2012), but other, suboptimal rules certainly cannot be ruled out.

*N* objects. We denote the vector of stimuli in the first array by $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_N)$. Every $\theta_i$ is independently drawn from a uniform distribution. In going from the first to the second stimulus array, exactly one object (drawn with equal probabilities) changes its value; we denote the location where this happens by $L$. The magnitude of the change, denoted by $\Delta$, is drawn from a uniform distribution. The vector of stimuli in the second array is denoted by $\boldsymbol{\varphi} = (\varphi_1, \ldots, \varphi_N)$. Since the change occurs in one object, $\boldsymbol{\varphi}$ and $\boldsymbol{\theta}$ are identical except for the $L$th entry, where $\varphi_L = \theta_L + \Delta$.

For each stimulus $\theta_i$, the observer has a noisy measurement (memory) of this stimulus, which we denote by $x_i$. Similarly, we denote the noisy measurement of $\varphi_i$ by $y_i$. We denote the measurement vectors by $\mathbf{x} = (x_1, \ldots, x_N)$ and $\mathbf{y} = (y_1, \ldots, y_N)$. We assume that the noise corrupting the measurements is independent across arrays and locations. Since orientation and color are circular variables in our experimental design, we assume that $x_i$ and $y_i$ follow Von Mises distributions (with orientation rescaled to have the range [0, 2π]):

$$p(x_i \mid \theta_i) = \frac{1}{2\pi I_0(\kappa_{x,i})}\, e^{\kappa_{x,i}\cos(x_i - \theta_i)}, \qquad p(y_i \mid \varphi_i) = \frac{1}{2\pi I_0(\kappa_{y,i})}\, e^{\kappa_{y,i}\cos(y_i - \varphi_i)},$$

where $I_0$ is the modified Bessel function of the first kind of order zero (Abramowitz & Stegun, 1972), and $\kappa_{x,i}$ and $\kappa_{y,i}$ are called concentration parameters, which we will assume to be stochastic themselves (see below). Both the Von Mises and the independence assumptions are simplifications, but they fit the present data well. Moreover, most of our qualitative conclusions are supported by model-free statistics and are unlikely to be sensitive to changes in model assumptions.

$J'_{\text{array},i}$ as drawn independently across arrays, locations $i$, and trials from a gamma distribution with mean

$\tau$; for this stochastic process, we will use the notation

$J'_{\text{array},i}$ is multiplied by a bottom-up factor $\alpha$ to yield the value of encoding precision at the $i$th location in a given array: $J_{\text{array},i} = \alpha J'_{\text{array},i}$.

$\kappa_{\text{array},i}$ through

$$J_{\text{array},i} = \kappa_{\text{array},i}\,\frac{I_1(\kappa_{\text{array},i})}{I_0(\kappa_{\text{array},i})},$$

where $I_1$ is the modified Bessel function of the first kind of order one (Abramowitz & Stegun, 1972). This relationship, which is nearly linear, follows from the interpretation of precision as Fisher information (Van den Berg et al., 2012; Van den Berg et al., 2014).
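The encoding stage described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (which was in MATLAB). It assumes the parametrization common in the variable-precision literature, a gamma distribution specified by a mean (here the hypothetical argument `j_bar`) and a scale (`tau`), and it inverts the Fisher-information relationship $J = \kappa I_1(\kappa)/I_0(\kappa)$ with a lookup table. All function names, the table range, and the grid sizes are illustrative choices.

```python
import numpy as np

def bessel_ratio(kappa, n_grid=2048):
    """Return I_1(kappa) / I_0(kappa), computed by quadrature on the circle.

    Uses I_n(k) = (1 / 2pi) * integral of cos(n t) * exp(k cos t) dt.
    The integrand is rescaled by exp(-k) so large kappa does not overflow.
    """
    kappa = np.atleast_1d(np.asarray(kappa, dtype=float))
    t = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    w = np.exp(kappa[:, None] * (np.cos(t)[None, :] - 1.0))
    return (w * np.cos(t)).sum(axis=1) / w.sum(axis=1)

def precision_from_kappa(kappa):
    """Fisher-information precision J = kappa * I_1(kappa) / I_0(kappa)."""
    kappa = np.atleast_1d(np.asarray(kappa, dtype=float))
    return kappa * bessel_ratio(kappa)

# Lookup table for the inverse map J -> kappa (J increases with kappa).
# The upper limit of 300 is an arbitrary choice for this sketch.
_KAPPA_GRID = np.concatenate([[0.0], np.logspace(-3, np.log10(300.0), 600)])
_J_GRID = precision_from_kappa(_KAPPA_GRID)

def kappa_from_precision(J):
    """Invert J(kappa) by linear interpolation; values beyond the table are clipped."""
    return np.interp(J, _J_GRID, _KAPPA_GRID)

def sample_measurements(stimuli, j_bar, tau, rng):
    """Draw one noisy measurement per stimulus under variable precision.

    ASSUMPTION: precision is gamma distributed with mean j_bar and scale tau,
    i.e. shape = j_bar / tau. Returns the measurements and the kappa values.
    """
    stimuli = np.asarray(stimuli, dtype=float)
    J = rng.gamma(j_bar / tau, tau, size=stimuli.shape)
    kappa = kappa_from_precision(J)
    x = rng.vonmises(stimuli, kappa)  # samples are wrapped to [-pi, pi]
    return x, kappa
```

For example, `sample_measurements(np.zeros(4), j_bar=5.0, tau=2.0, rng=np.random.default_rng(0))` draws measurements for a four-item array whose stimuli all sit at 0. The lookup table avoids solving the Bessel-function equation for every sample.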

$d_L$ the likelihood ratio, based on $x_L$ and $y_L$, that a change occurred at the $L$th location, disregarding all other locations. It turns out that an accuracy-maximizing observer will report the location for which $d_L$ is highest (see Appendix A).

$\boldsymbol{\theta}$ and $\boldsymbol{\varphi}$ do not matter otherwise. Moreover, the observer's response is completely characterized by whether it is correct or not; all incorrect responses are equivalent. For a given parameter combination, $\boldsymbol{\omega}$, and a given change magnitude $\Delta$, we calculated the probability of a correct response, denoted by $p(\text{correct} \mid \Delta, \boldsymbol{\omega})$, through Monte Carlo simulation. This entailed the following. We generated 1,280 samples of $J_{x,i}$ and $J_{y,i}$ for the first and second stimulus array according to the process described under Step 1. We used these values to compute concentration parameters $\kappa_{x,i}$ and $\kappa_{y,i}$ according to Equation 3, and we drew measurement vectors $\mathbf{x}$ and $\mathbf{y}$ from Von Mises distributions with those concentration parameters, respectively. Those Von Mises distributions all had mean 0 except that one element of $\mathbf{y}$ was drawn from a Von Mises distribution with mean $\Delta$. For each of the 1,280 combinations of $\boldsymbol{\kappa}_x$, $\boldsymbol{\kappa}_y$, $\mathbf{x}$, and $\mathbf{y}$ thus drawn, we evaluated the decision rule. Tallying correct responses across these draws yielded an estimate of the probability of a correct localization response for the given parameter combination and the given change magnitude. We used these probabilities for model fitting.
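The Monte Carlo procedure just described can be sketched as follows. This is a hedged illustration rather than the authors' code. It assumes a gamma distribution over precision parametrized by a mean `j_bar` and a scale `tau` (hypothetical names). It also assumes a closed form for the local likelihood ratio that follows from Von Mises noise with a uniformly distributed change magnitude, namely $d_i = I_0(\kappa_{x,i}) I_0(\kappa_{y,i}) / I_0(\kappa_{c,i})$ with $\kappa_{c,i} = \sqrt{\kappa_{x,i}^2 + \kappa_{y,i}^2 + 2\kappa_{x,i}\kappa_{y,i}\cos(y_i - x_i)}$. That expression is not reproduced from this section and should be checked against the appendix before use.

```python
import numpy as np

def _bessel_ratio(kappa, n_grid=1024):
    """I_1(kappa) / I_0(kappa) by quadrature, rescaled to avoid overflow."""
    t = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    w = np.exp(np.asarray(kappa, dtype=float)[:, None] * (np.cos(t) - 1.0))
    return (w * np.cos(t)).sum(axis=1) / w.sum(axis=1)

# Lookup table for inverting J = kappa * I_1(kappa) / I_0(kappa).
# The cap of 300 keeps every Bessel argument below numpy's overflow range.
KAPPA_GRID = np.concatenate([[0.0], np.logspace(-3, np.log10(300.0), 400)])
J_GRID = KAPPA_GRID * _bessel_ratio(KAPPA_GRID)

def log_i0(kappa):
    """log I_0(kappa); np.i0 stays finite for the kappa range used here (<= 600)."""
    return np.log(np.i0(kappa))

def p_correct(delta, j_bar, tau, n_items=4, n_samples=1280, rng=None):
    """Monte Carlo estimate of p(correct | delta) for single-feature localization.

    ASSUMPTIONS: gamma-distributed precision with mean j_bar and scale tau;
    the decision variable is the closed-form likelihood ratio named above.
    """
    rng = np.random.default_rng() if rng is None else rng
    size = (n_samples, n_items)
    shape = j_bar / tau
    # Independent precision draws for both arrays, mapped to kappa.
    kappa_x = np.interp(rng.gamma(shape, tau, size), J_GRID, KAPPA_GRID)
    kappa_y = np.interp(rng.gamma(shape, tau, size), J_GRID, KAPPA_GRID)
    # All stimuli are set to 0; the change is placed at location 0.
    mu_y = np.zeros(size)
    mu_y[:, 0] = delta
    x = rng.vonmises(0.0, kappa_x)
    y = rng.vonmises(mu_y, kappa_y)
    # Local log likelihood ratio of "change" versus "no change" per location.
    kc_sq = kappa_x**2 + kappa_y**2 + 2.0 * kappa_x * kappa_y * np.cos(y - x)
    kappa_c = np.sqrt(np.maximum(kc_sq, 0.0))
    log_d = log_i0(kappa_x) + log_i0(kappa_y) - log_i0(kappa_c)
    # The observer reports the location with the highest decision variable.
    return float(np.mean(np.argmax(log_d, axis=1) == 0))
```

Evaluating `p_correct` over a grid of change magnitudes traces out a model psychometric curve. With `delta = 0` the estimate should sit near the chance level of 1/*N*.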

($\boldsymbol{\theta}_{\text{ori}}$, $\boldsymbol{\theta}_{\text{col}}$, $\boldsymbol{\varphi}_{\text{ori}}$, $\boldsymbol{\varphi}_{\text{col}}$). In Condition B, both features are relevant, and a change occurs in one feature of one object. In Conditions C and D, there is a relevant and an irrelevant feature. In Condition C, the irrelevant feature, denoted by a subscript “irr”, never changes, and thus, $\boldsymbol{\varphi}_{\text{irr}} = \boldsymbol{\theta}_{\text{irr}}$. In Condition D, the irrelevant feature might change. In the one-change condition, an irrelevant change of magnitude $\Delta_{\text{irr}}$ is introduced at the $L$th location: $\varphi_{L,\text{irr}} = \theta_{L,\text{irr}} + \Delta_{\text{irr}}$. In the all-change condition, an irrelevant change is introduced at every location: $\boldsymbol{\varphi}_{\text{irr}} = \boldsymbol{\theta}_{\text{irr}} + \boldsymbol{\Delta}_{\text{irr}}$, where $\boldsymbol{\Delta}_{\text{irr}} = (\Delta_{1,\text{irr}}, \ldots, \Delta_{N,\text{irr}})$, and $\Delta_{i,\text{irr}}$ is independently drawn from a uniform distribution for every $i$.

($\mathbf{x}_{\text{ori}}$, $\mathbf{x}_{\text{col}}$, $\mathbf{y}_{\text{ori}}$, $\mathbf{y}_{\text{col}}$), are drawn. We assume that the noise corrupting the orientation and color measurements is independent across arrays, locations, and features (Fougnie & Alvarez, 2011).

$\rho < 1$. We assume independent variability for orientation and color:

$\tau'_{\text{ori}}$ and $\tau'_{\text{col}}$ are the scale parameters for orientation and color, respectively. We multiply the drawn resource values $J'_{\text{ori}}$ and $J'_{\text{col}}$ by bottom-up factors $\alpha_{\text{ori}}$ and $\alpha_{\text{col}}$ to obtain $J_{\text{ori}}$ and $J_{\text{col}}$, respectively. This process can be simplified as

$\tau_{\text{ori}}$, $\tau_{\text{col}}$, and $\rho$.

*relevant* orientation and color follow:

$\rho_{\text{ori}}$ and $\rho_{\text{col}}$. In Model 4, mean precision for *irrelevant* orientation is then

*irrelevant* color is

$\tau_{\text{ori}}$ and $\tau_{\text{col}}$ are identical to when the features are relevant.

$\tau_{\text{ori}}$, and $\tau_{\text{col}}$, and Model 4 has two more, $\rho_{\text{ori}}$ and $\rho_{\text{col}}$.

$\boldsymbol{\Delta}_{\text{ori}}$ and $\boldsymbol{\Delta}_{\text{col}}$, respectively, which depend on the condition (e.g., in Condition C, one of the vectors is the zero vector, whereas in Condition D, neither is). In each experimental condition, we estimated the model observer's probability of a correct response, $p(\text{correct} \mid \boldsymbol{\Delta}_{\text{ori}}, \boldsymbol{\Delta}_{\text{col}}, \boldsymbol{\omega})$, through Monte Carlo simulations with 1,280 samples of $\boldsymbol{\kappa}_{\text{ori},x}$, $\boldsymbol{\kappa}_{\text{ori},y}$, $\mathbf{x}_{\text{ori}}$, $\mathbf{y}_{\text{ori}}$, $\boldsymbol{\kappa}_{\text{col},x}$, $\boldsymbol{\kappa}_{\text{col},y}$, $\mathbf{x}_{\text{col}}$, and $\mathbf{y}_{\text{col}}$.

$\boldsymbol{\omega}$ is $L(\boldsymbol{\omega}) = p(\text{data} \mid \boldsymbol{\omega})$. We maximized this function over $\boldsymbol{\omega}$, which is equivalent to maximizing its logarithm. Assuming that all trials are conditionally independent, the log-likelihood function for a given condition is

$N_{\text{trials}}$ is the number of trials in the condition (in Condition A, only one of $\boldsymbol{\Delta}_{\text{ori},t}$ and $\boldsymbol{\Delta}_{\text{col},t}$ exists). We fitted all conditions within an experiment simultaneously, and thus the log-likelihood is summed across all conditions. For maximizing the resulting total log-likelihood, we used the genetic algorithm (“ga”) in the Global Optimization Toolbox in MATLAB, with 64 individuals and 64 generations.

Stimuli were equally spaced along an imaginary circle of radius 7 degrees of visual angle (deg) around fixation (calculated assuming a viewing distance of 60 cm), at angles [45 + (*i* − 1)/*N* × 360]°, where *i* = 1, … , *N*, and *N* = 4. All experiments were programmed using Psychophysics Toolbox in MATLAB (Brainard, 1997; Pelli, 1981).
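The placement rule above is simple enough to check numerically. The sketch below computes the stimulus centers in degrees of visual angle and converts an eccentricity to centimeters on the screen. The function names, the counterclockwise angle convention, and the y-axis pointing up are assumptions made for illustration. The experiments themselves were programmed in MATLAB.

```python
import math

def stimulus_positions(n_items=4, radius_deg=7.0, offset_deg=45.0):
    """Centers (x, y) in deg of visual angle, at angles offset + (i - 1) / N * 360."""
    positions = []
    for i in range(1, n_items + 1):
        angle = math.radians(offset_deg + (i - 1) / n_items * 360.0)
        positions.append((radius_deg * math.cos(angle), radius_deg * math.sin(angle)))
    return positions

def deg_to_cm(eccentricity_deg, viewing_distance_cm=60.0):
    """On-screen distance from fixation for a given eccentricity (flat screen)."""
    return viewing_distance_cm * math.tan(math.radians(eccentricity_deg))
```

With the default arguments, the four items land on the diagonals of the display, one per quadrant.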

with minor and major axes of 0.41 and 0.94 degrees of visual angle (deg), respectively. A color-only stimulus was a disc with a diameter of 0.62 deg, with color drawn from 360 values uniformly distributed along a circle in the fixed-*L*\* plane of CIE 1976 (*L*\*, *a*\*, *b*\*) color space corresponding to a luminance of 95.7 cd/m², with center (*a*\*, *b*\*) = (12, 13) and radius 60. A two-feature stimulus was a colored ellipse.
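As a small worked example of the color specification above, the 360 candidate colors can be generated as equally spaced points on a circle in the (*a*\*, *b*\*) plane. The starting angle and direction of travel are assumptions, and the value of *L*\* is not computed because it is not stated numerically in this section.

```python
import math

def color_wheel(n_colors=360, center=(12.0, 13.0), radius=60.0):
    """Return n_colors (a*, b*) pairs equally spaced on a circle.

    ASSUMPTION: index 0 lies at angle 0 and indices increase counterclockwise.
    """
    a0, b0 = center
    wheel = []
    for k in range(n_colors):
        angle = 2.0 * math.pi * k / n_colors
        wheel.append((a0 + radius * math.cos(angle), b0 + radius * math.sin(angle)))
    return wheel
```

Converting these coordinates to monitor RGB would additionally require the fixed *L*\* value and a display calibration, neither of which is sketched here.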

*β* = 0.035 ± 0.002; *t* test on *β* values: *t*(7) = 6.89, *p* < 0.001, but no significant effect of condition, *β* = −0.018 ± 0.029, *t*(7) = −0.21, *p* = 0.84. For color, we similarly found a significant effect of change magnitude, *β* = 0.035 ± 0.002, *t*(7) = 7.23, *p* < 0.001, and no significant effect of condition, *β* = −0.128 ± 0.028, *t*(7) = −1.53, *p* = 0.17. Since Models 1 and 2 predict a significant effect of condition, we find no evidence for Models 1 or 2.

*M* ± *SEM*), respectively. This confirms that we can reject Models 1 and 2.

*N* = 8). Otherwise, Experiment 2 was the same as Experiment 1.

*β* = 0.039 ± 0.001; *t* test on *β* values: *t*(7) = 9.01, *p* < 0.001, but no significant effect of condition, *β* = 0.025 ± 0.029, *t*(7) = 0.28, *p* = 0.79. For color, we similarly found a significant effect of change magnitude, *β* = 0.037 ± 0.002, *t*(7) = 8.00, *p* < 0.001, and no significant effect of condition, *β* = −0.022 ± 0.032, *t*(7) = −0.23, *p* = 0.83.

*n* = 600) were asked to recall one of the feature values. The subject's response is taken to be a proxy of their memory of the stimulus, and thus the task has only a minimal decision-making component. In the first 30 trials, the task was to recall the value of the relevant feature; across these trials, which feature was relevant was kept fixed. On the surprise 31st trial, an on-screen message instructed subjects to recall the value of the irrelevant feature of the stimulus that they had just seen. Model 6 predicts that subjects will be guessing on the surprise trial and that their estimation errors will be distributed uniformly. We found that for both orientation and color, the error distribution on irrelevant-feature trials was not uniform (Figure 5). This indicates that subjects do encode the irrelevant feature and rules out Model 6. However, the quality of encoding of the irrelevant feature was rather poor: The inverse of the square of the circular standard deviation when orientation was irrelevant was only 3.8% of the same quantity when orientation was relevant, and for color, that ratio was 20.4%.
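The precision proxy used in this comparison, the inverse squared circular standard deviation of the estimation errors, can be computed as in the following sketch. It assumes the standard definition of circular standard deviation, $\sqrt{-2 \ln R}$ with $R$ the mean resultant length, and errors expressed in radians on the full circle (orientation errors would first be rescaled to that range). Whether the original analysis used exactly this definition is not stated in this section.

```python
import numpy as np

def circular_sd(errors):
    """Circular standard deviation sqrt(-2 ln R) of angles in radians."""
    errors = np.asarray(errors, dtype=float)
    resultant_length = np.abs(np.mean(np.exp(1j * errors)))
    return float(np.sqrt(-2.0 * np.log(resultant_length)))

def precision_proxy(errors):
    """Inverse of the squared circular standard deviation."""
    return 1.0 / circular_sd(errors) ** 2

def precision_ratio(errors_irrelevant, errors_relevant):
    """Precision proxy when the feature is irrelevant, relative to when it is relevant."""
    return precision_proxy(errors_irrelevant) / precision_proxy(errors_relevant)
```

A ratio close to 0 indicates errors that are nearly uniform on the surprise trial. A ratio close to 1 indicates that the irrelevant feature was remembered about as well as the relevant one.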

*β* = 0.044 ± 0.002; *t* test on *β* values: *t*(4) = 7.58, *p* = 0.002; no significant difference between Condition C and the other conditions, *β* = −0.066 ± 0.044, *t*(4) = −0.60, *p* = 0.58; and no significant difference between one-change Condition D and the other conditions, *β* = 0.034 ± 0.036, *t*(4) = 0.37, *p* = 0.73. For color, we similarly found a significant effect of change magnitude, *β* = 0.041 ± 0.003, *t*(4) = 6.30, *p* = 0.003; no significant difference between Condition C and the other conditions, *β* = −0.050 ± 0.056, *t*(4) = −0.03, *p* = 0.98; and no significant difference between one-change Condition D and the other conditions, *β* = −0.042 ± 0.040, *t*(4) = −0.42, *p* = 0.70. The absence of an effect of an (either changing or nonchanging) irrelevant feature on performance is consistent with most change detection studies that did not vary change magnitude (Luria & Vogel, 2011; Shen et al., 2013; Vogel et al., 2001; Xu, 2010). It is inconsistent with Hyun et al. (2009), and we do not know why.

$\rho_{\text{ori}}$, which represents the precision with which orientation is encoded when irrelevant as a proportion of the precision with which it is encoded when relevant, as 12.9% ± 3.3% (*t* = 3.88, *p* = 0.02). We estimated the analogous parameter for color, $\rho_{\text{col}}$, as 11.6% ± 8.4% (*t* = 1.38, *p* = 0.24). Therefore, we cannot state conclusively whether the irrelevant feature is taken into account, but if it is, it is encoded with comparatively low precision and therefore has little effect on the decision.

*N* two-feature objects over 2*N* single-feature objects decreases the amount of resource in a given feature dimension that is available to encode the *N* feature values in that dimension. A possible mechanism for such a decrease would be that some amount of resource “leaks” to the *N* “distractor” objects that do not have that feature dimension but are task-relevant because of their other feature dimension. This question is orthogonal to Q1 and Q2, and independent of the models considered so far. We examined Q3 in Experiment 4, using three conditions that parallel those in Olson and Jiang (2002).

*N*, two separate, and two together-2*N* sessions (Figure 7A). Sessions were run on different days. The order of the sessions was random for each subject. Each session consisted of four blocks of 150 trials. Hence, each subject completed 6 × 4 × 150 = 3,600 trials in total. In the together-*N* condition, each of the four objects had both orientation and color, and the change occurred with equal probabilities in either feature (this is the same as Condition B before). In the separate condition, the stimuli were four discs with colors independently drawn from a uniform distribution and four gray ellipses with orientations independently drawn from a uniform distribution, for a total of eight objects. The together-2*N* condition was identical to the together-*N* condition except that set size was 8.

*no-leak model*, the entire amount of resource in a given feature dimension is distributed to relevant locations (i.e., to objects that have that feature dimension). In the *leak model*, only a portion of the resource is distributed to relevant locations.

*N* (identical to Condition B), separate, and together-2*N*. The first two conditions have *N* objects, and the last has 2*N* objects. To fit the data from all conditions simultaneously, we model mean precision as

*N* is set size and

*N* condition as in Condition B. For the separate condition, per feature, the entire amount of memory resource is assigned only to the relevant locations (*N* = 4). We have eight decision variables: four from orientation and four from color. This is comparable to a change localization task with eight one-feature objects, although here there are four colored objects and four orientation objects. Finally, the model's predictions for the together-2*N* condition are identical to those for the together-*N* condition, except that encoding precision is lower because *N* = 8. The no-leak model has four parameters:

$\tau_{\text{ori}}$, $\tau_{\text{col}}$.

*r* is a feature-specific leak parameter (0 < *r* < 1), and *N* = 4. The model predictions for the together-*N* and together-2*N* conditions are identical to those of the no-leak model. The leak model has six parameters:

$\tau_{\text{ori}}$, $\tau_{\text{col}}$, $r_{\text{ori}}$, and $r_{\text{col}}$.

*N* condition than in the separate condition, and higher in the separate condition than in the together-2*N* condition. For orientation, a two-way repeated-measures ANOVA shows a significant main effect of change magnitude, *F*(8, 32) = 62.14, *p* < 0.001; a significant main effect of condition, *F*(2, 8) = 82.92, *p* < 0.001; and a significant interaction, *F*(16, 64) = 5.02, *p* < 0.001. The pattern is the same for color: respectively, *F*(8, 32) = 70.13, *p* < 0.001; *F*(2, 8) = 82.92, *p* < 0.001; *F*(16, 64) = 2.13, *p* = 0.02. However, this qualitative pattern of results can be predicted by both the leak and the no-leak models. This is easiest to understand from the fact that in the separate condition, there are eight instead of four locations, and therefore chance performance for change localization is lower, even if there is no leak. However, in the leak model, the gap between the together-*N* and separate conditions is expected to be larger. To make this concrete, we implemented the leak model by postulating that a fixed proportion of a feature's resource is leaked. We find that for every individual subject, the AIC of the no-leak model is higher than that of the leak model, on average by 62 ± 17. We estimate that 38 ± 13% of orientation resource and 21 ± 5% of color resource is leaked. However, a model in which we enforce the same leak rate for both features fits nearly as well as the leak model with independent leak per feature (AIC difference: 1.9 ± 1.6), and this leak rate is estimated at 31 ± 4%.
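The AIC comparison reported above can be illustrated with a short sketch. It assumes that each trial contributes a Bernoulli term based on the model's predicted probability of a correct response, which is one natural reading of the statement that responses are fully characterized by being correct or not. It also uses the standard definition AIC = 2*k* − 2 ln *L*. The clipping constant `eps`, which guards against Monte Carlo estimates of exactly 0 or 1, is an assumption of this sketch.

```python
import math

def bernoulli_log_likelihood(p_correct, correct, eps=1e-6):
    """Sum over trials of log p(correct_t) or log(1 - p(correct_t)).

    p_correct: model-predicted probability correct per trial.
    correct:   observed outcomes per trial (truthy means correct).
    """
    total = 0.0
    for p, c in zip(p_correct, correct):
        p = min(max(p, eps), 1.0 - eps)  # ASSUMPTION: clip to avoid log(0)
        total += math.log(p) if c else math.log(1.0 - p)
    return total

def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a better model."""
    return 2.0 * n_params - 2.0 * log_likelihood

def aic_difference(ll_a, k_a, ll_b, k_b):
    """AIC of model A minus AIC of model B; positive values favor model B."""
    return aic(ll_a, k_a) - aic(ll_b, k_b)
```

For instance, with hypothetical maximum log-likelihoods of −100 for a four-parameter model and −95 for a six-parameter model, `aic_difference(-100, 4, -95, 6)` is positive, so the richer model would be preferred despite its two extra parameters.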

*N* condition. The difference in performance between the separate and together-2*N* conditions indicates that separate is not simply a special case of a condition with double the set size as in together-*N*. This confirms that in the separate condition, although feature memory resource might “leak,” it is still far from being equally allocated to all objects regardless of relevance.

*even if encoding precision per stimulus is independent of the number of stimuli* (Eckstein, 1998; Nolte & Jaarsma, 1967; Verghese, 2001). The reason is that the noise from the distractors increasingly “drowns out” the target signal. In the present study, we avoided this pitfall by using appropriately different decision rules in Condition A (single feature) and Condition B (two relevant features) in the models. Further work is needed to distinguish between these possibilities.

*package* for resources, in contrast to Brady et al.'s [2011] term *bundle* for features). Within the multifeature package, resource remains feature-specific, and if the object does not have one of the features, the corresponding resource from the package ends up being wasted. Finally, the brain utilizes a smart decoder, in the sense that the signal arising from resource being allocated to an irrelevant feature dimension can be ignored during decision-making.

*N*” condition because the *packaged* resource is distributed over double the number of objects, even though targeted resource is distributed over the same number of objects. This would be consistent with the results of Fougnie et al. (2010), Lee and Chun (2001), and Olson and Jiang (2002) and corresponds to an object benefit (Jiang et al., 2000) for the together-*N* condition. Moreover, precision in the separate condition is higher than in the together-2*N* condition because the *targeted* resource is distributed over half the number of objects, even though packaged resource is distributed over the same number of objects. This is consistent with one comparison in Marshall and Bays (2013) and with classic change detection results (Olson & Jiang, 2002). However, it is inconsistent with another comparison in Marshall and Bays (2013) that used a condition in which two sequentially presented displays differed by which feature was relevant (e.g., color had to be remembered for the stimuli in the first display, orientation for those in the second). Performance was lower when the stimuli possessed an irrelevant feature in addition to their relevant feature; the discrepancy with our framework might be due to the fact that relevance switched mid-trial, causing targeted resource to be allocated to an irrelevant feature.

*N* conditions in Experiment 4, in which set size was 8. We leave the study of potential binding errors in these conditions for future work.

*Handbook of mathematical functions*. New York: Dover Publications.

*Journal of Neuroscience*, 34 (10), 3632–3645.

*Scientific Reports*, 6: 19203.

*Journal of Vision*, 9 (10): 7, 1–11, doi:10.1167/9.10.7.

*Science*, 321 (5890), 851–854.

*Journal of Experimental Psychology: Human Perception and Performance*, 23 (2), 353–369.

*Journal of Vision*, 11 (5): 4, 1–34, doi:10.1167/11.5.4.

*Spatial Vision*, 10, 433–436.

*Journal of Experimental Psychology: Human Perception and Performance*, 42 (11), 1903–1922.

*Vision Research*, 107, 76–85.

*Psychological Science*, 9 (2), 111–118.

*Journal of Experimental Psychology: Human Perception and Performance*, 39 (3), 611–615.

*Journal of Vision*, 11 (12): 3, 1–12, doi:10.1167/11.12.3.

*Journal of Vision*, 10 (12): 27, 1–11, doi:10.1167/10.12.27.

*Nature Communications*, 3, 1229.

*Journal of Experimental Psychology: Learning, Memory, and Cognition*, 41 (2), 325–347.

*Journal of Experimental Psychology: Human Perception and Performance*, 35 (4), 1140–1160.

*Journal of Experimental Psychology: Learning, Memory, and Cognition*, 26 (3), 683–702.

*PLoS ONE*, 7 (6), e40216.

*PLoS Computational Biology*, 9 (2), e1002927.

*Journal of Vision*, 12 (3): 13, 1–12, doi:10.1167/12.3.13.

*Perception & Psychophysics*, 63 (2), 253–257.

*Nature*, 390 (6657), 279–281.

*Trends in Cognitive Sciences*, 17 (8), 391–400.

*Neuropsychologia*, 49 (6), 1632–1639.

*Journal of Vision*, 9 (11): 3, 1–30, doi:10.1167/9.11.3.

*Nature Neuroscience*, 17, 347–356.

*Journal of Vision*, 13 (2): 21, 1–13, doi:10.1167/13.2.21.

*Journal of Experimental Psychology: Human Perception and Performance*, 7 (1), 141–150.

*M* orthogonal signals. *Journal of the Acoustical Society of America*, 41 (2), 497–505.

*Memory and Cognition*, 41, 1212–1227.

*Psychological Review*, 124 (1), 21–59.

*Perception & Psychophysics*, 64 (7), 1055–1067.

*Journal of Vision*, 14 (2): 2, 1–15, doi:10.1167/14.2.2.

*The effects of visual noise*. Cambridge, UK: Cambridge University.

*Journal of Vision*, 13 (2): 1, 1–11, doi:10.1167/13.2.1.

*Journal of Vision*, 16 (5): 10, 1–8, doi:10.1167/16.5.10.

*Journal of Vision*, 15 (3): 2, 1–27, doi:10.1167/15.3.2.

*Psychological Review*, 119 (4), 807–830.

*Psychological Review*, 121 (1), 124–149.

*Proceedings of the National Academy of Sciences, USA*, 109 (22), 8780–8785.

*Neuron*, 31 (4), 523–535.

*Journal of Experimental Psychology: Human Perception and Performance*, 27 (1), 92–114.

*Journal of Experimental Psychology: General*, 131 (1), 48–64.

*Journal of Vision*, 4 (12): 11, 1120–1135, doi:10.1167/4.12.11.

*Psychonomic Bulletin & Review*, 15 (1), 223–229.

*Journal of Experimental Psychology: Human Perception and Performance*, 28 (2), 458–468.

*Journal of Neuroscience*, 30 (42), 14020–14028.

*Psychonomic Bulletin & Review*, 19 (2), 218–224.

*Nature*, 453 (7192), 233–235.

$p(L \mid \mathbf{x}, \mathbf{y})$ of change location $L$ given the measurement vectors $\mathbf{x}$ and $\mathbf{y}$, and choose the location for which this posterior is highest. Denoting by a binary variable $C_L$ whether the change occurred at the $L$th location, we compute the posterior as

$L$th location is proportional to the likelihood ratio of change occurrence at that location considered independently from all other locations. This likelihood ratio is our decision variable, and we denote it by $d_L$:

$\boldsymbol{\theta}$, $\boldsymbol{\varphi}$, and in the case of $C_L = 1$, also $\Delta$:

$L$ for which $d_L$ is highest.

**Appendix B: Replication of Experiment 2**

*N* = 8). Each display contained eight colored ellipses, equally spaced around an imaginary circle. Performance was similar between Conditions B and C (Figure A1). The AIC of Model 5/6 was higher than that of Model 3 (by 81 ± 19). This is consistent with Experiment 2.