September 2009
Volume 9, Issue 10
Research Article  |   September 2009
The precision of visual working memory is set by allocation of a shared resource
Journal of Vision September 2009, Vol.9, 7. doi:https://doi.org/10.1167/9.10.7
      Paul M. Bays, Raquel F. G. Catalao, Masud Husain; The precision of visual working memory is set by allocation of a shared resource. Journal of Vision 2009;9(10):7. https://doi.org/10.1167/9.10.7.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

The mechanisms underlying visual working memory have recently become controversial. One account proposes a small number of memory “slots,” each capable of storing a single visual object with fixed precision. A contrary view holds that working memory is a shared resource, with no upper limit on the number of items stored; instead, the more items that are held in memory, the less precisely each can be recalled. Recent findings from a color report task have been taken as crucial new evidence in favor of the slot model. However, while this task has previously been thought of as a simple test of memory for color, here we show that performance also critically depends on memory for location. When errors in memory are considered for both color and location, performance on this task is in fact well explained by the resource model. These results demonstrate that visual working memory consists of a common resource distributed dynamically across the visual scene, with no need to invoke an upper limit on the number of objects represented.

Introduction
Understanding the nature of visual working memory—sometimes referred to as visual short term memory (VSTM)—is fundamental to understanding aspects of visual perception (O'Regan, 2001; Simons & Rensink, 2005), attention (Awh & Jonides, 2001; Bundesen & Habekost, 2008; de Fockert, Rees, Frith, & Lavie, 2001; Lepsien & Nobre, 2007; Soto & Humphreys, 2006), and integration of visual information across eye movements (Henderson, 2008; Irwin, 1991). Deficits in visual working memory have also been linked to damage to parietal and frontal brain regions and are associated with disorders of visual perception and attention (D'Esposito & Postle, 1999; Logie & Della Sala, 2005; Mannan et al., 2005; Müller & Knight, 2006). Thus, the mechanisms underlying visual working memory are central to understanding several key brain functions and their disorders. 
A long-standing model of visual working memory holds that three or four independent memory “slots” each store information about a single visual item (Cowan, 2005; Luck & Vogel, 1997; Pashler, 1988; Vogel, Woodman, & Luck, 2001). However, this assumption of independent storage has recently been challenged by studies examining the precision with which items are recalled. This new approach has revealed that the resolution with which a visual item is maintained depends critically on how many other items are concurrently held in memory (Alvarez & Cavanagh, 2004; Awh, Barton, & Vogel, 2007; Bays & Husain, 2008; Wilken & Ma, 2004), such that increasing numbers of objects are stored with increasing variability (“noise”). For a simple “slot” model, in contrast, the number of items to be remembered should not influence performance until the capacity limit is exceeded. 
An alternative account proposes that a single memory resource must be shared out between visual items. According to this hypothesis, the precision with which an item is stored is determined by the fraction of total resources allocated to it (Bays & Husain, 2008). As more items are stored, less resource is available per item, so the resolution with which each object is stored decreases. Unlike the slot model, this resource model does not predict any fixed upper limit on the number of items stored. Indeed, the resource model successfully predicts the appearance of a capacity limit in change detection tasks, which was previously taken as evidence for a fixed number of slots. Moreover, by allowing the resource to be flexibly distributed between items, it can also explain how visually salient items, or those that are the targets of forthcoming eye movements, are remembered with enhanced precision (Bays & Husain, 2008). 
However, in a recent study, Zhang and Luck (2008) presented results from a color report task (previously described by Wilken & Ma, 2004), which appear to provide important new evidence in favor of the fixed slot model. They observed responses on the report task that could not be explained by simple variability in memory for color. In their analysis, these apparently random responses were interpreted as evidence for a fixed upper limit on the number of items that can simultaneously be held in visual working memory. According to the authors, the error distribution on such a task comprises a mixture of two components: a Gaussian centered on the correct color of the probed item and a uniform distribution (due to “guessing”) spread equally over all possible responses. Zhang and Luck propose that this latter random component corresponds to a proportion of trials on which no information is stored about the target color, as the result of exceeding an upper limit on the number of items that can be maintained. 
This paradigm provides a crucial test for slot versus resource models of working memory. Here we examine the requirements of the report task in detail. While previously considered a simple test of memory for color, we show that performance also depends critically on memory for object locations. Because locations, like colors, are stored with error, the resource model in fact predicts the “random” responses observed by Zhang and Luck (2008), without any need to invoke an upper limit on items stored. We demonstrate a further component of error on the task related to the duration of exposure to the stimulus, revealing a separate performance limit that may reflect a maximum rate at which items can be encoded into memory. These findings challenge the view that a fixed number of object representations underlies visual working memory. 
Methods
Experimental protocol
Twelve subjects (seven male, five female; age 18–28 years) participated in the study after giving informed consent. All subjects reported normal color vision and had normal or corrected-to-normal visual acuity. Stimuli were displayed on a 21-in. CRT monitor at a viewing distance of 60 cm. Eye position was monitored online at 1000 Hz using a frame-mounted infra-red eye tracker (SR Research Ltd., Canada). The design of the experiment was identical to that described in Zhang and Luck (2008) with the following modifications: fixation was monitored; we chose a more evenly spaced set of array sizes; and we tested performance at a range of different durations of the sample array. 
Each trial began with the presentation of a central fixation cross (white, 0.75° diameter) against a gray background. Once a stable fixation was recorded within 2° of the cross, a sample array was presented, consisting of 1, 2, 4, or 6 colored squares (2° × 2°). Each color was independently chosen at random from a color wheel comprising a circular subset of the CIE L*a*b* color space (for full details, see Zhang & Luck, 2008). Each square was randomly positioned at one of eight possible locations on an invisible circle, radius 4.5°, centered on the fixation cross. The sample array was presented for 100, 500, or 2000 ms, followed by a delay period of 900 ms in which the display was blank except for the fixation cross. A test array was then presented containing the color wheel (randomly rotated) and an outlined square at the location of each item from the sample array. One target location was indicated by a thicker outline, and subjects were instructed to report the color they remembered seeing at that location by using a computer mouse to select a point on the color wheel. 
Each subject completed a total of 600 trials. The four different array sizes were tested in separate blocks of trials, with the order of completion randomized between subjects. Each block consisted of 50 trials at each of the three different display times, presented in a randomized sequence. Trials were repeated if gaze deviated more than 2° from the central cross during presentation of the sample array. 
Analysis
A measure of error was obtained on each trial by calculating the angular deviation on the color wheel between the color reported by the subject and the correct target color. For each combination of subject and array size, we calculated precision as the reciprocal of the standard deviation of the error, as in Bays and Husain (2008). Because the tested parameter space was circular, we used the definition of standard deviation for circular data given by Fisher (1993) and subtracted from the precision estimate the value expected by chance (i.e., if the subject had responded at random on each trial). 
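The precision measure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes the circular standard deviation takes the form sqrt(-2 ln R), where R is the mean resultant length, and it estimates the chance-level value by Monte Carlo simulation of uniformly random responses, which is an assumption of this sketch. The function names `circular_sd`, `chance_precision`, and `precision` are hypothetical.

```python
import math
import random

def circular_sd(errors):
    """Circular standard deviation, sqrt(-2 ln R), of angles in radians."""
    n = len(errors)
    c = sum(math.cos(e) for e in errors) / n
    s = sum(math.sin(e) for e in errors) / n
    r = min(math.hypot(c, s), 1.0)  # mean resultant length, clamped for rounding
    return math.sqrt(-2.0 * math.log(r))

def chance_precision(n_trials, n_sim=500, seed=0):
    """Assumed chance correction: mean 1/SD over simulated sets of random responses."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        sample = [rng.uniform(-math.pi, math.pi) for _ in range(n_trials)]
        total += 1.0 / circular_sd(sample)
    return total / n_sim

def precision(errors, seed=0):
    """Reciprocal of the circular SD of the errors, minus the chance-level value."""
    return 1.0 / circular_sd(errors) - chance_precision(len(errors), seed=seed)
```

With this correction, a set of purely random responses yields a precision close to zero, matching the convention that zero indicates chance performance.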
In previous results (Bays & Husain, 2008), a power law was found to accurately capture the relationship between the precision with which an item is stored (P) and the fraction of memory resources available to store it (R). We fit the same model to the results from the current task: $P \propto R^{\lambda}$, where R = 1/N is simply the reciprocal of the number of items in the sample array. The curve shown in Figure 1b corresponds to the mean parameters obtained by a non-linear least squares fit to each subject's data. This estimate of precision simply reflects the degree of variability in subjects' responses and hence is agnostic as to the distribution and source of these errors. 
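A least squares fit of the power law can be sketched as below. The text does not specify the optimizer, so the approach here is an assumption: writing the model as P = k R^λ, the scale k has a closed-form solution for any fixed λ, and the remaining one-dimensional search over λ uses a golden-section method. The name `fit_power_law` and the search bounds are hypothetical.

```python
import math

def fit_power_law(n_items, precisions, lam_lo=0.0, lam_hi=5.0, tol=1e-8):
    """Least squares fit of P = k * R**lam with R = 1/N; returns (k, lam)."""
    r = [1.0 / n for n in n_items]

    def best_k(lam):
        # For fixed lam the model is linear in k, so k has a closed form.
        num = sum(p * ri ** lam for p, ri in zip(precisions, r))
        den = sum(ri ** (2.0 * lam) for ri in r)
        return num / den

    def sse(lam):
        k = best_k(lam)
        return sum((p - k * ri ** lam) ** 2 for p, ri in zip(precisions, r))

    # Golden-section search; assumes the error has a single minimum in the bounds.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lam_lo, lam_hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    lam = (a + b) / 2.0
    return best_k(lam), lam
```

Applied to each subject's precision values at array sizes 1, 2, 4, and 6, a routine of this kind would return one (k, λ) pair per subject, which could then be averaged as described for Figure 1b.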
Figure 1
 
Precision of visual working memory in a color report task. (a) Subjects were briefly presented with a sample array of 1–6 colored squares; exposure duration was varied across trials (100–2000 ms). After a blank period (900 ms), a test array was presented in which the location of a randomly selected sample item was highlighted. Subjects reported the remembered color corresponding to the highlighted location by clicking on a color wheel. (b) Precision as a function of the number of items in the sample array (N). Precision is defined as the reciprocal of the standard deviation of the error in subjects' responses: zero indicates chance performance. Error bars indicate SEM. The blue line indicates the best fit to the data of a power law relating precision to the fraction of resources available per item (1/N). (c) Three models for the distribution of responses on the color report task, illustrated for a single trial with a sample array of two items (one red, one green) and a test array that cues the location of the red item. Variability in memory for color alone would predict a Gaussian distribution of responses centered on the actual color at the target location (top). In the model proposed by Zhang and Luck (2008) (middle), a proportion of responses instead come from a uniform distribution in which colors are chosen at random (shown in green). Alternatively (bottom), variability in memory for location may cause subjects to mistake which item was at the target location on some trials, in which case a proportion of responses (shown in green) will come from a Gaussian centered on the non-target color.
A probabilistic model of performance on this task has previously been proposed (Zhang & Luck, 2008) in which there are two possible sources of error on each trial: Gaussian variability in memory for the target color and a fixed probability of simply guessing at random. This model can be described as follows: 
$$p(\hat{\theta}) = (1 - \gamma)\,\phi_{\sigma}(\hat{\theta} - \theta) + \gamma\,\frac{1}{2\pi} \qquad (1)$$
where $\theta$ is the target color value (in radians), $\hat{\theta}$ is the reported color value, and $\gamma$ is the proportion of trials on which the subject responds at random. $\phi_{\sigma}$ denotes the circular analogue of the Gaussian distribution (the Von Mises distribution) with mean of zero and standard deviation $\sigma$.
In this study, we propose an additional source of error: a certain probability on each trial of misremembering which item was at the probed location. On these trials, responses are drawn from a Gaussian distribution centered on the color value of one of the non-target items. The standard deviation of this Gaussian will be the same as for responses to the target item: target and non-target colors will on average be stored with the same precision because it is not known at the time of encoding which item will become the target. To assess the contribution of this additional source of error to subjects' responses, we added a third component to the model:  
$$p(\hat{\theta}) = (1 - \gamma - \beta)\,\phi_{\sigma}(\hat{\theta} - \theta) + \gamma\,\frac{1}{2\pi} + \beta\,\frac{1}{m}\sum_{i=1}^{m}\phi_{\sigma}(\hat{\theta} - \theta_{i}^{*}) \qquad (2)$$
where $\beta$ is the probability of misremembering the target location and $\{\theta_{1}^{*}, \theta_{2}^{*}, \ldots, \theta_{m}^{*}\}$ are the color values of the m non-target items. Maximum likelihood estimates of the parameters σ, β, and γ were obtained separately for each subject and experimental condition using a non-linear optimization algorithm (Nelder & Mead, 1965). The optimization procedure was repeated from a range of different initial parameter values to ensure that global maxima were obtained. For comparison purposes, we also fit Zhang and Luck's two-component model (Equation 1 above) to our data using the same procedure. 
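The three-component density and its log-likelihood can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the Von Mises component is parameterized by its concentration κ instead of the standard deviation σ (the conversion between the two is not shown), the Bessel function is evaluated by a truncated power series suitable for moderate κ, and the names `bessel_i0`, `von_mises_pdf`, `mixture_pdf`, and `log_likelihood` are hypothetical. Setting β to zero gives the two-component model.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0 (truncated series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def von_mises_pdf(x, kappa):
    """Von Mises density with mean zero and concentration kappa."""
    return math.exp(kappa * math.cos(x)) / (2.0 * math.pi * bessel_i0(kappa))

def mixture_pdf(response, target, non_targets, kappa, gamma, beta):
    """Density of a response under the target, uniform, and non-target components."""
    p = (1.0 - gamma - beta) * von_mises_pdf(response - target, kappa)
    p += gamma / (2.0 * math.pi)
    if non_targets:  # with no non-target items, beta should be set to 0
        swap = sum(von_mises_pdf(response - nt, kappa) for nt in non_targets)
        p += beta * swap / len(non_targets)
    return p

def log_likelihood(trials, kappa, gamma, beta):
    """Sum of log densities over (response, target, non_targets) trials."""
    return sum(math.log(mixture_pdf(r, t, nts, kappa, gamma, beta))
               for r, t, nts in trials)
```

A maximum likelihood fit would maximize `log_likelihood` over the three parameters from several starting points; the optimizer itself is omitted from this sketch.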
Hypotheses regarding the effects of experimental parameters (array size, exposure duration) on the different components of the model were tested by ANOVA and t-tests on the maximum likelihood parameters obtained for each subject and condition. 
Results
The color report task is illustrated in Figure 1a. On each trial, a subject is briefly presented with an array of colored squares surrounding a central fixation point. After a short blank period, one array location is highlighted and the subject must report the color that was at that location (the target color) by clicking at the appropriate position on a color wheel. The angular deviation on the wheel between the selected and the target color values is taken as a measure of the error in the subject's memory for the sample display. 
The overall pattern of performance we observed on this task (Figure 1b) reveals that the precision with which each item is recalled falls significantly with increases in the number of objects presented (t(11) > 5.5, p < 0.001). Note that performance falls even when the number of items increases from one to two, inconsistent with a model in which each item is stored in a separate “slot.” The observed rate of decline in precision (falling by 49% between one and two items) is also significantly greater (t(11) = 5.0, p < 0.001) than predicted by averaging of multiple slots storing the same item (29%; see Zhang & Luck, 2008). Performance remained above chance (indicated by zero precision in Figure 1b) at all sample sizes tested (t(11) > 8.9, p < 0.001) and for every individual subject. 
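The 29% figure is consistent with the following arithmetic, offered as a sketch rather than a restatement of Zhang and Luck's derivation. It assumes that a single item is stored in all K slots, that two items receive K/2 slots each, and that averaging n independent noisy copies reduces the standard deviation by a factor of √n; the symbol σ_slot (the standard deviation of a single slot) is introduced here for illustration.

```latex
\sigma_{1} = \frac{\sigma_{\mathrm{slot}}}{\sqrt{K}}, \qquad
\sigma_{2} = \frac{\sigma_{\mathrm{slot}}}{\sqrt{K/2}} = \sqrt{2}\,\sigma_{1}, \qquad
\frac{P_{2}}{P_{1}} = \frac{\sigma_{1}}{\sigma_{2}} = \frac{1}{\sqrt{2}} \approx 0.71
```

Under these assumptions precision falls by 1 − 1/√2 ≈ 29% between one and two items, well short of the 49% fall observed.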
The results are accurately captured by a power law relating precision to the proportion of resources available per item (Figure 1b, blue line; see Methods). These findings are consistent with a resource model in which the precision with which a visual item is remembered depends on the fraction of total working memory resources allocated to its storage. So, as the number of items in the display increases, the precision with which any individual item is remembered will decrease. The power function is also consistent with previous results for memory of visual locations and orientations (Bays & Husain, 2008). 
Zhang and Luck (2008) did not show the overall performance of subjects in this way. Instead they presented their data in terms of two distinct components that might underlie errors on this task. The first component corresponds to errors in the internal representation of a stored color. This internal error should be distributed as a Gaussian, with decreasing precision corresponding to an increase in the Gaussian width (Dayan & Abbott, 2001; Seung & Sompolinsky, 1993; Vogels, 1990). Assuming the sequence of colors on the color wheel is approximately isomorphic with the internal representation of color space, we would expect a similar distribution to be found in subjects' responses on the color task (as illustrated in Figure 1c, top). 
Figure 2a shows the distribution of errors on our task obtained for sample arrays of one to six items. When only a single item has to be remembered (far left), responses indeed take on an approximately Gaussian distribution. However, as Zhang and Luck (2008) observed, this simple description does not fully capture responses on this task when there are multiple items in the sample array. While the central portion of the distribution remains approximately Gaussian for larger sample sizes (e.g., six items, far right), the probability of the largest errors does not fall to zero as expected. To account for this, Zhang and Luck invoked a second “guessing” component, corresponding to a proportion of trials on which no information is stored about the target color, as a result of exceeding an upper limit on the number of items that can be stored. Thus, they modeled their data as a Gaussian centered on the correct color of the target item (shown in blue in Figure 1c, middle panel) and a uniform (guessing) distribution spread equally over all possible responses (green in Figure 1c). 
Figure 2
 
Distribution of errors relative to target and non-target colors. (a) Frequency of response as a function of the difference between reported color value and target color value, for varying numbers of items (N) in the sample array. The long tails of the distribution observed for larger sample sizes (e.g., six items, far right) are inconsistent with the simple Gaussian model shown in Figure 1c, top, but are consistent with either of the other models shown in Figure 1c. Colored lines indicate the response probabilities predicted by a mixture model combining color error, location error, and random components (see main text and Figure 3). (b) Frequency of responses as a function of the difference between reported color value and each non-target color value. The strong central tendency observed for larger numbers of items (N) is not predicted by Zhang and Luck's (2008) model (Figure 1c, middle) but is consistent with errors in memory for location, as illustrated in Figure 1c (bottom). Colored line indicates the prediction of the three-component model. Error bars indicate SEM.
However, this interpretation overlooks a crucial aspect of the color report task. Because a subject's response is cued by indicating the previous position of one of the array items, the task requires subjects to remember not only the color of each item in the sample array, but also its location. A resource model predicts error in the stored representations of both color and position. When the probe is presented, subjects must compare its location with the location of each array item held in memory to determine which item's color they should report. Errors in memory for item locations will therefore result in subjects incorrectly responding with the remembered color of one of the non-target items. 
The consequences of this are illustrated in Figure 1c (bottom), for the simple case in which two items are presented. The uncertainty in color will again result in a Gaussian distribution of error (shown in blue) centered on the target color value. The uncertainty in location will result, on a certain proportion of trials, in responses corresponding to the remembered color of the other, non-target item: these responses will be described by a second Gaussian distribution of responses (green) centered on the non-target color. 
Because the array colors are selected at random, all relative positions of target and non-target colors on the color wheel are equally likely. Therefore, if error is calculated relative to the target color on each trial (as in Figure 2a; Zhang & Luck, 2008), responses due to errors in memory for location will be scattered evenly across the color wheel, resulting in an error distribution indistinguishable from the one predicted by Zhang and Luck's (2008) model (Figure 1c, middle). However, the two alternatives can easily be distinguished by instead calculating the frequency of responses relative to the non-target color values on each trial. Because non-target and target color values are uncorrelated, Zhang and Luck's model predicts that this distribution will be uniform. 
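The argument above can be illustrated with a small simulation. The sketch below assumes two-item trials on which the subject always reports the non-target color with Gaussian noise, an extreme case chosen for clarity; the function names and parameter values are hypothetical. The mean resultant length summarizes how peaked each error histogram is.

```python
import math
import random

def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def resultant_length(angles):
    """Mean resultant length: near 0 for a flat histogram, near 1 for a sharp peak."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    return math.hypot(c, s)

def simulate_swap_errors(n_trials=5000, sd=0.3, seed=1):
    """Errors relative to target and to non-target when the non-target is reported."""
    rng = random.Random(seed)
    rel_target, rel_non_target = [], []
    for _ in range(n_trials):
        target = rng.uniform(-math.pi, math.pi)      # colors chosen independently
        non_target = rng.uniform(-math.pi, math.pi)
        response = wrap(non_target + rng.gauss(0.0, sd))
        rel_target.append(wrap(response - target))
        rel_non_target.append(wrap(response - non_target))
    return rel_target, rel_non_target
```

In this simulation the errors measured relative to the target are spread almost evenly around the wheel, whereas the same responses measured relative to the non-target cluster tightly around zero.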
Figure 2b shows the results of this analysis on our task: as predicted by the resource model, responses centered on the color values of non-target items are more frequent than expected by chance, and this central tendency becomes more pronounced as the number of items in the sample array increases (two items, t(11) = 0.74, p = 0.47; four items, t(11) = 2.7, p = 0.02; six items, t(11) = 5.9, p < 0.001). Clearly, therefore, errors in identifying which item's color should be reported influence performance on this task. 
The existence of these non-target responses may provide an alternative account for the non-Gaussian distribution of error observed in Figure 2a, without implying any upper limit on the number of items that can be stored in memory. To provide a fair comparison of the two alternatives, we fit data from the color report task with a mixture model that decomposed the response distribution into three components (illustrated in Figures 3a–3c). The first component corresponded to errors in memory for color and consisted of a Gaussian centered on the target color value (Figure 3a). The second component captured errors in memory for location and consisted of a Gaussian centered on each non-target color (Figure 3b). The final component consisted of a uniform distribution (Figure 3c), corresponding to the probability of responding at random, as in Zhang and Luck's (2008) model. 
Figure 3
 
Three sources of error in the report task and the effect of sample duration. (a–c) Subject responses on the memory task were decomposed into three separate components, indicated by the shaded regions: (a) a Gaussian distribution with standard deviation σ centered on the target color value (T), corresponding to error in memory for color; (b) Gaussian distributions with the same width centered on each non-target color value (NT), corresponding to errors in memory for location; (c) a uniform distribution, capturing random responses unrelated to any of the sample colors. (d–f) Maximum likelihood parameters of the three-component model, as a function of number of items in the sample array (mean across sample durations). (d) The standard deviation (σ) increases with array size, indicating increasing variability in memory for color; (e) the proportion of responses corresponding to non-targets increases with array size, indicating increasing variability in memory for location; (f) the proportion of random responses is shown in black; the gray dashed lines indicate the proportions of random responses expected for a fixed upper limit of 2, 3, or 4 items. (g–i) Effect of sample duration on each parameter of the model: light gray symbols and dotted line, 100 ms; dark gray symbols and dashed line, 500 ms; black symbols and solid line, 2 s. Sample duration does not affect variability in memory for color (g) or location (h) but has a substantial effect on the frequency of random responses (i). Error bars in this figure indicate within-subject SEM, as in Zhang and Luck (2008).
The results of this analysis are shown in Figures 3d–3f. As the number of items in the sample array increased from one to six, the standard deviation of the error distribution centered on the target color increased monotonically (Figure 3d; F(3,33) = 18.3, p < 0.001). This indicates a decrease in the precision with which each item's color was stored, consistent with the predictions of the resource model. 
The resource model also predicts that error in memory for item locations will increase with increasing number of items. In agreement with this prediction, the proportion of responses captured by the non-target component increased monotonically with array size (Figure 3e; F(3,33) = 50.9, p < 0.001). 
The remaining uniform component (Figure 3f), corresponding to random responses, represents a much smaller proportion of trials than in Zhang and Luck's (2008) analysis. For six-item arrays, for example, Zhang and Luck estimated that random guesses made up 62% of responses (the corresponding result applying their analysis to our data was 48%), but in the current analysis only 14% of responses are explained by the uniform component. This indicates that a substantial proportion of the responses Zhang and Luck attributed to random guessing were in fact instances of subjects' misremembering which item was at the target location. In comparison to the three-component model, Zhang and Luck's model significantly overestimated the frequency of random responding at all set sizes where non-targets were present (t(11) > 5.8, p < 0.001). Furthermore, the actual frequency of random responses observed here is not consistent with an upper limit on the number of items stored. 
To illustrate this last point, the predicted frequencies of guessing based on an upper limit of two, three, or four items are illustrated by the dashed lines in Figure 3f. Zhang and Luck's (2008) model predicts that, for array sizes up to and including the limit, all items should be stored in memory and subjects should produce no random responses. Instead we observed a highly significant increase in the random component as a result of the change from one to two items (from 1% to 5%; t(11) = 3.1, p = 0.009). Once an upper limit on number of items has been exceeded, there should be a rapid increase in the frequency of guessing; the opposite result was observed: the uniform component appeared to saturate at about four items (16% versus 14% for six items; t(11) = 0.7, p = 0.50). 
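The guessing frequencies implied by a fixed upper limit can be sketched as below. The text states only that no guessing is predicted up to and including the limit; the form used beyond the limit, 1 − K/N, is an assumption of this sketch that follows from supposing a random K of the N items are stored on each trial. The function name `predicted_guess_rate` is hypothetical.

```python
def predicted_guess_rate(n_items, capacity):
    """Proportion of random responses expected under a fixed item limit."""
    if n_items <= capacity:
        return 0.0  # every item fits in memory, so no guessing is predicted
    # Assumed: a random `capacity` of the items are stored, so the probed
    # item is missing from memory with probability 1 - capacity / n_items.
    return 1.0 - capacity / n_items

# Predicted guess rates for the array sizes used in this study.
for capacity in (2, 3, 4):
    rates = [predicted_guess_rate(n, capacity) for n in (1, 2, 4, 6)]
    print(capacity, [round(r, 2) for r in rates])
```

Under this assumption a limit of two items predicts guessing on half of four-item trials and two thirds of six-item trials, far above the 16% and 14% estimated for the uniform component.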
An alternative explanation for this small proportion of random responses is revealed by examining the effects of sample exposure time. Zhang and Luck (2008) presented each sample array of colored squares for only 100 ms: conceivably not enough time to fully encode all the visual information in the array into memory. To test this possibility, in the current study we parametrically varied the sample display time between 100 ms and 2 s: the consequences of this manipulation for each of the three model components are shown in Figures 3g–3i.
Varying sample time had no consistent effect on errors in memory for color (Figure 3g; F(2,22) = 1.7, p = 0.21) or for location (Figure 3h; F(2,22) = 0.2, p = 0.86). However, increasing the duration of the sample array dramatically decreased the frequency of random responses (Figure 3i; F(2,22) = 6.1, p = 0.008). This strongly suggests that the tendency for subjects to respond at random on a small proportion of trials is the result of incomplete encoding of items into memory rather than an upper limit on how many items can be stored. At the longest presentation time tested (2 s; black symbols and solid line in Figure 3i), random responses made up on average less than 6% of the total response distribution, indicating that performance on the task could be accounted for almost entirely by the combination of variability in color and position predicted by a resource model. 
Discussion
The recent controversy surrounding the nature of working memory has focused on what the precision of recall reveals about the mechanisms underlying memory (Alvarez & Cavanagh, 2004; Awh et al., 2007; Bays & Husain, 2008, 2009; Cowan & Rouder, 2009; Wilken & Ma, 2004; Zhang & Luck, 2008). In this study, we examined performance on a task in which subjects were asked to recall the color of an object displayed at a specified location. Our findings show that the precision with which subjects report this color declines with increasing number of objects in the memory array. This finding is consistent with a model of visual working memory in which a common resource must be shared out between all items in the display (Bays & Husain, 2008). In this model, the precision with which an item is stored depends on the fraction of the total resource allocated to its storage. Because observers do not know which item will be probed when they view an array, the resource will, on average, be shared out equally among all items; hence, performance declines with increasing number of items. 
Contrary to this view, performance on this same task has recently been put forward as strong evidence for the existence of a fixed number of discrete object representations or “slots” in visual working memory (Zhang & Luck, 2008). The observed effect of the number of objects on precision—and in particular the large difference in precision between one and two item arrays (Figure 1b)—cannot be reconciled with the traditional model in which each item is stored in a separate slot (Cowan, 2005; Luck & Vogel, 1997; Pashler, 1988; Vogel et al., 2001). Instead, Zhang and Luck (2008) propose a modification to the original slot model whereby slots can “double up” and store the same item, combined with an averaging process to obtain a single estimate per item. This modification allows the slot model to behave like a quantized version of the resource model and hence exhibit the same dependence of precision on the number of items stored, albeit at substantial cost to the parsimony and conceptual power of the original model. One might question the utility of the slot concept if it must be modified so that there is now no longer a one-to-one correspondence between a slot and a visual object that is represented. 
Despite making many similar predictions for behavior, this modified slot model remains fundamentally different from the resource model and has radically different implications for how the brain solves the problem of storing visual information. Understanding the nature of visual short term memory is crucial to understanding how observers perceive the world (O'Regan, 2001; Simons & Rensink, 2005), deploy attention to visual items (Awh & Jonides, 2001; Bundesen & Habekost, 2008; de Fockert et al., 2001; Lepsien & Nobre, 2007; Soto & Humphreys, 2006), or dynamically acquire information about a scene from glimpses obtained between eye movements (Henderson, 2008; Irwin, 1991). The color report task provides a key paradigm to consider and test these opposing views. 
One crucial distinction that is retained by Zhang and Luck's (2008) modified scheme is that the slot model, unlike the resource model, predicts a fixed upper limit on the number of items that can be simultaneously held in memory. In their analysis of the color report task, Zhang and Luck considered responses that could not be explained by simple Gaussian variability in memory for the target color (Figure 1c, top) to be due to random guesses (Figure 1c, middle). These random responses were interpreted as evidence for just such an upper limit on the number of items stored. According to this interpretation, random responses occur on trials where no information is stored about the probed item because the number of array items exceeds the maximum number of items that can be stored. As substantial numbers of these responses are observed even with array sizes as small as three items (Zhang & Luck, 2008, 2009), this interpretation implies that the average capacity limit is about two. 
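The logic of this inference can be made concrete with a short sketch. It assumes the simplest reading of a fixed item limit: if K of the N array items are stored and the probed item is chosen at random, the probed item is absent from memory, and the response is a guess, on a fraction 1 − K/N of trials (zero when N does not exceed K). The function name is hypothetical.

```python
def expected_guess_rate(n_items, capacity):
    """Proportion of random responses implied by a fixed item limit.

    Assumes exactly `capacity` items are stored whenever the array holds
    more than that, and that every item is equally likely to be probed.
    """
    if n_items < 1 or capacity < 0:
        raise ValueError("n_items must be >= 1 and capacity >= 0")
    return max(0.0, 1.0 - capacity / n_items)


if __name__ == "__main__":
    # Guessing at an array size of three requires a limit below three items.
    for capacity in (2, 3, 4):
        rates = [round(expected_guess_rate(n, capacity), 2) for n in range(1, 7)]
        print(capacity, rates)
```

On this reading, any guessing at an array size of three is only possible if the limit is below three items, which is why the interpretation implies an average capacity of about two.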
However, one critical factor that has previously been overlooked on this task (Wilken & Ma, 2004; Zhang & Luck, 2008, 2009) is the need for subjects to remember the locations of the array items as well as their color. Subjects are instructed to report the color of only one of the items held in memory: the item that matched the location of the probe. Therefore, subjects must compare the probe location with the remembered location of each array item to determine which color to report. The resource model predicts that locations stored in working memory will be corrupted by noise, in the same way as colors. Therefore, observers will sometimes incorrectly identify which item was at the probed location and mistakenly report the remembered color of one of the non-probed items (Figure 1c, bottom). 
Our analysis confirms that subjects are more likely to be biased in their responses by the colors of non-probed items than by chance alone (Figure 2b). Importantly, when responses to the non-targets are taken into account, we have shown that the majority of responses Zhang and Luck (2008) interpreted as random guesses are in fact due to errors in memory for location, as predicted by a resource model (Figure 3). 
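A minimal sketch of a three-component response distribution of this kind is given below: a component centered on the target color, components of the same width centered on each non-target color, and a uniform component for random responses. The sketch assumes a wrapped normal as the circular stand-in for the Gaussian, with colors expressed as angles in radians on the color wheel; the function and parameter names are hypothetical, and this is an illustration rather than the fitting code used in the study.

```python
import math

TWO_PI = 2.0 * math.pi


def wrapped_normal_pdf(x, sigma, n_wraps=5):
    """Density at angle x (radians) of a zero-mean normal wrapped onto the circle."""
    x = (x + math.pi) % TWO_PI - math.pi  # map the angle into [-pi, pi)
    total = 0.0
    for k in range(-n_wraps, n_wraps + 1):
        z = (x + TWO_PI * k) / sigma
        total += math.exp(-0.5 * z * z)
    return total / (sigma * math.sqrt(TWO_PI))


def response_density(response, target, non_targets, sigma, p_nontarget, p_random):
    """Probability density of a reported color under the three-component mixture."""
    p_target = 1.0 - p_nontarget - p_random
    if min(p_target, p_nontarget, p_random) < 0.0:
        raise ValueError("mixture proportions must be non-negative and sum to at most 1")
    if not non_targets and p_nontarget > 0.0:
        raise ValueError("p_nontarget must be 0 when there are no non-target items")
    # Component 1: error in memory for color, centered on the target.
    density = p_target * wrapped_normal_pdf(response - target, sigma)
    # Component 2: error in memory for location, shared equally across non-targets.
    for nt in non_targets:
        density += (p_nontarget / len(non_targets)) * wrapped_normal_pdf(response - nt, sigma)
    # Component 3: random responses, uniform on the color wheel.
    density += p_random / TWO_PI
    return density
```

In a maximum likelihood fit, the log of this density would be summed over trials and maximized with respect to `sigma`, `p_nontarget`, and `p_random`. Setting `p_nontarget` to zero reduces the sketch to a two-component mixture of target-centered error plus uniform guessing.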
The resource model proposes that the precision with which an item is stored is determined by the fraction of total memory resources allocated to it. This may have a very simple neural interpretation in terms of population coding: because there is substantial noise in the activity of any individual neuron, the precision of the population estimate of a sensory feature is determined by the number of neurons involved in encoding it (Dayan & Abbott, 2001; Seung & Sompolinsky, 1993; Vogels, 1990). The tuning-curve properties of neurons do not allow a single cell to simultaneously encode two different feature values; therefore, the distribution of a common memory resource in this model may, at the simplest level, correspond to the assignment of a finite pool of memory neurons to encode the different feature values in a scene. An alternative proposal, which makes very similar predictions, is that the resource corresponds to a limit on the total number of spikes expended maintaining a scene in memory (Ma & Huang, 2009). 
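The population-coding intuition can be illustrated with a toy simulation. It assumes, purely for illustration, that each neuron provides an independent estimate of the feature value corrupted by Gaussian noise and that the population estimate is the simple average of those estimates; real population codes and read-out schemes are considerably richer. All names and parameter values are hypothetical.

```python
import random
import statistics


def population_estimate_sd(n_neurons, noise_sd=1.0, n_trials=4000, seed=0):
    """Standard deviation of a population estimate formed by averaging
    n_neurons independent noisy readings of the same feature value."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    true_value = 0.0
    estimates = []
    for _ in range(n_trials):
        readings = [rng.gauss(true_value, noise_sd) for _ in range(n_neurons)]
        estimates.append(sum(readings) / n_neurons)
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    # Fewer neurons per item (as when a pool is divided among more items)
    # gives a more variable, and hence less precise, estimate.
    for n in (1, 4, 16):
        print(n, round(population_estimate_sd(n), 3))
```

Under these assumptions the variability of the estimate shrinks roughly with the square root of the number of neurons, so dividing a finite pool among more items leaves each item with a noisier representation.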
Previous results suggest that visual features on different dimensions do not compete for representation in working memory (Luck & Vogel, 1997; Wheeler & Treisman, 2002), so we predict that storage of colors and locations will depend on separate resources. Nonetheless, as the number of items stored in memory increases, the resource model predicts that error will increase in the stored representations of both color and location. This was indeed observed: both variability in memory for color and frequency of errors due to memory for location increased with increasing array size (Figure 3). 
An additional source of error that may also contribute to the non-target responses is “misbinding” (Robertson, 2003; Treisman, 1998; Treisman & Schmidt, 1982; Wolfe & Cave, 1999) in which, for example, the colors of two items become inadvertently switched in memory. In this situation, even if the subject correctly identifies which item was at the probed location, he or she will still respond with one of the non-target colors. Misbinding, in healthy people, has generally been observed only with very brief presentations (e.g., Treisman & Schmidt, 1982), implying that it is an error of encoding rather than memory, in which case we do not expect these errors to contribute substantially to our results at any but the shortest exposures. However, even if some of the responses Zhang and Luck (2008) viewed as random are in fact due to misbinding rather than location errors, this does not support the interpretation that some items have not been stored, and so is equally inconsistent with a slot model. 
A small proportion of apparently random responses could not be explained by uncertainty in either color or location. However, the frequency of these unexplained responses proved highly dependent on the presentation duration of the memory array (Figure 3i). This suggests that these errors occurred when the exposure time was too short for all the visual information in the array to be encoded into working memory (Bundesen, 1998). While previous studies have observed no advantage of increasing array duration above 100 ms for unmasked displays (Luck & Vogel, 1997; Vogel et al., 2001), these tests were based on detection of supra-threshold changes in color and were therefore insensitive to the precision with which items were stored. 
The encoding errors observed in this study showed a dependence on the number of items in the array, suggesting that individual items or features must compete for entry into memory. This finding is consistent with previous change detection results using brief masked displays (Vogel, Woodman, & Luck, 2006; Woodman & Vogel, 2005). Competition may simply result from the need to serially allocate attention to each item in order to encode it into memory (Desimone & Duncan, 1995; Treisman, 1998): if multiple items are presented very briefly, some may not have been attended by the time the display is blanked. Alternatively, encoding may depend on a resource-limited parallel process similar to the one proposed here for storage. 
At the longest exposure times, encoding errors were minimal and the distribution of responses was explained by a combination of errors in memory for color and location, as predicted by the resource model. We conclude that the high frequency of “guessing” reported on this task by Zhang and Luck (2008), and taken to indicate an upper limit on storage, was in fact the result of two factors. First, very brief presentation of the memory array may have led to incomplete encoding of some items, independent of errors in storage. Second, Zhang and Luck's analysis considered only variability in the response feature (color) and overlooked the possibility of errors in the feature by which responses were cued (location). 
In summary, we have found no evidence to support a fixed upper limit on the number of visual items that can be held in working memory, despite examining the same task previously used to argue for a “slot” model (Zhang & Luck, 2008). Our findings are equally inconsistent with the several “hybrid” models that have been proposed (Alvarez & Cavanagh, 2004; Awh et al., 2007; Xu & Chun, 2005) in which a fixed upper limit of three or four items coexists with a variable limit on total “information load” or object complexity. These models similarly predict a rapid increase in random responses once the upper limit is exceeded, a prediction that is incompatible with the current results. Instead, performance on the color report task is best explained in terms of a common working memory resource that must be distributed increasingly finely as the number of visual items increases. 
The symmetry and simplicity of the memory arrays make equal distribution of resources to each item the most likely strategy on this task. However, resources can be allocated more flexibly: in a task where attention was drawn to one item in an array by a flash, memory resources were preferentially allocated to enhance representation of the salient item, at the cost of reducing the resolution with which other items were stored (Bays & Husain, 2008). Outside of the laboratory, the complexity of natural scenes is likely to preclude an even distribution of resources, and resource allocation may similarly prioritize storage of salient or goal-relevant visual objects (Itti & Koch, 2001). 
Acknowledgments
We thank R. Sternschein for assistance with data collection. This research was supported by the Wellcome Trust and the National Institute for Health Research Clinical Biomedical Centre at University College London Hospitals/University College London. 
Commercial relationships: none. 
Corresponding author: Dr. Paul Bays. 
Email: p.bays@ion.ucl.ac.uk. 
Address: UCL Institute of Cognitive Neuroscience, 17 Queen Square, London WC1N 3AR, UK. 
References
Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is set both by visual information load and by number of objects. Psychological Science: A Journal of the American Psychological Society, 15, 106–111.
Awh, E., Barton, B., & Vogel, E. K. (2007). Visual working memory represents a fixed number of items regardless of complexity. Psychological Science: A Journal of the American Psychological Society, 18, 622–628.
Awh, E., & Jonides, J. (2001). Overlapping mechanisms of attention and spatial working memory. Trends in Cognitive Sciences, 5, 119–126.
Bays, P. M., & Husain, M. (2008). Dynamic shifts of limited working memory resources in human vision. Science, 321, 851–854.
Bays, P. M., & Husain, M. (2009). Response to comment on “Dynamic shifts of limited working memory resources in human vision”. Science, 323, 877.
Bundesen, C. (1998). A computational theory of visual attention. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 353, 1271–1281.
Bundesen, C., & Habekost, T. (2008). Principles of visual attention: Linking mind and brain. Oxford: Oxford University Press.
Cowan, N. (2005). Working memory capacity. New York: Psychology Press.
Cowan, N., & Rouder, J. N. (2009). Comment on “Dynamic shifts of limited working memory resources in human vision”. Science, 323, 877.
D'Esposito, M., & Postle, B. R. (1999). The dependence of span and delayed-response performance on prefrontal cortex. Neuropsychologia, 37, 1303–1315.
Dayan, P., & Abbott, L. F. (2001). Theoretical neuroscience: Computational and mathematical modeling of neural systems. Cambridge, MA: MIT Press.
de Fockert, J. W., Rees, G., Frith, C. D., & Lavie, N. (2001). The role of working memory in visual selective attention. Science, 291, 1803–1806.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Fisher, N. I. (1993). Statistical analysis of circular data. Cambridge, UK: Cambridge University Press.
Henderson, J. M. (2008). Eye movements and visual memory. In Visual Memory (pp. 87–121). Oxford: Oxford University Press.
Irwin, D. E. (1991). Information integration across saccadic eye movements. Cognitive Psychology, 23, 420–456.
Itti, L., & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews Neuroscience, 2, 194–204.
Lepsien, J., & Nobre, A. C. (2007). Attentional modulation of object representations in working memory. Cerebral Cortex, 17, 2072–2083.
Logie, R. H., & Della Sala, S. (2005). Disorders of visuospatial working memory. In The Cambridge Handbook of Visuospatial Thinking (pp. 81–120). Cambridge, UK: Cambridge University Press.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279–281.
Ma, W. J., & Huang, W. (2009). Unlimited-capacity, metabolically constrained visual memory in multiple-object tracking. Frontiers in Systems Neuroscience. Conference Abstract: Computational and systems neuroscience, doi:10.3389/conf.neuro.06.2009.03.107.
Mannan, S. K., Mort, D. J., Hodgson, T. L., Driver, J., Kennard, C., & Husain, M. (2005). Revisiting previously searched locations in visual neglect: Role of right parietal and frontal lesions in misjudging old locations as new. Journal of Cognitive Neuroscience, 17, 340–354.
Müller, N. G., & Knight, R. T. (2006). The functional neuroanatomy of working memory: Contributions of human brain lesion studies. Neuroscience, 139, 51–58.
Nelder, J. A., & Mead, R. (1965). A simplex method for function minimization. Computer Journal, 7, 308–313.
O'Regan, J. K. (2001). Change blindness. In Encyclopedia of cognitive science. New York: Nature Publishing Group.
Pashler, H. (1988). Familiarity and visual change detection. Perception & Psychophysics, 44, 369–378.
Robertson, L. C. (2003). Binding, spatial attention and perceptual awareness. Nature Reviews Neuroscience, 4, 93–102.
Seung, H., & Sompolinsky, H. (1993). Simple models for reading neuronal population codes. Proceedings of the National Academy of Sciences of the United States of America, 90, 10749–10753.
Simons, D. J., & Rensink, R. A. (2005). Change blindness: Past, present, and future. Trends in Cognitive Sciences, 9, 16–20.
Soto, D., & Humphreys, G. W. (2006). Seeing the content of the mind: Enhanced awareness through working memory in patients with visual extinction. Proceedings of the National Academy of Sciences of the United States of America, 103, 4789–4792.
Treisman, A. (1998). Feature binding, attention and object perception. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 353, 1295–1306.
Treisman, A., & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14, 107–141.
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions and objects in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 27, 92–114.
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2006). The time course of consolidation in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 32, 1436–1451.
Vogels, R. (1990). Population coding of stimulus orientation by striate cortical cells. Biological Cybernetics, 64, 25–31.
Wheeler, M. E., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology: General, 131, 48–64.
Wilken, P., & Ma, W. J. (2004). A detection theory account of change detection. Journal of Vision, 4(12):11, 1120–1135, http://journalofvision.org/4/12/11/, doi:10.1167/4.12.11.
Wolfe, J. M., & Cave, K. R. (1999). The psychophysical evidence for a binding problem in human vision. Neuron, 24, 11–17.
Woodman, G. F., & Vogel, E. K. (2005). Fractionating working memory: Consolidation and maintenance are independent processes. Psychological Science, 16, 106–113.
Xu, Y., & Chun, M. M. (2005). Dissociable neural mechanisms supporting visual short-term memory for objects. Nature, 440, 91–95.
Zhang, W., & Luck, S. J. (2008). Discrete fixed-resolution representations in visual working memory. Nature, 453, 233–235.
Zhang, W., & Luck, S. J. (2009). Sudden death and gradual decay in visual working memory. Psychological Science, 20, 423–428.
Figure 1
 
Precision of visual working memory in a color report task. (a) Subjects were briefly presented with a sample array of 1–6 colored squares; exposure duration was varied across trials (100–2000 ms). After a blank period (900 ms), a test array was presented in which the location of a randomly selected sample item was highlighted. Subjects reported the remembered color corresponding to the highlighted location by clicking on a color wheel. (b) Precision as a function of the number of items in the sample array (N). Precision is defined as the reciprocal of the standard deviation of the error in subjects' responses: zero indicates chance performance. Error bars indicate SEM. The blue line indicates the best fit to the data of a power law relating precision to the fraction of resources available per item (1/N). (c) Three models for the distribution of responses on the color report task, illustrated for a single trial with a sample array of two items (one red, one green) and a test array that cues the location of the red item. Variability in memory for color alone would predict a Gaussian distribution of responses centered on the actual color at the target location (top). In the model proposed by Zhang and Luck (2008) (middle), a proportion of responses instead come from a uniform distribution in which colors are chosen at random (shown in green). Alternatively (bottom), variability in memory for location may cause subjects to mistake which item was at the target location on some trials, in which case a proportion of responses (shown in green) will come from a Gaussian centered on the non-target color.
Figure 2
 
Distribution of errors relative to target and non-target colors. (a) Frequency of response as a function of the difference between reported color value and target color value, for varying numbers of items (N) in the sample array. The long tails of the distribution observed for larger sample sizes (e.g., six items, far right) are inconsistent with the simple Gaussian model shown in Figure 1c, top, but are consistent with either of the other models shown in Figure 1c. Colored lines indicate the response probabilities predicted by a mixture model combining color error, location error, and random components (see main text and Figure 3). (b) Frequency of responses as a function of the difference between reported color value and each non-target color value. The strong central tendency observed for larger numbers of items (N) is not predicted by Zhang and Luck's (2008) model (Figure 1c, middle) but is consistent with errors in memory for location, as illustrated in (Figure 1c, bottom). Colored line indicates the prediction of the three-component model. Error bars indicate SEM.
Figure 3
 
Three sources of error in the report task and the effect of sample duration. (a–c) Subject responses on the memory task were decomposed into three separate components, indicated by the shaded regions: (a) a Gaussian distribution with standard deviation σ centered on the target color value (T), corresponding to error in memory for color; (b) Gaussian distributions with the same width centered on each non-target color value (NT), corresponding to errors in memory for location; (c) a uniform distribution, capturing random responses unrelated to any of the sample colors. (d–f) Maximum likelihood parameters of the three-component model, as a function of number of items in the sample array (mean across sample durations). (d) The standard deviation (σ) increases with array size, indicating increasing variability in memory for color; (e) the proportion of responses corresponding to non-targets increases with array size, indicating increasing variability in memory for location; (f) the proportion of random responses is shown in black; the gray dashed lines indicate the proportions of random responses expected for a fixed upper limit of 2, 3, or 4 items. (g–i) Effect of sample duration on each parameter of the model: light gray symbols and dotted line, 100 ms; dark gray symbols and dashed line, 500 ms; black symbols and solid line, 2 s. Sample duration does not affect variability in memory for color (g) or location (h) but has a substantial effect on the frequency of random responses (i). Error bars in this figure indicate within-subject SEM, as in Zhang and Luck (2008).