Journal of Vision | July 2019 | Volume 19, Issue 7 | Open Access Article
Differential trajectories of memory quality and guessing across sequential reports from working memory
Author Affiliations
  • Benjamin Peters
    Institute of Medical Psychology, Goethe University, Frankfurt am Main, Germany
    Zuckerman Institute, Columbia University, New York, USA
    peters@columbia.edu
  • Benjamin Rahm
    Medical Psychology and Medical Sociology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
    benjamin.rahm@mps.uni-freiburg.de
  • Jochen Kaiser
    Institute of Medical Psychology, Goethe University, Frankfurt am Main, Germany
    j.kaiser@med.uni-frankfurt.de
  • Christoph Bledowski
    Institute of Medical Psychology, Goethe University, Frankfurt am Main, Germany
    bledowski@em.uni-frankfurt.de
Journal of Vision, July 2019, Vol. 19(7), Article 3. doi:10.1167/19.7.3
Abstract

Working memory enables the storage of a small number of items for a short period of time. Previous research has shown that items in working memory cannot be accessed equally well, indicating that they are held in at least two different states with different capacity limitations. However, it is unclear whether differences between states are due to limits on the number of items that can be stored or on the quality with which they are stored. We employed a sequential whole-report procedure in which participants reported the remembered orientation of each of two or four encoded Gabor patches. In addition, they rated their memory confidence prior to each report. Participants performed 600 trials per condition, allowing us to obtain reliable subjective ratings and estimates of precision, guessing, and misreport using a mixture model, separately for each sequential report. Different measures of memory quality consistently showed discontinuous trajectories across reports, with a steep drop from the first to the second remembered item but only slight decreases thereafter. In contrast, both reported and modeled guessing changed continuously across reports. Our results support the notion of two states in working memory and show that they are distinguished by memory quality rather than quantity.

Introduction
Accessing relevant information even when it is no longer physically present is an essential component of goal-directed behavior. Working memory supports this ability by storing information over short periods of time and allowing its flexible access. However, the capacity of working memory is highly limited. Various accounts have been proposed to characterize the nature of this limitation. Some researchers have attributed it to a tightly limited number of discrete "slots" for storage (Zhang & Luck, 2008), causing items to be forgotten when more items are presented than there are slots. Others have proposed that the capacity of working memory is best conceptualized as a continuous resource shared between all presented items (Bays & Husain, 2008). Consequently, when more items are stored, less resource is available per item, resulting in poorer memory quality. 
Studies on capacity limitation typically assume that all items in working memory are held in the same single state. However, the general architecture of working memory is often conceived as comprising at least two different states (Cowan, 1988; McElree, 2006; Oberauer, 2002; Unsworth & Engle, 2007), with the term "states" typically connoting the accessibility of an item for ongoing cognitive operations. The first state, termed primary memory or the focus of attention (FoA), was hypothesized to be privileged and active, as items in this state can be accessed directly. The second state, secondary memory or activated long-term memory (LTM), was considered passive, as accessing its items requires an active retrieval operation that brings them into the privileged state. 
Support for this dual-state architecture of working memory came mainly from behavioral studies that have reported different reaction times for items that were designed to be held in different states (McElree, 2006; Oberauer, 2002) and by findings of separable brain regions involved in accessing items in different states (Bledowski, Rahm, & Rowe, 2009; Nee & Jonides, 2014). While states were typically assessed via item accessibility, little is known about the memory quality with which items are held in these states. 
Recently, we provided evidence that items in working memory are reported from two qualitatively different states of memory quality: from a first, high-quality state, or from a second, lower-quality state (Peters et al., 2018). Specifically, in contrast to a typical working memory task where multiple items are encoded but only a single item is probed, we applied a procedure in which all encoded items had to be reported sequentially. We estimated memory quality by calculating "response precision" as the inverse width of the response error distribution. In four experiments we consistently found that response precision dropped steeply after the first report but decreased only gradually thereafter. Within the framework of dual-state models, we suggested that this discontinuous trajectory of response precision across reports indicates that the first item was reported from a different state than all subsequently reported items.1 The latter were cumulatively affected by interference elicited either by the item reports or by an additional distracting task. We tested this two-state architecture formally by comparing whether the trajectory of response precision across reports was better described by (a) a discontinuous model assuming a qualitative change in the representational states of items after the first report or by (b) a continuous (exponential) model assuming a single representational state that cumulatively degrades in precision by a certain percentage with every interference caused by an item report. We found that an exponential function reflecting cumulative degradation across reports (the continuous model) was not sufficient to describe the steep drop in response precision from the first to the second report. In contrast, the steep drop was better captured by a discontinuous model that used an additional parameter to allow the precision of the first report to deviate from the exponential trajectory. 
The notion of two qualitative states was further supported by the observation that the first and the later reports were differentially sensitive to experimental manipulations: Only the precision of the first report was affected by retro-cuing before reports (experiment 2 of Peters et al., 2018) or by interference (a simple perceptual task) during the retention period before the first report (experiments 3 and 4 of Peters et al., 2018). Recently, Adam, Vogel, and Awh (2017) also applied a sequential whole-report procedure to assess memory precision as a function of report order. When the sequence in which items had to be reported was randomized, as in our study, they also found that memory precision dropped more steeply from the first to the second report than across further reports. Together, consistent findings across multiple experiments have indicated that prior to the first report, or prior to interference, items were held in a state of high resolution. After the first report, or after interference, items fell into a second state of lower resolution. 
The notion of two states in working memory that store items with different quality seems a plausible interpretation of these findings. It is, however, possible that the observed decline in response precision for the later reports was due either to limited storage quality or to a limit on the number of items that can be transferred to or stored in the presumed second state. Specifically, after the first report, or after interference, only a subset of the presented items may still exist and be accessible. Hence, forcing participants to report all presented items could lead to an increased guessing rate, resulting in an overall decline in precision for reports from the second state. 
Mixture models are commonly applied to disentangle the different sources that contribute to response precision. In such a model, the response error distribution is reproduced by several components. The first component is a Gaussian distribution centered on the correct feature value of the reported item. The concentration of this distribution (model parameter κ) reflects the quality of the memory representation of this item. Although this component is often called "precision," here we use the term "modeled precision" to avoid confusion with the above-mentioned "response precision," which is calculated as the inverse width (SD) of the response error distribution. A second component is a uniform distribution that corresponds to the proportion of trials on which participants respond at random (modeled guessing). Bays et al. (2009) proposed a third component that reflects responses based on erroneous recall of nontarget features, i.e., features of an item that formed part of the memory display but was not the one probed for recall (modeled misreport). Although conceptually compelling and widely used, mixture modeling has recently been criticized. First, a valid separation of modeled precision and guessing might not be possible in conditions where participants perform poorly (Lawrence, 2010). Second, response errors may not represent a linear measure; Schurgin, Wixted, and Brady (2018) have suggested that they are instead scaled to reflect "psychological distance." Consequently, frequent high-deviation responses in an experimental condition would not be conceived as stemming from a high guessing rate, but rather as indicating a low d′. Another way to dissociate low-quality memory from guessing, however, is to obtain memory confidence ratings for each report. 
Subjective ratings correlate with response precision, modeled precision, and guessing, indicating that people have reasonable metacognitive knowledge about their memory (Adam et al., 2017; Rademaker, Tredway, & Tong, 2012; van den Berg, van den Berg, & Ma, 2017). Confidence ratings and reported feature values thus provide separate estimates of both memory quality and guessing across the report sequences. 
To test whether the response precision difference between the two states in working memory was attributable to memory quality or to the number of items that can be stored, the present study used both objective and subjective measures as well as their combination. Specifically, as in our previous study we employed the whole-report procedure by asking participants to successively report each of two or four encoded Gabor patches from working memory. In addition, we also incorporated subjective confidence ratings of memory quality just before each report of the whole-report procedure. Participants performed 600 trials per condition on four days to obtain reliable estimates of subjective ratings of memory quality as well as modeled precision, guessing, and misreport using the mixture model, separately for each sequential report. We hypothesized that if the two states in working memory are distinguished by memory quality, this should be reflected by a difference in modeled precision and in the proportion of high confidence ratings. In contrast, if the second state is characterized by a limitation of the number of stored items, both states should differ mainly in the level of self-reported and modeled guessing and/or the number of modeled misreports. 
Methods
Participants
Ten students (five women, five men; average age 21.7 years) of Goethe University Frankfurt or Fresenius University of Applied Sciences Frankfurt with normal or corrected-to-normal vision participated in the study. Participants received financial compensation (€ 80 for approximately 8 hr) and gave written informed consent. This research was approved by the ethics committee of the Goethe University Medical Faculty. 
Procedure
The experimental paradigm, including the whole-report procedure, was comparable to our previous study (Peters et al., 2018). At the beginning of each trial a memory array of two or four randomly oriented Gabor patches was presented for 400 ms (Figure 1a). Orientations differed by at least 5° between stimuli. Gabor patches had a Gaussian envelope of 0.8° and a wavelength of 0.8°/cycle. They could appear at eight possible locations placed equidistantly on an invisible circle with a radius of 8.2° around the center of the screen. After a retention interval of 1,000 ms, a white circle with a radius of 2° centered on the location of one of the items was highlighted. Participants were first asked to report their confidence about their memory at that location on a 4-point scale from 0 (forgotten, i.e., no memory at all) to 3 (best possible memory) by keyboard button press. Then, a randomly oriented Gabor appeared at that location, and participants adjusted the orientation of the probe to match the remembered orientation as closely as possible by horizontal mouse movements. The final adjustment was confirmed with a mouse click. This procedure was repeated until all memorized items, i.e., two or four depending on the set size, were reported. The sequence of reports was randomized, and every item was reported only once. No response time limit was imposed. After the last report, participants received feedback concurrently for all reports, in the form of small colored discs presented centrally on an invisible circle with a radius of 0.8° around the fixation cross (0.66°), at locations that spatially corresponded to the locations of the probed orientations in the current trial. Colors of the discs ranged from green to red, with green indicating a perfect report and red indicating that the adjusted probe orientation was perpendicular to the presented Gabor. Feedback was presented for 500 ms. Trials were separated by a blank intertrial interval of 1,000 ms. 
Participants were encouraged to move their eyes during the intertrial interval only. 
Figure 1
 
Paradigm and results of error distribution and response precision. (a) Participants memorized two or four orientations of Gabor gratings and after a delay were asked to rate their confidence about the memory quality and to report the memorized orientation for each of them in random order (only set size four displayed). (b) Distribution of response errors for each set size and report position and (c) the corresponding response precision. Error bars indicate standard error of the mean across participants. Asterisk denotes an advantage of a discontinuous over a continuous model for the decline trajectory across reports.
The experiment was programmed in Presentation (Version 14.9, Neurobehavioral Systems, Inc., Berkeley, CA) and presented via an LCD monitor with a 60-Hz refresh rate. Participants conducted 30 practice trials and subsequently completed 10 blocks of 30 trials in each of four sessions on separate days. This resulted in 600 trials per set size. 
It is important to note that, in contrast to the set size four condition, the set size two condition did not directly contribute to our study question, i.e., whether the states in working memory are distinguished by memory quality or by the number of items that can be stored. However, it fulfilled two important functions: First, it facilitated relating the current results to our previous study, which had used set sizes two, four, six, and eight. Second, it allowed comparing the strength of the discontinuity to the set size effect. Whereas set sizes six or eight would have been preferable in terms of data modeling, set size two was the only additional condition for which recording a high number of trials (600 per set size condition) was practicable. 
Data analysis
For each trial, we computed response errors as the angular difference between the presented orientation and the response given by the participant. To compare the results with our previous study (Peters et al., 2018), we first computed response precision as the inverse of the circular standard deviation of the response error distribution (corrected for the estimation bias by subtracting the expectation value under uniform response errors to yield a response precision of zero for uniformly distributed response errors; see Bays et al., 2009). Response precision was calculated separately for each participant and condition. Moreover, we computed precision for each of four levels of reported confidence. Response precision was only computed if at least ten responses were given (leading to an exclusion of three subjects for the analysis of the response precision at the first report position of set size four associated with the lowest confidence rating). 
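The bias-corrected response precision measure can be sketched as follows. This is a minimal Python illustration, not the authors' code; the function names are ours, and the expected precision under uniform response errors is estimated here by Monte Carlo simulation rather than analytically:

```python
import numpy as np

def circular_sd(errors):
    """Circular standard deviation of angular errors (in radians)."""
    R = np.abs(np.mean(np.exp(1j * np.asarray(errors))))
    return np.sqrt(-2 * np.log(R))

def response_precision(errors, n_boot=10000, rng=None):
    """Inverse circular SD of the response errors, bias-corrected by
    subtracting the expected value under uniformly distributed errors
    (cf. Bays et al., 2009), so pure guessing yields a precision near zero."""
    rng = np.random.default_rng(rng)
    errors = np.asarray(errors)
    raw = 1.0 / circular_sd(errors)
    # simulate same-sized samples of purely uniform response errors
    u = rng.uniform(-np.pi, np.pi, size=(n_boot, errors.size))
    R = np.abs(np.mean(np.exp(1j * u), axis=1))
    bias = np.mean(1.0 / np.sqrt(-2 * np.log(R)))
    return raw - bias
```

With this correction, a condition in which participants respond entirely at random yields a precision close to zero rather than a small positive value.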
Confidence ratings were analyzed in two ways. First, we averaged the confidence ratings for each report position, set size, and participant. Subsequently, we analyzed separately the proportion of high confidence ratings (rating = 3) and the proportion of ratings that indicated forgetting (rating = 0). 
To assess the impact of sequential report position, we conducted two repeated-measures ANOVAs with report position as a single factor, separately for set size two and set size four, with reports 1–2 and reports 1–4 as levels, respectively. In addition, to assess the impact of set size, and its interaction with report position, we computed a 2 × 2 repeated-measures ANOVA that included the factor report position with the first two reports as its levels and the factor set size with the levels two and four. Whenever assumptions of sphericity were violated as indicated by a significant Mauchly test statistic (p < 0.05), Greenhouse-Geisser-corrected degrees of freedom are reported. 
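The repeated-measures ANOVA described above can be sketched as follows. This is a simplified Python illustration under our own assumptions: it always applies the Greenhouse-Geisser correction (estimated from the double-centered covariance matrix), whereas the original analysis applied it only after a significant Mauchly test:

```python
import numpy as np
from scipy import stats

def rm_anova(Y):
    """One-way repeated-measures ANOVA with Greenhouse-Geisser correction.
    Y: (n_subjects, k_levels) array, e.g. precision per report position."""
    n, k = Y.shape
    grand = Y.mean()
    ss_cond = n * ((Y.mean(axis=0) - grand) ** 2).sum()   # effect of interest
    ss_subj = k * ((Y.mean(axis=1) - grand) ** 2).sum()   # subject variability
    ss_err = ((Y - grand) ** 2).sum() - ss_cond - ss_subj
    df_c, df_e = k - 1, (n - 1) * (k - 1)
    F = (ss_cond / df_c) / (ss_err / df_e)
    # Greenhouse-Geisser epsilon from the double-centered covariance matrix
    S = np.cov(Y, rowvar=False)
    C = np.eye(k) - np.ones((k, k)) / k
    Sc = C @ S @ C
    eps = np.trace(Sc) ** 2 / (df_c * (Sc ** 2).sum())
    p_gg = stats.f.sf(F, eps * df_c, eps * df_e)          # corrected p value
    return F, eps, p_gg
```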
Mixture modeling
To quantify the contribution of different sources of error to overall response precision estimates, we modeled the response error distributions (scaled to the range from −π to π) with a mixture model comprising three components (Bays et al., 2009): (a) reporting the cued item with probability ptarget from a circular normal distribution centered on the original orientation and with precision κ, where κ directly reflects the quality with which an item is available for report (modeled precision); (b) guessing with probability pguessing in case the target item cannot be retrieved (modeled guessing); or (c) incorrectly reporting another, uncued item with probability pnonTarget (modeled misreport). In the case of guessing, the response error is sampled from a uniform distribution on the support −π to π. Estimates of ptarget, pnonTarget, and κ were obtained as the parameters that minimized the negative log-likelihood of the response errors under the model. To enforce the constraint ptarget + pnonTarget + pguessing = 1, we performed parameter optimization on the transformed parameters pmemory = ptarget + pnonTarget and ptargetsOfMemory = ptarget/(ptarget + pnonTarget). After optimization, these parameters were transformed back to ptarget, pnonTarget, and pguessing. Optimization was performed with the Nelder-Mead method as implemented in the function fminsearch in MATLAB (MathWorks, Natick, MA). We used fminsearchbnd to constrain pmemory, ptargetsOfMemory, and κ to a lower limit of 0 and to upper limits of 1, 1, and 100, respectively. Parameters were estimated independently for each set size, report position, and participant. 
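A sketch of this mixture model and its bounded maximum-likelihood fit is shown below. This is an illustrative reimplementation, not the authors' MATLAB code: the circular normal is taken to be a von Mises distribution, SciPy's bounded L-BFGS-B is used in place of fminsearchbnd with Nelder-Mead, and all function names are ours:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import vonmises

def neg_log_lik(theta, err_target, err_nontargets):
    """Negative log-likelihood of the three-component mixture.
    theta = (p_memory, p_targetOfMemory, kappa), the transformed
    parametrization described in the text.
    err_target: (n,) response errors relative to the probed item.
    err_nontargets: (n, m) response errors relative to each unprobed item."""
    p_mem, p_tgt, kappa = theta
    p_target = p_mem * p_tgt              # report of the probed item
    p_nontarget = p_mem * (1 - p_tgt)     # misreport of an unprobed item
    p_guess = 1 - p_mem                   # uniform guessing
    like = (p_target * vonmises.pdf(err_target, kappa)
            + p_guess / (2 * np.pi)
            + p_nontarget * vonmises.pdf(err_nontargets, kappa).mean(axis=1))
    return -np.sum(np.log(like + 1e-300))

def fit_mixture(err_target, err_nontargets):
    """Bounded ML fit; a few kappa restarts guard against local minima."""
    best = None
    for k0 in (2.0, 10.0, 30.0):
        res = minimize(neg_log_lik, x0=[0.8, 0.9, k0],
                       args=(err_target, err_nontargets),
                       method="L-BFGS-B",
                       bounds=[(1e-4, 1.0), (1e-4, 1.0), (1e-2, 100.0)])
        if best is None or res.fun < best.fun:
            best = res
    p_mem, p_tgt, kappa = best.x
    return {"p_target": p_mem * p_tgt,
            "p_nonTarget": p_mem * (1 - p_tgt),
            "p_guessing": 1 - p_mem,
            "kappa": kappa}
```

Because the generative and fitted models match, the estimates recover the true mixture weights well given the roughly 600 trials per condition collected here.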
Continuous versus discontinuous model comparison
To replicate the findings of our previous study, we tested whether the trajectory of a certain dependent measure (i.e., response precision, mean rating, modeled precision, modeled guessing, modeled misreport, proportion of “high confidence” ratings, and “forgotten” ratings) across the report sequence in the set size four condition was more compatible with a one-state model or a two-state model by performing a formal model comparison between a continuous model and a discontinuous model. Specifically, reports from a single state should entail a constant rate of degradation across reports resulting in an exponential (continuous) decrease of this dependent measure across the report sequence. The dependent measure y of a report at position x by participant i was modeled as  
\begin{equation}y_i(x) = a_i e^{-b_i(x-1)} + \epsilon_{ix},\end{equation}
where \(b_i \sim N(b, \sigma_b)\), the slope parameter, and \(a_i \sim N(a, \sigma_a)\), the initial value at the first report, were random-effects parameters, and \(\epsilon_{ix} \sim N(0, \sigma)\) was the error term. Positive values of the slope parameter b indicated a gradual decline of the dependent measure across the report sequence. For the two-state model, we fit a discontinuous model that allowed for a deviation from the exponential trajectory at the first report position. Here, the dependent measure y of participant i at report position x was modeled as  
\begin{equation}y_i(x) = a_i e^{-b_i(x-1)} + c_i\,{\bf 1}_{\{1\}}(x) + \epsilon_{ix},\end{equation}
with \({\bf 1}_{\{1\}}(x)\) being the indicator function; hence, \({\bf 1}_{\{1\}}(1) = 1\) and \({\bf 1}_{\{1\}}(x) = 0\) for all x ≠ 1. The random-effects parameter \(c_i \sim N(c, \sigma_c)\) therefore captured the deviation of the dependent measure at the first report position from the exponential fit (i.e., the "first report benefit").  
To test whether response precision, mean rating, modeled precision, modeled guessing, modeled misreport, and the proportions of "high confidence" and "forgotten" ratings changed continuously or discontinuously, we fitted the continuous and discontinuous models to these dependent measures as described above. However, for the model comparison involving modeled guessing, modeled misreport, and the proportion of "forgotten" ratings, we first subtracted these parameters from 1, because successive deterioration across reports is reflected by an increase in these parameters, whereas the exponential term can only capture a decrease across reports. 
We compared the fit of the exponential model with the fit of the discontinuous model using their respective (marginal) Akaike information criteria (AIC). Because AIC is a relative fit index, we reported the difference ΔAIC, computed as the AIC of the discontinuous model minus the AIC of the exponential model; thus, negative ΔAIC values denoted evidence for the discontinuous model. Furthermore, because the exponential model was nested within the discontinuous model, the likelihood ratio of the two models was asymptotically χ2(df)-distributed, with the number of constrained parameters as degrees of freedom df. We therefore tested whether relaxing the model constraints by introducing a new parameter (the first report benefit parameter of the discontinuous model) significantly increased the model likelihood by performing a likelihood ratio test using the χ2 statistic. 
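The logic of this comparison can be illustrated with a deliberately simplified fixed-effects sketch (our own construction, not the mixed-effects analysis of the paper): Gaussian errors, no per-participant random effects, so only the single parameter c is constrained and the likelihood-ratio test here uses df = 1 rather than the df = 2 of the mixed model:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def fit_trajectory(y, x, discontinuous):
    """Gaussian ML fit of y(x) = a*exp(-b*(x-1)) [+ c if x == 1].
    Returns (negative log-likelihood, AIC)."""
    def nll(theta):
        pred = theta[0] * np.exp(-theta[1] * (x - 1))
        if discontinuous:
            pred = pred + theta[2] * (x == 1)   # first report benefit
        sigma2 = np.mean((y - pred) ** 2)       # ML estimate of error variance
        return 0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1)
    x0 = [1.0, 0.3, 0.1] if discontinuous else [1.0, 0.3]
    res = minimize(nll, x0, method="Nelder-Mead")
    n_par = len(x0) + 1                         # + sigma
    return res.fun, 2 * n_par + 2 * res.fun

def compare_models(y, x):
    """Negative dAIC and a significant LR test favor the discontinuous model."""
    nll_c, aic_c = fit_trajectory(y, x, discontinuous=False)
    nll_d, aic_d = fit_trajectory(y, x, discontinuous=True)
    lr = 2 * (nll_c - nll_d)
    return aic_d - aic_c, lr, chi2.sf(lr, df=1)
```

Applied to a trajectory with a genuine first-report benefit, this yields a negative ΔAIC and a significant χ2 statistic, mirroring the pattern reported in the Results.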
Linearity assumption of dependent measures
The suggested model comparison assumed that "true" memory quality mapped linearly onto our dependent measures (i.e., modeled precision, response precision) and, hence, that a discontinuous decline in a dependent measure reflected a discontinuous decline in "true" memory. However, a linear decline in "true" memory quality across reports could appear discontinuous in the dependent measure as a trivial consequence of a monotonic, but nonlinear, mapping. For example, this would be the case if changes in dependent measures were not linear but proportional, so that the same degree of degradation of "true" memory quality would lead to different degrees of change in observed and modeled dependent measures. If such a monotonic nonlinear mapping "contracted" the space of dependent measures at lower levels of memory quality, it would affect report positions 2 to 4 in the set size four condition more strongly than report position 1. In that case, measures of memory quality at levels below those of the set size four condition should be affected at least as strongly, so that at those lower levels it should not be possible to detect a discontinuity. However, this is not what we observed in our previous work (see figure 1 in Peters et al., 2018); rather, we found a discontinuous trajectory in response precision at set size six and even at set size eight, i.e., at clearly lower levels of response precision than in the current set size four condition. These findings are hard to reconcile with the assumption of a strongly nonlinear mapping between "true" quality and response precision, indicating that the observed discontinuities in memory quality across reports were meaningful with respect to "true" memory precision. 
Hence, our model comparisons in the set size four condition in the present study were justified and able to detect nontrivial discontinuities in memory quality across reports. 
Comparison of the first report benefit across dependent measures
To test whether two dependent measures (e.g., modeled precision and modeled guessing) differed in their first report benefit (i.e., their degree of discontinuity), we fit both dependent measures in a single model with either identical or differing first report benefits (i.e., c parameters). To be able to compare the c parameter across dependent measures, we reparametrized the discontinuous model. In particular, we modeled the dependent measure y of a report at position x by participant i as  
\begin{equation}y_i(x) = a_i e^{-b_i(x-1)} + c_i\, a_i\left(1 - e^{-b_i}\right){\bf 1}_{\{1\}}(x) + \epsilon_{ix},\end{equation}
where the first report benefit c is now expressed in multiples of the exponential decrease from the first to the second report position [note: \(a_i e^{-b_i(1-1)} - a_i e^{-b_i(2-1)} = a_i(1 - e^{-b_i})\)]. 
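The bracketed identity can be verified numerically; the following check is purely illustrative and not part of the original analysis:

```python
import numpy as np

def exp_drop(a, b):
    """Drop of a*exp(-b*(x-1)) from report position x = 1 to x = 2."""
    return a * np.exp(-b * (1 - 1)) - a * np.exp(-b * (2 - 1))

# the drop equals a*(1 - e^{-b}) for any a and b
rng = np.random.default_rng(0)
a, b = rng.uniform(0.1, 2.0, size=2)
assert np.isclose(exp_drop(a, b), a * (1 - np.exp(-b)))
```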
We then fit the two measures simultaneously in two different models. In the c-different model, all parameters (\(a, b, c, \sigma_a, \sigma_b, \sigma_c, \sigma\)) were estimated independently for each dependent measure. In the c-same model, the first report benefit \(c\) and its standard deviation \(\sigma_c\) were set to be equal for the two dependent measures. 
To be able to simultaneously estimate independent error variances for the two measures, we performed parameter estimation via Stan (Stan Development Team, 2018; http://mc-stan.org) using the No-U-Turn Markov Chain Monte Carlo algorithm (NUTS MCMC) to sample from the stationary posterior distribution (six independent chains, 48,000 samples, 2,000 samples burn-in, all parameter \(\hat R\) ≤ 1.01). We used weakly informative priors \(N(0, 10)\) on each of the parameters \(a, b, c\), and \(p(\sigma^2) = (1 + \sigma^2)^{-2}\) on each of the parameter standard deviations (\(\sigma_a, \sigma_b, \sigma_c, \sigma\)). We then computed the likelihood ratio of the c-different and c-same models. Instead of comparing only a point estimate of the likelihood (i.e., the maximum), we marginalized over the parameters using bridge sampling (Gronau et al., 2017). The resulting likelihood ratio therefore amounts to the Bayes Factor, with values > 1 indicating a better fit of the c-different model (i.e., evidence for different first report benefits). 
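The prior on the variance parameters and the final Bayes factor step can be sketched as follows (a sketch under our naming; the log marginal likelihoods in the example are hypothetical placeholders, not estimates from the study):

```python
import numpy as np

def sigma2_prior_density(sigma2):
    """Weakly informative prior used for the variance parameters:
    p(sigma^2) = (1 + sigma^2)^(-2)."""
    return (1.0 + sigma2) ** -2.0

def bayes_factor(log_ml_c_different, log_ml_c_same):
    """Bayes factor of the c-different over the c-same model, computed
    from bridge-sampling estimates of the log marginal likelihoods.
    Values > 1 indicate evidence for different first report benefits."""
    return np.exp(log_ml_c_different - log_ml_c_same)

# Hypothetical log marginal likelihoods for illustration only:
bf = bayes_factor(-1000.0, -1001.3)   # ratio on the likelihood scale
```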
Results
Response precision decreases discontinuously across the report sequence
The current results replicated the findings from our previous study (Peters et al., 2018) by showing a discontinuous decrease of response precision, measured as the width of the response error distribution (inverted SD). That is, we observed a steep drop from the first to the second report and only a slight continuous decrease thereafter (Figure 1b, 1c). Response precision significantly decreased across the report sequence in both set size conditions: main effect of report position for set size two, F(1, 9) = 39.5, p = 0.0001, \(\eta_p^2\) = 0.81; for set size four, F(1.1, 9.5) = 26.5, p < 0.0001, \(\eta_p^2\) = 0.74. Importantly, the decrease in response precision was better explained by the discontinuous model than by the exponential model, ΔAIC = −34.43, χ2(2) = 38.43, p < 0.001, first report benefit = 0.245 (see Table 1 for parameter estimates of the winning model), confirming a discontinuity in the trajectory of response precision across report positions. 
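The comparison between the nested exponential and discontinuous models combines a likelihood-ratio test with AIC. A sketch of this logic (the function and the example log-likelihoods are ours; the actual fits were hierarchical); note that with two extra parameters, ΔAIC = 2·2 − χ², consistent with the values reported in this section:

```python
import numpy as np
from scipy import stats

def compare_models(loglik_exp, k_exp, loglik_disc, k_disc):
    """Likelihood-ratio test and AIC difference for nested models:
    the discontinuous model adds parameters on top of the
    exponential model."""
    chi2 = 2.0 * (loglik_disc - loglik_exp)        # LR statistic
    df = k_disc - k_exp
    p = stats.chi2.sf(chi2, df)
    delta_aic = (2 * k_disc - 2 * loglik_disc) - (2 * k_exp - 2 * loglik_exp)
    return chi2, p, delta_aic                      # negative delta_aic favors disc
```

For example, hypothetical log-likelihoods of 0 and 19.215 with 4 vs. 6 parameters reproduce χ²(2) = 38.43 and ΔAIC = −34.43.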
Table 1
 
Parameters of the winning model and information criteria for the exponential (AICexp) and the discontinuous model (AICdisc). Notes: AIC = Akaike Information Criterion for the discontinuous (disc) and exponential (exp) models. * Parameter values correspond to fitting the continuous/discontinuous model to the trajectory of 1 minus the original trajectory; x parameter value effectively zero to machine precision.
Confidence ratings predict response precision
To test whether confidence ratings represent a viable measure of memory quality, we statistically compared the width of the response error distribution between confidence ratings (Figure 2a). Response precision was significantly modulated by confidence ratings: main effect of confidence ratings, F(1.5, 13.1) = 129.7, p < 0.0001, \(\eta_p^2\) = 0.935. Response precision was highest when participants rated their memory quality to be highest and close to zero when they stated they had forgotten the item (mean ± standard error of response precision: rating 3, 1.32 ± 0.09; rating 2, 0.66 ± 0.06; rating 1, 0.26 ± 0.05; rating 0, 0.09 ± 0.02). This confirms that confidence ratings provide a valid measure of memory quality (Rademaker et al., 2012). 
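Response precision as the inverted SD of the response error distribution can be sketched as follows for circular orientation data (the function name, and the mapping of the 180° orientation space onto the full circle, are our assumptions):

```python
import numpy as np

def response_precision(errors_deg, space_deg=180.0):
    """Inverted circular standard deviation of response errors.
    Orientation errors (here assumed to live in a 180-deg space) are
    first mapped onto the full circle, then summarized by the mean
    resultant length R and the circular SD sqrt(-2*ln(R))."""
    theta = np.deg2rad(errors_deg) * (360.0 / space_deg)
    R = np.abs(np.mean(np.exp(1j * theta)))
    return 1.0 / np.sqrt(-2.0 * np.log(R))
```

Tightly clustered errors yield a high precision value; broad error distributions yield values near zero, mirroring the pattern across confidence ratings reported above.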
Figure 2
 
Confidence rating results. (a) Distribution of response errors associated with each rating collapsed across set sizes. Inset barplot denotes the corresponding response precision. (b) Mean rating for each report position and set size. Error bars indicate standard error of the mean across participants. Asterisk denotes an advantage of a discontinuous over a continuous model for the decline trajectory across reports.
Confidence decreases discontinuously across the report sequence
In the next step, we evaluated whether the confidence ratings across successive reports also revealed a discontinuous decline as observed for response precision. We therefore averaged the confidence ratings for each report position and set size to obtain an estimate of the mean confidence. Figure 2b shows that mean confidence ratings across reports followed a similar trajectory as response precision. Mean confidence dropped steeply from the first to the second report, but decreased only slightly thereafter in the set size four condition: main effect of report position for set size four, F(3, 27) = 42.3, p < 0.0001, \(\eta_p^2\) = 0.82. For set size two, the main effect of report position was not significant, F(1, 9) = 3.8, p = 0.0834, \(\eta_p^2\) = 0.30. Importantly, as for response precision, the model comparison for confidence in the set size four condition favored the discontinuous over the continuous model, ΔAIC = −19.75, χ2(2) = 23.75, p < 0.001, c = 0.177. This result confirmed a discontinuity in the trajectory of subjective mean confidence ratings across report positions. Again, these results support previous findings (Rademaker et al., 2012) that confidence ratings provide a valid measure of the quality of a memory representation. 
Modeled precision but not modeled guessing decreases discontinuously across reports
To investigate the possible sources of the discontinuous decrease of response precision across reports, we modeled the response error distributions with a mixture model that assumes three potential sources of errors: modeled precision, guessing, and misreport. As shown in Figure 3b, modeled precision decreased across the report sequence in both set size conditions: main effect of report position for set size two, F(1, 9) = 26.0, p = 0.0006, \(\eta_p^2\) = 0.74; for set size four, F(1.6, 14.7) = 7.7, p = 0.007, \(\eta_p^2\) = 0.46. Similarly, we observed a significant change in guessing across reports in both set size conditions: main effect of report position for set size two, F(1, 9) = 16.8, p = 0.0027, \(\eta_p^2\) = 0.65; for set size four, F(3, 27) = 16.6, p < 0.0001, \(\eta_p^2\) = 0.65. In contrast, misreport was unaffected by report position in either condition: main effect of report position for set size two, F(1, 9) = 0.5, p = 0.5108, \(\eta_p^2\) = 0.05; for set size four, F(3, 27) = 0.7, p = 0.5519, \(\eta_p^2\) = 0.07. 
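The three-component mixture (in the spirit of Bays et al., 2009) combines a von Mises distribution centered on the target, a uniform guessing component, and von Mises components centered on the non-target items. A minimal sketch of its log-likelihood (interface and names are ours; errors are assumed to be in radians on the full circle; this is not the study's fitting code):

```python
import numpy as np
from scipy.stats import vonmises

def mixture_loglik(errors, nontarget_errors, kappa, p_guess, p_misreport):
    """Log-likelihood of response errors under the three-component
    mixture: target responses (von Mises, concentration kappa),
    uniform guesses, and misreports of non-target items.
    errors: shape (n_trials,); nontarget_errors: (n_trials, n_nontargets)."""
    p_target = 1.0 - p_guess - p_misreport
    target = p_target * vonmises.pdf(errors, kappa)
    guess = p_guess / (2.0 * np.pi)                     # uniform on the circle
    misreport = p_misreport * vonmises.pdf(nontarget_errors, kappa).mean(axis=1)
    return np.sum(np.log(target + guess + misreport))
```

Maximizing this log-likelihood over kappa, p_guess, and p_misreport separately for each report position yields per-position estimates of the three components.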
Figure 3
 
Mixture modeling. (a) The mixture model estimates three parameters that reproduce the distribution of response errors: precision (κ: inverse variance of target-centered Gaussian), guessing, and misreports. (b) Maximum likelihood estimates of the three mixture model components. Error bars indicate standard error of the mean across participants. Asterisk denotes an advantage of a discontinuous over a continuous model for the decline trajectory across reports.
Most importantly, we found that only modeled precision varied discontinuously between report positions. It dropped markedly from the first to the second report but decreased only very slightly thereafter (set size four condition). This discontinuous trajectory was supported by the model comparison that favored the discontinuous over the continuous model, ΔAIC = −11.31, χ2(2) = 15.31, p < 0.001, c = 1.415. In contrast, we found a continuous rather than discontinuous trajectory for both other parameters. For guessing, the additional parameter of the discontinuous model did not yield any significant improvement of the data likelihood over the continuous model, χ2(2) = 2.64, p = 0.268, c = 0.121. Model comparison favored the continuous model (ΔAIC = 1.36), indicating the absence of a discontinuity after the first report position. Similarly, misreport was also better captured by a continuous model, as the data likelihood was not significantly increased by the discontinuous model, ΔAIC = 3.74, χ2(2) = 0.26, p = 0.877, c = 0.015. 
For the sake of completeness, we also report the effect of set size and its interaction with report position on response precision as well as on the three model parameters, as indicated by the corresponding 2 × 2 ANOVA with the factors set size (2 and 4) and report position (1 and 2). Consistent with a large body of working memory literature (Bays et al., 2009; Wilken & Ma, 2004; Zhang & Luck, 2008), we found a significant decrease in response precision from set size two to set size four, F(1, 9) = 107.4, p < 0.0001, \(\eta_p^2\) = 0.92. The set size effect was also clearly present for all three model parameters: modeled precision, F(1, 9) = 62.6, p < 0.0001, \(\eta_p^2\) = 0.87; guessing, F(1, 9) = 22.0, p = 0.0011, \(\eta_p^2\) = 0.71; and misreport, F(1, 9) = 9.0, p = 0.0150, \(\eta_p^2\) = 0.50. We also observed that the drop of response precision and the increase of guessing from the first to the second report were steeper for set size two than for set size four. This was reflected by a significant interaction between report position and set size: response precision, F(1, 9) = 5.5, p = 0.0441, \(\eta_p^2\) = 0.38; guessing, F(1, 9) = 6.2, p = 0.0344, \(\eta_p^2\) = 0.41. There were no such interactions for modeled precision, F(1, 9) = 0.2, p = 0.6584, \(\eta_p^2\) = 0.02, or misreport, F(1, 9) = 1.0, p = 0.3444, \(\eta_p^2\) = 0.10. 
Response precision decreases discontinuously even for high confidence ratings
Mixture modeling suggested that modeled precision rather than guessing was responsible for the difference in memory quality between the two states. As the validity of mixture modeling has recently been questioned (Lawrence, 2010; Schurgin et al., 2018), we aimed to assess the trajectory of response precision across reports by accounting for differences in subjective confidence ratings. Specifically, we tested the trajectory of response precision across reports when subjects were confident that they did remember the item, i.e., when according to their self-report they did not guess. We evaluated response precision across successive reports separately for each level of confidence rating and set size. When participants gave a rating of "1," "2," or "3," indicating low, medium, or high confidence, respectively, we found that response precision decreased significantly across reports in both set size conditions, except for low confidence in the set size two condition; confidence rating of "1": main effect of report position for set size two, F(1, 9) = 5.1, p = 0.0512, \(\eta_p^2\) = 0.36; and for set size four, F(1.4, 12.4) = 3.0, p = 0.097, \(\eta_p^2\) = 0.25; confidence rating of "2": main effect of report position for set size two, F(1, 9) = 49.9, p = 0.0001, \(\eta_p^2\) = 0.85; and set size four, F(1.6, 14.1) = 14.2, p < 0.0001, \(\eta_p^2\) = 0.61; confidence rating of "3": main effect of report position for set size two, F(1, 9) = 22.6, p = 0.0010, \(\eta_p^2\) = 0.72; and for set size four, F(3, 27) = 20.9, p < 0.0001, \(\eta_p^2\) = 0.70. In contrast, we did not find such a decrease in response precision for the confidence rating of "0," corresponding to forgetting (see Figure 4a): main effect of report position for set size four, F(3, 18) = 0.6, p = 0.614, \(\eta_p^2\) = 0.09. 
Note that because participants gave a rating of "0" very rarely in the set size two condition, precluding the computation of response precision, we could not compute the corresponding ANOVA. Again and most importantly, for the set size four condition, we observed a steep drop of response precision from the first to the second report and a slight decrease thereafter only if participants reported low, medium, or high confidence about the memorized orientation, but not if they reported having forgotten the orientation. These observations were supported by the model comparisons that favored the discontinuous over the continuous model for confidence ratings of "1," "2," and "3"; confidence rating of "1": ΔAIC = −10.55, χ2(2) = 14.55, p < 0.001, first report benefit = 0.058; confidence rating of "2": ΔAIC = −9.11, χ2(2) = 13.11, p = 0.001, first report benefit = 0.198; confidence rating of "3": ΔAIC = −3.68, χ2(2) = 7.68, p = 0.021, first report benefit = 0.421. In contrast, there was no significant advantage of the discontinuous over the continuous model for the confidence rating of "0": ΔAIC = −0.05, χ2(2) = 4.04, p = 0.132, first report benefit = 0.043. 
Figure 4
 
Response precision for the two extreme ratings and their proportions. (a) Response precision for every response accompanied with a confidence rating of 3 (indicating the highest memory quality, left) and 0 (indicating forgetting, right), is shown separately for every report position and set size. (b) Proportion of confidence ratings of 3 (left) and 0 (right), separately for every report position and set size. Error bars indicate the standard error of the mean across participants. Asterisk denotes an advantage of a discontinuous over a continuous model for the decline trajectory across reports.
Reported forgetting increases continuously
Mixture modeling suggested that only guessing increased continuously across reports. To exclude the possibility that this resulted from inaccurate estimation of the guessing parameters (Lawrence, 2010), we assessed the proportion with which participants rated an item as forgotten (confidence rating = 0). The proportion of subjective forgetting did not increase significantly across reports for set size two, main effect of report position: F(1, 9) = 5.9, p = 0.04, \(\eta_p^2\) = 0.40, but did increase significantly for set size four, main effect of report position: F(1.4, 12.4) = 17.9, p < 0.001, \(\eta_p^2\) = 0.67 (Figure 4b, right panel). Hence, each report increased the probability of forgetting an item. Importantly, this increase of subjective forgetting in set size four was better captured by a continuous than by a discontinuous model: ΔAIC = 2.60, χ2(2) = 1.40, p = 0.50, c = −0.025. In line with the mixture modeling results, this demonstrates that the increase of items reported as forgotten does not reflect a difference between the states in working memory. In contrast, the proportion of items that were rated with the highest confidence (confidence rating = 3) decreased across the report sequence for set size two, F(1, 9) = 3.2, p = 0.11, \(\eta_p^2\) = 0.26, and for set size four, F(1.1, 9.5) = 14.3, p = 0.004, \(\eta_p^2\) = 0.61 (Figure 4b, left panel). Corroborating the findings of the average confidence ratings and the modeled precision from the mixture model, the decline in high confidence ratings was better captured by a discontinuous than by a continuous model: ΔAIC = −33.20, χ2(2) = 37.20, p < 0.001, first report benefit = 0.134, indicating that states in working memory differ in their memory quality. 
Further evidence for discontinuity in measures of memory quality but not guessing
Our results showed a significant discontinuity across reports for objective and subjective measures of memory quality but not for guessing. To further bolster this finding, we directly compared the degree of discontinuity (i.e., the first report benefit parameter c) between critical measures of memory quality, i.e., the proportion of trials with the highest confidence rating "3," modeled precision as estimated by mixture modeling, and response precision associated with the highest confidence rating "3," and measures of guessing, i.e., the proportion of trials with the lowest confidence rating "0" and modeled guessing, respectively. To this end, we reparametrized the discontinuous model as multiples of the exponential decrease between the first and second report position such that the c parameter can be compared across different measures (see Methods for details). We compared the first report benefits of the trajectories of the proportions of highest and lowest confidence ratings across report positions. We found positive evidence (Bayes Factor, BF = 3.69) for a model with different first report benefits over a model with equal first report benefits, demonstrating that the first report benefit was indeed more pronounced in reports with high ratings, indicating high memory quality, than with low ratings, indicating guessing. For modeled precision and modeled guessing, we also found positive evidence for a model with different first report benefits (BF = 2.46), but the effect was weaker than for the subjective measures of memory quality and guessing. Finally, integrating subjective and objective measures, we compared the trajectory of response precision associated with the highest confidence rating (as a measure of precision that is free from guessing responses) with the proportion of reported forgetting (as a measure of guessing). Again, we found positive evidence for a model with different first report benefits (BF = 3.26). 
Together, these results further support the notion that states in working memory differ in memory quality rather than in the number of items that can be stored. 
Discussion
Using subjective and objective measures incorporated into a whole-report procedure, we showed that participants remembered items at two levels of memory quality depending on the report position. Specifically, the first remembered item was frequently reported with high response precision, and participants were highly confident about their memory quality. In contrast, subsequently reported items were reported with lower response precision and received lower confidence ratings. The steep drop in response precision from the first to the second report was also present when trials in which participants reported having no memory were excluded. Similarly, mixture modeling revealed that in particular the modeled precision parameter, which indicates memory quality, dropped steeply from the first to the second report but decreased only slightly thereafter. We observed this stereotypical trajectory consistently for different measures of memory quality across reports. It was better captured by a model assuming a discontinuous than a continuous decrease, indicating that items were indeed retrieved from two qualitatively distinct states in working memory. In contrast, the proportion with which participants rated an item as forgotten and the guessing parameter from the mixture model changed continuously rather than discontinuously across reports. Additional analyses demonstrated that the first report benefit was stronger for measures of memory quality than for measures of guessing. These results suggest that forgetting took place to the same extent with every report. Together, our results show that states in working memory diverge in memory quality rather than quantity. 
There is an ongoing debate about whether items in working memory can indeed be forgotten such that responses need to be guessed, and whether estimating the proportion of guessing responses via mixture modeling yields meaningful estimates (Lawrence, 2010; Schurgin et al., 2018). In consequence, the way in which the "long tails" of the error distribution are analyzed may significantly impact study results. Here, to address this concern, we incorporated confidence ratings as separate measures of memory quality and forgetting into our experimental procedure. Others have proposed that memory quality is variable (Fougnie, Suchow, & Alvarez, 2012; Sims, Jacobs, & Knill, 2012; van den Berg, Awh, & Ma, 2014; van den Berg, Shin, Chou, George, & Ma, 2012). In consequence, what objectively looks and subjectively feels (as indicated via rating) like a forgotten item may in fact only be a memorized item with very low quality. We cannot be sure whether all responses that were rated as "forgotten" were indeed true guessing responses or based on very low memory quality. However, this matter does not affect our conclusion that items reported from the two states in working memory differ in memory quality and that this difference was not driven by guessing or responses with very low memory quality. Specifically, across different measures we consistently observed that the discontinuous decline in memory quality was enhanced, and not reduced, when our analysis focused on data from the central part of the full error distribution, i.e., on items that were remembered with medium to high memory quality. Thus, the evidence for two levels of memory quality did not depend on responses reflecting the "long tails" of the error distribution or on the way in which they were analyzed. 
Our findings of two states in working memory correspond well to general architecture models that distinguish between at least two states of working memory. Unsworth and Engle (2007), for example, have suggested that one state in working memory, termed primary memory, serves to store few items by means of the continued allocation of attention. Items in primary memory can be directly accessed for processing. When attention is removed from primary memory (for example, when a distracting task is performed), items are maintained in a second state, termed secondary memory. Importantly, to be processed, items from secondary memory have to be retrieved by a cue-dependent search process. This process is error prone, leading to failed or incorrect retrieval. Cowan (1988) proposed a similar architecture model but used a different terminology, distinguishing between a focus of attention and activated long-term memory for the first and second state, respectively. According to both models, differences between states in working memory are due to sustained attention necessary for storage of items in the first state and due to errors occurring during item retrieval from the second state. However, architecture models so far have not specified whether attention or error-prone retrieval affects the memory quality or quantity of items stored in the different states. This is likely due to the fact that studies that have informed the architecture models lacked measures of memory quality. These studies have typically used discrete—mostly verbal—stimuli and recorded reaction times and accuracy rates as indices of attention removal and thus of the need for retrieval as well as its error proneness. Here we took advantage of the delayed-estimation technique, in which both the stimulus and the response space are continuous rather than discrete. 
We showed that items were remembered with a markedly different memory quality before versus after the first report (i.e., the distracting interference task), indicating that they were stored in qualitatively different states. This is in line with architecture models proposing that a removal of attention by an interference event transferred items from the first to the second state. Our novel contribution to these models is that the first state in working memory stores high-quality representations that are, however, highly susceptible to even a single interference event. In contrast, items transferred to the second state have a lower quality but are more robust even against multiple interference events. These events, however, continuously increased the probability of forgetting; i.e., every report increased the subjective and objective measures of guessing to the same extent. This observation is also in line with architecture models proposing that retrieving items from the second state is an error-prone process that may interfere with all concurrently held items in the second state; hence, the chance of forgetting accumulates continuously with every report. According to architecture models, one could also expect that every report will disrupt the binding between the item's content (i.e., orientation) and its cued feature (i.e., spatial position), making participants prone to retrieve the wrong item. However, we observed that the misreport parameter from mixture modeling was mostly unaffected by report position. 
Given the importance of interference for the distinction between high versus low memory quality storage, we sought to characterize in detail the interference processes that are responsible for the transfer of items from the first to the second state in working memory. In our previous study (Peters et al., 2018) we showed in two experiments that the steep drop in response precision from the first to the second report was due to executive interference. Specifically, we implemented executive interference by asking participants to orient a Gabor probe vertically (experiment 4 of Peters et al., 2018) or to change the color of a patch to blue (experiment 4 of Peters et al., 2018) during the retention period before the first report. We found that executive interference successfully mimicked the processes involved in the first report, as after executive interference the discontinuity in response precision between the first and subsequent reports vanished. In terms of the two-state interpretation, all items in working memory could then only be reported from the second state. This was in contrast to other interference events, like passive viewing of a moving Gabor probe (perceptual interference) or a prolonged retention interval (longer delay). Under both conditions, participants were still able to report items from the first state, and the discontinuity between the first and second item was still observed. Consequently, the steep drop from the first to the second report that we observed in memory quality but not in guessing rate in the present study can be attributed to executive processes during the first report. Architecture models suggest that the successful retention of items in the first state requires sustained attention. Thus, whenever an executive interference task interrupts sustained attention, e.g., by redirecting attentional resources to novel input, memory items lose their high resolution and fall into the second state of lower resolution. 
An interesting question for future research would be to specify the nature of the executive attention mechanisms that are responsible for transferring items from the first to the second state. 
Another interesting issue concerns the question of whether different states refer to different levels of activation of a single item, as proposed by the architecture models, or to multiple representations of an item held separately in different stores. Consequently, does the transfer from the first to the second state indicate a loss of activation resulting in a weaker memory quality? Or, alternatively, does it indicate that items are held concurrently in a high- and a low-quality store, but that after interference the access to the high-quality representation is eliminated? Recent studies on the neuronal correlates of working memory have shown that task-relevant contents are maintained in multiple brain regions including primary sensory and parietal and frontal cortices (for a review, see Christophel, Klink, Spitzer, Roelfsema, & Haynes, 2017). However, they also indicate that item representations are not simply duplicated across several brain regions but rather that representations differ in their level of abstraction, with sensory regions encoding low-level sensory features and frontal and parietal regions encoding a more abstract and categorical format. Support for this view came from Bae and Luck (2019), who showed that visual interference not only reduced memory quality but also increased the bias towards cardinal orientations, suggesting access to more categorical representations. Future research might test whether and why removal of attention impairs low-level sensory representations in sensory cortex but leaves abstract representations in frontal and parietal cortex unchanged. 
In keeping with the literature on multistate models of working memory (e.g., Cowan, 1988; Oberauer, 2002; Unsworth & Engle, 2007), and corresponding neural evidence (e.g., Bledowski, Rahm, & Rowe, 2009; Nee & Jonides, 2014; Christophel et al., 2017), we interpret our result of a discontinuous (rather than exponential/continuous) decline in memory quality across reports in terms of different representational states of items before and after the first distracting interference. However, it is also possible to explain such a steep drop using a single representational state in combination with an interference process that operates at different levels (or different interference processes) during the first report as compared to later reports. This seems quite similar to the conception of different states; here, however, the levels/states refer to the interference process instead of the representational status of items. An alternative explanation could be that the first report did not suffer from interference arising during report, but that the process of generating the output of the first report itself acted as interference for successive reports. This "output interference" interpretation requires fewer theoretical commitments, as it does not depend on the conception of two representational states or two levels of interference. However, it would vastly benefit from a more detailed specification of output interference, which, to the best of our knowledge, has not yet been characterized sufficiently. Notably, these interpretations are not mutually exclusive. As the first to-be-reported item is attentionally focused (and hence in an active state), this also implies enhanced protection from interference (Souza & Oberauer, 2016). Thus, the two-state and the interference explanations may actually be regarded as two sides of the same coin. 
Future research might help to clarify whether our interpretation in terms of multistate representational models needs to be supplemented or replaced by an interference-related process. 
In many cognitive tasks, subjects must memorize several task-relevant items. We showed that the memory quality of these items varies systematically with the order in which they are accessed and used. The sequential whole-report procedure thus provides a convenient way to measure the capacity of the two states in visual working memory. This approach also allows calculating an individual slope of the decline in memory quality across reports. Future research might assess whether such an individual slope parameter is a promising measure for exploring the underlying sources of working memory capacity limitations. So far, however, only a few studies have employed sequential reports from visual working memory (e.g., Adam, Mance, Fukuda, & Vogel, 2015; Fougnie, Suchow, & Alvarez, 2012; Woodman & Vecera, 2011). To the best of our knowledge, only one of them has systematically investigated memory precision as a function of report order (Adam et al., 2017). In line with our findings, that study found that precision dropped steeply from the first to the second report and declined only slightly thereafter. This pattern was obtained only when the item report order was randomized (Experiment 2 of Adam et al., 2017). In contrast, when participants could determine the report order (Experiment 1 of Adam et al., 2017), precision declined less steeply from the first to the second report and was remarkably comparable between set sizes. This suggests that the opportunity to choose the report order may encourage recall and encoding strategies that shape the trajectory of precision over and above reports from different states. 
The present study provides novel evidence that the states in working memory differ specifically in their levels of memory quality. Whereas the removal of attention by report/interference determines whether an item is held in the first state with high quality or in the second state with lower quality, further retrieval from the second state has little impact on memory quality but leads to forgetting. 
Acknowledgments
Commercial relationships: none. 
Corresponding author: Christoph Bledowski. 
Address: Institute of Medical Psychology, Goethe University, Frankfurt am Main, Germany. 
References
Adam, K. C. S., Vogel, E. K., & Awh, E. (2017). Clear evidence for item limits in visual working memory. Cognitive Psychology, 97, 79–97, https://doi.org/10.1016/j.cogpsych.2017.07.001.
Adam, K. C., Mance, I., Fukuda, K., & Vogel, E. K. (2015). The contribution of attentional lapses to individual differences in visual working memory capacity. Journal of Cognitive Neuroscience, 27 (8), 1601–1616, https://doi.org/10.1162/jocn_a_00811.
Bae, G.-Y., & Luck, S. J. (2019). What happens to an individual visual working memory representation when it is interrupted? British Journal of Psychology, 110 (2), 268–287, https://doi.org/10.1111/bjop.12339.
Bays, P. M., & Husain, M. (2008, August 8). Dynamic shifts of limited working memory resources in human vision. Science, 321 (5890), 851–854, https://doi.org/10.1126/science.1158023.
Bays, P. M., Catalao, R. F. G., & Husain, M. (2009). The precision of visual working memory is set by allocation of a shared resource. Journal of Vision, 9 (10): 7, 1–11, https://doi.org/10.1167/9.10.7.
Bledowski, C., Rahm, B., & Rowe, J. B. (2009). What “works” in working memory? Separate systems for selection and updating of critical information. Journal of Neuroscience, 29 (43), 13735–13741, https://doi.org/10.1523/jneurosci.2547-09.2009.
Christophel, T. B., Klink, P. C., Spitzer, B., Roelfsema, P. R., & Haynes, J.-D. (2017). The distributed nature of working memory. Trends in Cognitive Sciences, 21 (2), 1–14, https://doi.org/10.1016/j.tics.2016.12.007.
Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104 (2), 163–191.
Fougnie, D., Suchow, J. W., & Alvarez, G. A. (2012). Variability in the quality of visual working memory. Nature Communications, 3, 1229, https://doi.org/10.1038/ncomms2237.
Gronau, Q. F., Sarafoglou, A., Matzke, D., Ly, A., Boehm, U., Marsman, M., et al. (2017). A tutorial on bridge sampling. Journal of Mathematical Psychology, 81, 80–97, https://doi.org/10.1016/j.jmp.2017.09.005.
Lawrence, M. A. (2010). Estimating the probability and fidelity of memory. Behavior Research Methods, 42 (4), 957–968, https://doi.org/10.3758/BRM.42.4.957.
McElree, B. (2006). Accessing recent events. Psychology of Learning and Motivation, 46, 155–200, https://doi.org/10.1016/s0079-7421(06)46005-9.
Nee, D. E., & Jonides, J. (2014). Frontal-medial temporal interactions mediate transitions among representational states in short-term memory. Journal of Neuroscience, 34 (23), 7964–7975, https://doi.org/10.1523/JNEUROSCI.0130-14.2014.
Oberauer, K. (2002). Access to information in working memory: Exploring the focus of attention. Journal of Experimental Psychology. Learning, Memory, and Cognition, 28 (3), 411–421, https://doi.org/10.1037//0278-7393.28.3.411.
Peters, B., Rahm, B., Czoschke, S., Barnes, C., Kaiser, J., & Bledowski, C. (2018). Sequential whole report accesses different states in visual working memory. Journal of Experimental Psychology. Learning, Memory, and Cognition, 44 (4), 588, https://doi.org/10.1037/xlm0000466.
Rademaker, R. L., Tredway, C. H., & Tong, F. (2012). Introspective judgments predict the precision and likelihood of successful maintenance of visual working memory. Journal of Vision, 12 (13): 21, 1–13, https://doi.org/10.1167/12.13.21.
Souza, A. S., & Oberauer, K. (2016). In search of the focus of attention in working memory: 13 years of the retro-cue effect. Attention, Perception, & Psychophysics, 78, 1839–1860, http://dx.doi.org/10.3758/s13414-016-1108-5.
Schurgin, M. W., Wixted, J. T., & Brady, T. F. (2018). Psychological scaling reveals a single parameter framework for visual working memory. bioRxiv, 1–25, https://doi.org/10.1101/325472.
Sims, C. R., Jacobs, R. A., & Knill, D. C. (2012). An ideal observer analysis of visual working memory. Psychological Review, 119 (4), 807–830, https://doi.org/10.1037/a0029856.
Stan Development Team (2018). RStan: the R interface to Stan. R package version 2.17.3, http://mc-stan.org/.
Unsworth, N., & Engle, R. W. (2007). The nature of individual differences in working memory capacity: Active maintenance in primary memory and controlled search from secondary memory. Psychological Review, 114 (1), 104–132, https://doi.org/10.1037/0033-295X.114.1.104.
van den Berg, R., Awh, E., & Ma, W. J. (2014). Factorial comparison of working memory models. Psychological Review, 121 (1), 124–149, https://doi.org/10.1037/a0035234.
van den Berg, R., & Ma, W. J. (2017). Fechner's Law in metacognition: A quantitative model of visual working memory confidence. Psychological Review, 124 (2), 197–214, https://doi.org/10.1037/rev0000060.supp.
van den Berg, R., Shin, H., Chou, W. C., George, R., & Ma, W. J. (2012). Variability in encoding precision accounts for visual short-term memory limitations. Proceedings of the National Academy of Sciences, USA, 109 (22): 8780–8785, https://doi.org/10.1073/pnas.1117465109.
Wilken, P., & Ma, W. J. (2004). A detection theory account of change detection. Journal of Vision, 4 (12): 11, 1120–1135, https://doi.org/10.1167/4.12.11.
Woodman, G. F., & Vecera, S. P. (2011). The cost of accessing an object's feature stored in visual working memory. Visual Cognition, 19 (1), 1–12, https://doi.org/10.1080/13506285.2010.521140.
Zhang, W., & Luck, S. J. (2008, May 8). Discrete fixed-resolution representations in visual working memory. Nature, 453 (7192), 233–235, https://doi.org/10.1038/nature06860.
Footnotes
1  It is important to note that while the discontinuous decline across reports was compatible with the commonly used multistate models, it does not prove the existence of two representational states, nor does it exclude the alternative interpretations presented in the Discussion.
Figure 1
 
Paradigm and results of error distribution and response precision. (a) Participants memorized two or four orientations of Gabor gratings and after a delay were asked to rate their confidence about the memory quality and to report the memorized orientation for each of them in random order (only set size four displayed). (b) Distribution of response errors for each set size and report position and (c) the corresponding response precision. Error bars indicate standard error of the mean across participants. Asterisk denotes an advantage of a discontinuous over a continuous model for the decline trajectory across reports.
Figure 2
 
Confidence rating results. (a) Distribution of response errors associated with each rating collapsed across set sizes. Inset barplot denotes the corresponding response precision. (b) Mean rating for each report position and set size. Error bars indicate standard error of the mean across participants. Asterisk denotes an advantage of a discontinuous over a continuous model for the decline trajectory across reports.
Figure 3
 
Mixture modeling. (a) The mixture model estimates three parameters that reproduce the distribution of response errors: precision (κ: inverse variance of target-centered Gaussian), guessing, and misreports. (b) Maximum likelihood estimates of the three mixture model components. Error bars indicate standard error of the mean across participants. Asterisk denotes an advantage of a discontinuous over a continuous model for the decline trajectory across reports.
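The three-component mixture model sketched in the figure can be illustrated in a few lines of code. The following is a minimal sketch, not the fitting code actually used in the study: it assumes response errors have already been wrapped to radians on (−π, π], uses a single shared κ for the target and misreport components, and obtains maximum likelihood estimates by numerical optimization.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import vonmises

def mixture_nll(params, errors, nontarget_errors):
    """Negative log-likelihood of the three-component mixture:
    a von Mises centered on the target (precision kappa), uniform
    guessing, and von Mises components centered on the non-targets."""
    kappa, p_guess, p_mis = params
    p_target = 1.0 - p_guess - p_mis
    if p_target < 0:
        return 1e10  # mixture weights must sum to at most 1
    target_pdf = vonmises.pdf(errors, kappa)   # responses near the target
    guess_pdf = 1.0 / (2.0 * np.pi)            # uniform density on the circle
    # average density over the m non-target ("misreport") centers
    mis_pdf = vonmises.pdf(nontarget_errors, kappa).mean(axis=1)
    lik = p_target * target_pdf + p_guess * guess_pdf + p_mis * mis_pdf
    return -np.sum(np.log(lik + 1e-12))

def fit_mixture(errors, nontarget_errors):
    """Maximum-likelihood estimates of (kappa, p_guess, p_misreport).
    errors: (n,) response minus target; nontarget_errors: (n, m)
    response minus each of the m non-target values, all in radians."""
    res = minimize(mixture_nll, x0=[5.0, 0.1, 0.1],
                   args=(errors, nontarget_errors),
                   method="L-BFGS-B",
                   bounds=[(0.5, 200.0), (1e-4, 1.0), (1e-4, 1.0)])
    return res.x
```

Because the reported feature here was orientation, which lives on a 180° space, a common convention is to double all errors before wrapping so that they map onto the full circle.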
Figure 4
 
Response precision for the two extreme ratings and their proportions. (a) Response precision for every response accompanied by a confidence rating of 3 (indicating the highest memory quality, left) or 0 (indicating forgetting, right), shown separately for every report position and set size. (b) Proportion of confidence ratings of 3 (left) and 0 (right), separately for every report position and set size. Error bars indicate the standard error of the mean across participants. Asterisk denotes an advantage of a discontinuous over a continuous model for the decline trajectory across reports.
Table 1
 
Parameters of the winning model and information criteria for the exponential (AICexp) and the discontinuous model (AICdisc). Notes: AIC = Akaike information criterion for the discontinuous (disc) and exponential (exp) models. * Parameter values correspond to fitting the continuous/discontinuous model to 1 minus the original trajectory; x parameter value effectively zero to machine precision.
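The model comparison summarized in the table can be sketched as follows: fit a continuous exponential trajectory and a discontinuous trajectory (a free level at the first report, then a straight line from the second report onward) to per-report values, and compare AICs. The parameterizations below are illustrative assumptions, not the exact functional forms fitted in the study.

```python
import numpy as np
from scipy.optimize import curve_fit

def aic(rss, n, k):
    # Gaussian-likelihood AIC up to an additive constant: n*ln(RSS/n) + 2k
    return n * np.log(rss / n) + 2 * k

def fit_exponential(reports, y):
    # Continuous model: y = a * exp(-b * (report - 1)) + c
    f = lambda r, a, b, c: a * np.exp(-b * (r - 1)) + c
    popt, _ = curve_fit(f, reports, y, p0=[0.5, 1.0, 0.3], maxfev=10000)
    rss = np.sum((y - f(reports, *popt)) ** 2)
    return popt, aic(rss, len(y), 3)

def fit_discontinuous(reports, y):
    # Discontinuous model: free level at report 1, then a line from report 2 on
    y1 = y[reports == 1].mean()
    mask = reports >= 2
    slope, intercept = np.polyfit(reports[mask], y[mask], 1)
    pred = np.where(reports == 1, y1, slope * reports + intercept)
    rss = np.sum((y - pred) ** 2)
    return (y1, slope, intercept), aic(rss, len(y), 3)
```

Both models have three free parameters here, so the AIC difference reduces to the log-ratio of residual sums of squares; the model with the lower AIC is preferred.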