**Abstract**
In distinct experiments we examined memories for orientation and size. After viewing a randomly oriented Gabor patch (or a plain white disk of random size), observers were given unlimited time to reproduce as faithfully as possible the orientation (or size) of that *standard* stimulus with an adjustable Gabor patch (or disk). Then, with this *match* stimulus still in view, a recognition *probe* was presented. On half the trials, this probe was identical to the standard. We expected observers to classify the probe (a same/different task) on the basis of its difference from the match, which should have served as an explicit memory of the standard. Observers did better than that. Larger differences were classified as “same” when probe and standard were indeed identical. In some cases, recognition performance exceeded that of a simulated observer subject to the same matching errors, but forced to adopt the single most advantageous criterion difference between the probe and match. Recognition must have used information that was not or could not be exploited in the reproduction phase. One possible source for that information is observers' confidence in their reproduction (e.g., in their memory of the standard). Simulations confirm the enhancement of recognition performance when decision criteria are adjusted trial-by-trial, on the basis of the observer's estimated reproduction error.

All orientations (*s*, *m*, *m*_{0}, and *p*; see below) have units of degrees, clockwise with respect to horizontal. All disk sizes (also *s*, *m*, *m*_{0}, and *p*) are quantified as logarithms (base 10) of diameter.

The Gabors' Gaussian envelope had a standard deviation (*σ*) of 0.5 degrees of visual angle. Both grating and envelope had maximum contrast. The white disks had a luminance of 260 cd/m^{2}. Stimuli were presented on a grey, 60 cd/m^{2} background.

The *standard*, a randomly oriented Gabor (or a random-size white disk), was briefly (200 ms) presented on one side of fixation. The observer then attempted to reproduce its orientation (or size, *s*) by manipulating another stimulus (the *match*), subsequently presented on the opposite side of fixation. Just like the standard's, the match's initial orientation (or size, *m*_{0}) was randomly selected from a uniform distribution over all orientations (or diameters between 1.5° and 3.0°). Each press of the “c” key rotated the match 2° anticlockwise (or reduced its diameter by 2%) and each press of the “m” key rotated it 2° clockwise (or increased its diameter by 2%).^{1} Gabor phase was randomly reselected with each keypress. To indicate satisfaction with the match's orientation (or size, *m*), the observer pressed the space bar, initiating the trial's second, recognition phase. With the match still in view, a *probe* Gabor (or disk) was presented at the location of the standard. On 50% of trials the orientation (or size, *p*) of the probe was identical to *s*. In the remaining trials the orientation (or size) of *p* was changed with respect to *s* by a value ±Δ*s*.^{2} The value of Δ*s* was held constant within each block of trials. In the orientation experiment, Δ*s* took values of 3°, 5°, 7°, 14°, and 21°. In the size experiment, Δ*s* took values of 0.04, 0.06, and 0.08. Observers had to classify *p* as either “same” or “different” with respect to their memory of *s*. No feedback was given. Observers performed two blocks of 100 trials at each level of difficulty in a random order. Two additional 50-trial blocks (one with Δ*s* = 3°, the other with Δ*s* = 5°) were run by the last author in the orientation experiment and were included in all subsequent analyses.

Reproduction precision was indexed, for each observer, by the standard deviation of the matching errors (*s* – *m*). Across observers, these indices had a mean and standard deviation (*SD*) of 9.6° and 0.7°, respectively, for orientation,^{3} and 0.048 and 0.013, respectively, for size (see Tables SM1 and SM2 in Supplementary material). Comparable values for orientation have been obtained under similar conditions (e.g., Tomassini, Morgan, & Solomon, 2010), but more precise reproduction has also been reported with different stimuli and procedures (e.g., Vandenbussche, Vogels, & Orban, 1986). Comparable values for size have also been obtained under similar conditions (e.g., Solomon, Morgan, & Chubb, 2011; note that the value 0.048 corresponds to a Weber fraction of 12% for diameter).

*oblique effect* (e.g., Appelle, 1972). The variable error for size, on the other hand, neither increased nor decreased with standard size. This invariance with standard size has become known as Weber's Law (Fechner, 1860/1966).

*s*).

Our null hypothesis was that observers based their same/different decisions solely on the difference *p* – *m*. As an alternative to this (null) hypothesis, we considered the possibility that observers' same/different decisions also depended upon probe identity (i.e., the difference *p* – *s*). As will be discussed below, an interaction between *p* – *m* and *p* – *s* may have arisen from observers modulating their decision criteria according to their confidence in each match.

Recognition responses were segregated into *p* = *s* trials and *p* ≠ *s* trials. (Within each panel, *p* = *s* trials are represented by the middle row of symbols; *p* ≠ *s* trials are represented by the top and bottom rows.) For each observer, recognition responses were also segregated according to the level of difficulty (Δ*s*). They were then maximum-likelihood fit with psychometric functions based on the cumulative Normal distribution (Equation 1). Finally, a series of hypotheses regarding the *decision criterion* (*μ*) and the *SD* of its fluctuation (*σ*) were subjected to chi-square tests based on the generalized likelihood ratio (Mood, Graybill, & Boes, 1974). In the data from both experiments, we found significant (at the *α* = 0.05 level) changes in the criterion with observer, difficulty, and probe identity, but changes in the standard deviation of its fluctuation were significant only when observer and difficulty changed, not when probe identity changed. Therefore, we constrained *σ*_{p≠s} = *σ*_{p=s} for all of our fits. The best-fitting parameter values are available in Supplementary Material.

*μ* (i.e., the *p* – *m* values yielding equal proportions of “same” and “different” responses).
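The maximum-likelihood fitting step can be sketched in a few lines. Equation 1 itself is not reproduced in this excerpt, so the sketch below assumes a generic form in which Pr(“different”) is a cumulative Normal function of |*p* − *m*| with criterion *μ* and fluctuation *σ*; the function names and the grid-search fitter are illustrative, not the authors' code.

```python
import math, random

def phi_cdf(z):
    # Standard cumulative Normal, via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_different(pm, mu, sigma):
    # Assumed psychometric form (Equation 1 is not reproduced in this
    # excerpt): Pr("different") grows with |p - m| around criterion mu.
    return phi_cdf((abs(pm) - mu) / sigma)

def neg_log_likelihood(data, mu, sigma):
    # data: list of (p_minus_m, responded_different) pairs.
    nll = 0.0
    for pm, diff in data:
        pr = min(max(p_different(pm, mu, sigma), 1e-9), 1.0 - 1e-9)
        nll -= math.log(pr if diff else 1.0 - pr)
    return nll

def fit(data):
    # Crude maximum-likelihood fit by grid search over (mu, sigma).
    candidates = ((0.5 * i, 0.5 * j) for i in range(1, 31) for j in range(1, 17))
    return min(candidates, key=lambda ms: neg_log_likelihood(data, *ms))

# Simulate an observer with mu = 8 deg, sigma = 3 deg, then recover them.
random.seed(1)
true_mu, true_sigma = 8.0, 3.0
data = []
for _ in range(1500):
    pm = random.uniform(-25.0, 25.0)   # probe-minus-match difference
    data.append((pm, random.random() < p_different(pm, true_mu, true_sigma)))

mu_hat, sigma_hat = fit(data)
print(mu_hat, sigma_hat)
```

With enough simulated trials, the grid search recovers the generating parameters to within its step size.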

^{4}With one exception out of 20 cases in the orientation experiment (HLW, Δ*s* = 3°) and two exceptions out of 15 cases in the size experiment (JAS, Δ*s* = 0.06, and PS, Δ*s* = 0.04), all hexagons are convex. Their convexity indicates that observers were less inclined to (incorrectly) say “different” when *p* = *s* than when *p* ≠ *s*, *whatever the p – m difference*. Hence, they did not exclusively base their recognition judgment on the difference between *p* and *m* (in which case the hexagons would have had vertical sides). We can therefore reject our null hypothesis and conclude that recognition must have taken advantage of some additional information, which was not used for reproduction.

Each psychometrically matched model observer used the parameter values (*μ*_{p≠s} = *μ*_{p=s} = *μ* and *σ*_{p≠s} = *σ*_{p=s} = *σ* in Equation 1) that best fit the human observer's data. Each green symbol in Figure 4 shows the psychometrically matched observer's overall performance after 1000 trials with *each* of the human observer's 200 matching errors. In 18 of 20 cases (the exceptions were KM, Δ*s* = 5°, and HLW, Δ*s* = 3°) our human observers' performances exceeded those of psychometrically matched model observers (i.e., significantly more than 50% of cases; binomial test, *P* < 0.001). We must infer that the two-parameter psychometric functions did not capture all of the information used by our human observers in the recognition task. Not only were their decisions based on something besides |*p* – *m*|; that additional information also enhanced overall performance. In the size experiment, human recognition performance (Figure 5, red symbols) was better than that of psychometrically matched observers (green symbols) in 13 of 15 cases (also significantly more than 50%; same test; the exceptions were PS, Δ*s* = 0.04 and Δ*s* = 0.08). A description of the blue and yellow symbols appears below, under Regression-based models.
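The binomial tests quoted above are easy to verify exactly; the sketch below computes the one-tailed probability of at least 18 successes in 20 cases (and 13 in 15) under a fair-coin null.

```python
from math import comb

def binom_tail(k, n, p=0.5):
    # Exact one-tailed probability of k or more successes in n trials.
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k, n + 1))

p_orientation = binom_tail(18, 20)  # humans beat the matched model 18/20 times
p_size = binom_tail(13, 15)         # ... and 13/15 times in the size experiment
print(p_orientation, p_size)
```

The first value is 211/2^{20} ≈ 2.0 × 10^{−4} (hence *P* < 0.001); the second is 121/2^{15} ≈ 3.7 × 10^{−3}.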

Our observers evidently used more than |*p* − *m*| when deciding “same” or “different.” We will now demonstrate that the present recognition results are consistent with an observer whose criterion varies from trial to trial with the precision of each memory trace. Although we have no evidence that this criterion placement reflects a conscious strategy, we will use the term *confidence* to describe the underlying variable. When observers are confident that their still-visible match is good (i.e., close to the standard), they effectively label all but the most similar probes as “different.” When observers have low confidence in their match, they show a greater willingness to accept some of those same probes as “same.”

That is, observers tolerated larger values of |*p* – *m*| when their matching errors were large. This is in line with studies reporting that confidence (at the time of the test) and accuracy are based (at least partly) on the same source of information (such as memory strength; Busey, Tunnicliff, Loftus, & Loftus, 2000) and that subjects have (conscious or unconscious) access to the fidelity of their memory (the *trace access theory*; Hart, 1967; Burke, MacKay, Worthley, & Wade, 1991).

In this model, the variances of the matching errors (*s* – *m*) were randomly selected from an inverse Gamma distribution:^{5}

*s* – *m* ∼ *N*(0, *Y*), where *Y* ∼ Inv-Gamma(*a*, *b*), *a* > 1, *b* > 0.

We chose this distribution for *Y* as a matter of convenience. For one thing, all samples from it are positive, a requirement for variances. Furthermore, integrating over all possible values of *Y* yields a relatively familiar distribution for *s* – *m*: the non-standardized version of Student's *t*. When the inverse Gamma distribution is described by shape and scale parameters *a* and *b*, respectively, var(*s* – *m*) = *b*/(*a* – 1),^{6} which is guaranteed to be greater than zero. Although this formula contains two parameters, we really have only one free parameter, because var(*s* – *m*) is something we measure: given var(*s* – *m*), *b* = (*a* – 1) var(*s* – *m*). We can approximate ordinary Signal Detection Theory by adopting a large value for *a*. Fluctuations in memory noise will be largest when we adopt a small value for *a*.
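For *Y* ∼ Inv-Gamma(*a*, *b*), E[*Y*] = *b*/(*a* − 1) when *a* > 1, and var(*s* – *m*) = E[*Y*]. This is easy to check by Monte-Carlo sampling, using the fact that the reciprocal of a Gamma(*a*, scale = 1/*b*) variate is Inv-Gamma(*a*, *b*) distributed. The demo below uses *a* = 5 with *b* = 400 (not the paper's *a* = 2), so the target SD is 10 and the sample SD is stable; with *a* = 2 the fourth moment of *s* – *m* diverges and sample SDs converge very slowly.

```python
import math, random

def sample_matching_error(a, b, rng):
    # Y ~ Inv-Gamma(a, b): draw G ~ Gamma(shape=a, scale=1/b), take Y = 1/G.
    y = 1.0 / rng.gammavariate(a, 1.0 / b)
    # Given the variance Y, the matching error is zero-mean Gaussian.
    return rng.gauss(0.0, math.sqrt(y))

rng = random.Random(0)
a, b = 5.0, 400.0            # chosen so that b/(a - 1) = 100, i.e., SD = 10
errors = [sample_matching_error(a, b, rng) for _ in range(200_000)]
sd = math.sqrt(sum(e * e for e in errors) / len(errors))
print(sd)                    # should be close to sqrt(b/(a-1)) = 10
```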

*Y*. The one free parameter in our model is *a*. As noted above, it describes the shape of the variance distribution. For the simulations illustrated in Figures 2–5, we selected *a* = 2. For Figures 2 and 4, *b* = (10°)^{2}; for Figure 3, *b* = (0.04)^{2}.

Like our human observers, the model is less inclined to (incorrectly) respond “different” when *p* = *s*. Consequently, psychometric fits to its data form hexagons when plotted in the format of Figures 2 and 3. As can be seen from the red symbols in Figures 4 and 5 (simulation panels), the model's overall performance is similar to that of our human observers. It also exceeds that of a psychometrically matched observer (green symbols) by an amount similar to that seen in our human observers' data.
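The model's advantage over any fixed criterion can be reproduced directly. The sketch below uses the paper's parameters (*a* = 2, *b* = (10°)², and Δ*s* = 7°), but its closed-form accuracy expression and grid search are our own illustration: for each sampled variance *y*, an adaptive observer uses the criterion that is ideal for that *y*, whereas a fixed-criterion observer must use one criterion on every trial, so averaging the pointwise maxima necessarily beats the best single criterion.

```python
import math, random

def Phi(z):
    # Standard cumulative Normal distribution function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct(c, y, ds):
    # Pr(Correct) for the rule: respond "different" iff |p - m| > c,
    # given matching-error variance y. Half the trials have p = s
    # (p - m ~ N(0, y)); the rest have p = s +/- ds in equal measure.
    sd = math.sqrt(y)
    pr_same_inside = Phi(c / sd) - Phi(-c / sd)
    pr_diff_inside = 0.5 * ((Phi((c - ds) / sd) - Phi((-c - ds) / sd))
                            + (Phi((c + ds) / sd) - Phi((-c + ds) / sd)))
    return 0.5 * pr_same_inside + 0.5 * (1.0 - pr_diff_inside)

rng = random.Random(7)
a, b, ds = 2.0, 100.0, 7.0               # b = (10 deg)^2; delta-s = 7 deg
# Trial-by-trial variances: Y ~ Inv-Gamma(a, b), via Y = 1/Gamma(a, 1/b).
ys = [1.0 / rng.gammavariate(a, 1.0 / b) for _ in range(1000)]
grid = [0.5 * i for i in range(1, 101)]  # candidate criteria: 0.5-50 deg

# Adaptive observer: the best criterion for each trial's own variance.
adaptive = sum(max(p_correct(c, y, ds) for c in grid) for y in ys) / len(ys)
# Fixed-criterion observer: the single best criterion for all trials.
fixed = max(sum(p_correct(c, y, ds) for y in ys) / len(ys) for c in grid)
print(adaptive, fixed)
```

The adaptive observer's mean accuracy can never fall below the fixed-criterion observer's, because the pointwise maximum over criteria dominates any single criterion.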

*s* and *m*_{0}. These regression-based model observers adopted ideal criteria on the basis of each trial's combination of *s* and *m*_{0}.

Fixed-criterion model observers based their decisions solely on |*p* – *m*|, but were otherwise ideal; i.e., they optimized overall performance, given the sample variance in each condition's matching errors (i.e., Var[*s* – *m*]).

A higher correlation with the responses of the regression-based model than with those of the *ideal fixed-criterion* model would indicate at least some criterion adjustment on the basis of external factors. As can be seen in Table 1, such excess correlation was present in just three out of nine cases. In no case did the two correlations differ by more than 0.03. Thus we have little evidence that our observers exploited the oblique effect or other external influences on the precision of their memories when adjusting criteria for recognition. We therefore suggest that the bulk of their uncertainty-based criterion fluctuations (Shaw, 1980) were due to internal factors (e.g., attention and arousal).

| Experiment | Subject | Regression | Fixed criterion |
|---|---|---|---|
| Orientation | AG | 0.641 | 0.656 |
| | JAS | 0.654 | 0.672 |
| | KM | 0.705 | 0.699 |
| | HLW | 0.654 | 0.650 |
| | Simulation | | 0.743 |
| Size | AG | 0.535 | 0.526 |
| | KM | 0.655 | 0.665 |
| | JAS | 0.589 | 0.629 |
| | FL | 0.531 | 0.550 |
| | PS | 0.623 | 0.633 |
| | Simulation | | 0.753 |

Because the variable-precision model's matching errors do not depend on *s* or *m*_{0}, we felt a regression analysis of its data would be unnecessary. That is why there are no yellow symbols in the simulation panels of Figures 4 and 5, and that is why there are two empty cells in Table 1. On the other hand, we did feel it would be interesting to correlate the variable-precision model's recognition responses with those of the fixed-criterion, otherwise-ideal observer. Those correlations (0.743 and 0.753) were quite a bit higher than any derived from our human observers' data. This suggests that some criterion fluctuation is independent of uncertainty.

The criterion for a *different* response when probe and standard were actually the same was more stringent (i.e., it required a larger difference between the probe and the visible match) than when the probe and standard were not the same.

We suggest that these criterion settings are *not* the consequence of the probe presentation but are modulated prior to it, according to observers' confidence in their match. In its turn, the latter is related to the noisiness of the memory trace (or of the coding process), as reflected by the difference between match and standard. When the probe is identical to the standard (i.e., on *signal* trials), absolute differences between standard (= probe) and match are a direct reflection of this noise (memory strength or coding efficiency, and hence of the confidence) associated with that trial (Busey et al., 2000). Thus, large differences between probe and match will be associated with large criterion settings. When the probe differs from the standard, probe–match differences will be less correlated with the memory/coding noise and hence will not correlate with observers' criteria (see also footnote ). As a consequence, both data and modeling show that observers uniformly demonstrate a greater willingness to accept probes as identical to the standard when they really are, regardless of their similarity to the match.

^{7}

*are* different. This was the behavior of our observers.

*Journal of the Optical Society of America A*, 4(12), 2372–2378.
*Psychological Bulletin*, 78, 266–278.
*Journal of Verbal Learning & Verbal Behavior*, 6, 325–337.
*Psychonomic Bulletin & Review*, 7(1), 26–48.
*Neuron*, 69(4), 818–831.
*Journal of Neuroscience*, 30(45), 15241–15453.
*Neuron*, 59(3), 509–521.
*Proceedings of the National Academy of Sciences of the United States of America*, 108(49), 19767–19771.
*Frontiers in Neuroscience*, 6(6), doi: 10.3389/fnins.2012.00075.
*Trends in Cognitive Sciences*, 12(3), 114–122.
*Elements of psychophysics* (E. Adler, Trans.). In D. H. Howes & E. G. Boring (Eds.). New York: Holt, Rinehart and Winston.
*Statistical analysis of circular data*. Cambridge: Cambridge University Press.
*Signal detection theory and psychophysics*. New York: Wiley.
*Journal of Verbal Learning & Verbal Behavior*, 6, 685–691.
*Science*, 324, 759–764.
*Psychological Review*, 100, 609–639.
*Journal of Experimental Psychology: General*, 124, 311–333.
*Detection theory: A user's guide* (2nd ed.). Mahwah, NJ: Lawrence Erlbaum.
*Nature*, 459(7243), 89–92.
*Introduction to the theory of statistics* (3rd ed.). New York: McGraw-Hill.
*Vision: Coding and efficiency* (pp. 3–24). New York: Cambridge University Press.
*Attention and performance* (Vol. VIII) (pp. 277–296). Hillsdale, NJ: Erlbaum.
*Journal of Vision*, 11(12):13, 1–11, http://www.journalofvision.org/content/11/12/13, doi:10.1167/11.12.13.
*Journal of Mathematical Psychology*, 7, 259.
*Vision Research*, 50(5), 541–547.
*Psychological Review*, 91(1), 68–111.
*Proceedings of the National Academy of Sciences of the United States of America*, 109(22), 8780–8785.
*Investigative Ophthalmology & Visual Science*, 27(2), 237–245, http://www.iovs.org/content/27/2/237.
*Journal of Mathematical Psychology*, 5, 102–122.

^{2}In other words, *p* ∈ {*s* – Δ*s*, *s*, *s* + Δ*s*}. However, due to a programming error in the size experiment, for observers AG and KM (not the others), *p* ∈ {*s* + log(2 – 10^{Δ*s*}), *s*, *s* + Δ*s*}. As can be seen from Figure 3, *s* + log(2 – 10^{Δ*s*}) is very similar to *s* – Δ*s*.
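That similarity is easy to check numerically: to first order, 2 − 10^{Δ*s*} ≈ 10^{−Δ*s*}, so log_{10}(2 − 10^{Δ*s*}) ≈ −Δ*s* for small Δ*s*. The following sketch evaluates the erroneous offset for each Δ*s* used in the size experiment.

```python
from math import log10

# Offsets actually used for AG and KM vs. the intended offset -delta-s.
offsets = {ds: log10(2 - 10**ds) for ds in (0.04, 0.06, 0.08)}
for ds, erroneous in offsets.items():
    print(ds, erroneous)   # each value is close to -ds
```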

^{3}In this paper, all angles (such as *s* – *m*) are signed, acute, and analyzed arithmetically. For comparison, the average *SD* of our axial data (Fisher, 1993; pp. 31–37) was 2.6.

^{4}The frequency of trials in which sgn(*p* – *s*) = sgn(*p* – *m*) naturally decreases as Δ*s* increases. Nonetheless, using the aforementioned chi-square test, in almost every case we confirmed that there would be no significant increase in the maximum likelihood of cumulative Normal fits when one set of parameter values (*μ* and *σ*) was used for those trials and another set was allowed for the remaining *p* ≠ *s* trials. [The sole exception was JAS's size data with Δ*s* = 0.06. In this condition, JAS responded “same” on all 13 trials for which sgn(*p* – *s*) = sgn(*p* – *m*).] Consequently, it seems reasonable to use a single set of parameter values for all of the *p* ≠ *s* trials in each panel, and that is why each hexagon is regular.

^{5}It may be easier to first consider an observer with a more extreme case of nonstationarity. This observer either perfectly *remembers* (R) or entirely *forgets* (F) the standard *s*. In the former case, his response will be “same” on *p* = *s* trials and “different” otherwise. When this observer forgets, he will respond randomly, whether *p* = *s* or not. The probabilities of a “different” response when *p* = *s* and *p* ≠ *s* are then given by:

Pr(“different” | *p* = *s*) = Pr(F) × Pr(“different” | F);

Pr(“different” | *p* ≠ *s*) = Pr(R) + Pr(F) × Pr(“different” | F) > Pr(“different” | *p* = *s*).
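The inequality holds for any forgetting probability below 1 and any guessing bias; a brute-force sketch (the probabilities swept over are arbitrary illustrative values):

```python
def p_different_given_same(pF, g):
    # "different" on p = s trials only arises by guessing (probability g)
    # on the fraction pF of trials in which the standard was forgotten.
    return pF * g

def p_different_given_diff(pF, g):
    # Remembered trials (probability 1 - pF) always yield a correct
    # "different"; forgotten trials yield "different" by guessing.
    return (1.0 - pF) + pF * g

# Whenever anything is remembered (pF < 1), "different" is strictly more
# likely when p != s than when p = s, as the footnote claims.
for pF in (0.1, 0.3, 0.5, 0.9):
    for g in (0.1, 0.5, 0.9):
        assert p_different_given_diff(pF, g) > p_different_given_same(pF, g)
print("inequality holds on the grid")
```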

^{8}When modelling the performance of AG and KM in the size experiment, a slightly different decision rule was required, because |*p* − *s*| could assume one of three values (see footnote 2). In this case, the decision rule was: respond “same” if and only if *C*_{L} < *p* − *m* < *C*_{H}. Numerical methods were used to find the negative and positive criteria (*C*_{L} and *C*_{H}, respectively) that maximised Pr(Correct).
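A sketch of that numerical search, under our reading of footnotes 2 and 8: the probe offsets are δ₋ = log_{10}(2 − 10^{Δ*s*}) and δ₊ = Δ*s*, half the trials have zero offset and a quarter each of the two nonzero offsets, and the criteria are found by brute-force grid search. The matching-error SD used here (0.048) is the paper's group mean; the authors' actual optimiser is not specified.

```python
import math

def Phi(z):
    # Standard cumulative Normal distribution function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct(cL, cH, sd, ds):
    # Respond "same" iff cL < p - m < cH. With offset d = p - s,
    # p - m = d + (s - m) is Normal with mean d and SD sd.
    inside = lambda d: Phi((cH - d) / sd) - Phi((cL - d) / sd)
    d_minus = math.log10(2 - 10**ds)  # the erroneous negative offset
    d_plus = ds                       # the intended positive offset
    return (0.5 * inside(0.0)                 # p = s trials
            + 0.25 * (1.0 - inside(d_minus))  # negative-offset trials
            + 0.25 * (1.0 - inside(d_plus)))  # positive-offset trials

sd, ds = 0.048, 0.06   # assumed matching-error SD (group mean) and delta-s
grid = [0.005 * i for i in range(-40, 41)]
best = max(((p_correct(cL, cH, sd, ds), cL, cH)
            for cL in grid for cH in grid if cL < cH),
           key=lambda t: t[0])
print(best)
```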

Matching errors (*s* – *m*) were modeled, for any given combination of standard size *s* and starting size *m*_{0}, as

*s* – *m* = *c*_{0} + *c*_{1}*s* + *c*_{2}*m*_{0} + *c*_{3}*s* *m*_{0} + *ε*,  (Equation A1)

where *c*_{0}, *c*_{1}, *c*_{2}, and *c*_{3} are arbitrary constants. The residual matching error (for which the other terms do not account) is contained in the term *ε*. Note this model contains the possibility of an interaction between the factors *s* and *m*_{0}. For each observer, Equation A1 was simultaneously fit to all matches. Analyses of variance (ANOVA) indicate significant effects (*P* < 0.05) of starting size in the data from all observers, significant effects of standard size in the data from three observers (KM, JAS, and FL), and significant interactions in the data from no observers.

The analogous model of the variable error was fit to the starting error |*s* – *m*_{0}|, not to *m*_{0}. Consistent with Weber's Law, ANOVA failed to turn up a significant difference between *v*_{1} and zero for any observer. The same was true for the coefficient of interaction, *v*_{3}. On the other hand, four of our five observers (JAS was the exception) had significant effects of starting error.

*SD*s) about the constant error.

The parameter *c*_{1} determines the overall error size and the parameter *c*_{2} determines the (near-intercardinal) orientations at which the tendency for clockwise errors equals that for anticlockwise errors. To model the variable error, the residuals in Equation A3 were fit with a second model (Equation A4). Note that there really is no firm theory behind either of these equations. They are provided merely to produce curves that illustrate the effects of standard orientation and starting error. For example, in Equation A4, the right-hand side is the sum of two full-wave rectified sine functions, which has been elevated so that its minimum is greater than zero. The fancy bit with the arcsine allows each effect to reach its maximum at an arbitrary orientation without moving the local minima away from −90°, 0°, and 90°. Large values of *v*_{1} correspond to large oblique effects. For observers AG, JAS, KM, and HLW, the best-fitting values for this parameter were 4°, 6°, 4°, and 1°, respectively. That is, some observers (especially JAS) exhibited stronger oblique effects than others (especially HLW).

We assume that each matching error *s* – *m* is drawn from a zero-mean Gaussian distribution having variance *Y*, i.e., *s* – *m* ∼ *N*(0, *Y*). Furthermore, we assume that observers respond “different” if and only if |*p* − *m*| > *C* > 0, where *C* is known as the criterion.^{8} In this Appendix we describe how to calculate the best possible criterion *c*_{y} for any value of variance *y* (i.e., regardless of whether or not that variance itself is a random variable, as in the variable-precision model).

On the trials in which *p* = *s*, the probability density of *p* – *m* is

*f*_{p=s}(*p* – *m*) = (1/√*y*) *ϕ*[(*p* – *m*)/√*y*],

where *ϕ* is the Normal probability density function. On the other half of the trials, in which *p* = *s* ± Δ*s*, the density is

*f*_{p≠s}(*p* – *m*) = (1/2√*y*) {*ϕ*[(*p* – *m* − Δ*s*)/√*y*] + *ϕ*[(*p* – *m* + Δ*s*)/√*y*]}.

An observer who responds “different” whenever |*p* – *m*| exceeds some criterion, *c*, will be correct with probability $\frac{1}{2}\int_{-c}^{c} f_{p=s}(x)\,dx + \frac{1}{2}\left[1-\int_{-c}^{c} f_{p\neq s}(x)\,dx\right]$.
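Because *f*_{p=s} and *f*_{p≠s} are a Gaussian and a Gaussian mixture, both integrals reduce to closed forms in the cumulative Normal Φ, so the best criterion *c*_{y} can be found by a one-dimensional search. A sketch (a grid search stands in for whatever optimiser the authors used):

```python
import math

def Phi(z):
    # Standard cumulative Normal distribution function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct(c, y, ds):
    # 0.5 * Pr(|p - m| <= c | p = s)  -> correct "same" responses, plus
    # 0.5 * Pr(|p - m| >  c | p != s) -> correct "different" responses.
    sd = math.sqrt(y)
    int_same = Phi(c / sd) - Phi(-c / sd)
    int_diff = 0.5 * ((Phi((c - ds) / sd) - Phi((-c - ds) / sd))
                      + (Phi((c + ds) / sd) - Phi((-c + ds) / sd)))
    return 0.5 * int_same + 0.5 * (1.0 - int_diff)

def best_criterion(y, ds, c_max=50.0, step=0.05):
    # One-dimensional grid search for the criterion maximising Pr(Correct).
    grid = [step * i for i in range(1, int(c_max / step) + 1)]
    return max(grid, key=lambda c: p_correct(c, y, ds))

c_star = best_criterion(100.0, 14.0)  # variance (10 deg)^2, delta-s = 14 deg
print(c_star, p_correct(c_star, 100.0, 14.0))
```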

Recognition responses were segregated according to whether |*s* – *m*| was smaller or larger than the median matching error (solid and dashed curves, respectively). With two exceptions (AG and KM, with the two smallest Δ*s* values) out of the 20 cases (5 Δ*s* levels × 4 observers) for orientation, and two exceptions (JAS, medium Δ*s*; PS, small Δ*s*) out of the 15 cases for size, the dashed curves are shifted to the right of the solid ones, indicating that observers adopted higher criteria for larger matching errors.
