Each perceptual decision is commonly accompanied by a judgment of confidence about that decision. Confidence is classically defined as the estimated posterior probability that the decision is correct, given the evidence. Here we argue that correctness is neither a valid normative statement of what observers should be doing after their perceptual decision nor a proper descriptive statement of what they actually do. Instead, we propose that perceivers aim at being consistent with themselves. We present behavioral evidence, obtained in two separate psychophysical experiments, that human observers achieve that aim. In one experiment, adaptation led to after-effects; in the other, prior stimulus occurrences were manipulated. We show that confidence judgments perfectly follow changes in perceptual reports and response times, regardless of the nature of the bias. Although observers are able to judge the validity of their percepts, they are oblivious to how biased these percepts are. Focusing on self-consistency rather than correctness leads us to interpret confidence as an estimate of the reliability of one's perceptual decision rather than of its distance to an unattainable truth.

) are likely to be incorrect (Figure 1C). In contrast, these trials will still be mostly self-consistent with past similar trials (Figure 1D). This difference between correctness and self-consistency has an impact on the estimated confidence sensitivity. Confidence sensitivity can be measured as the area under the Type II receiver operating characteristic (ROC) curve, which plots the Type II hit rate against the Type II false alarm rate. The Type II hit rate is usually defined as the conditional probability of reporting a high-confidence judgment given that the perceptual decision was correct, and the Type II false alarm rate is the conditional probability of reporting a high-confidence judgment given that the perceptual decision was incorrect (Figure 1E). Replacing correctness with self-consistency changes the Type II ROC curve (Figure 1F) and therefore the estimated confidence sensitivity. Although correctness is well defined and controlled by the experimenter, self-consistency can only be approximated. The experimenter does not have access to the internal sensory criterion used by the observer, and this criterion may also be subject to noise and other factors such as asymmetrical rewards. However, the experimenter does have access to the Type I results, which give, for each stimulus strength, the fraction of perceptual responses in each perceptual category (here, left and right). The self-consistent decision for any stimulus strength can then be assumed to be the most frequent one (see Supplementary Material S1). Therefore, from the experimenter's perspective, one can place as many points on the Type II ROC curve as there are confidence criteria (one point here for a high- vs. low-confidence judgment, or three points for confidence judged on a 4-point rating scale). In our simulations, using correctness instead of self-consistency would lead the experimenter to report a reduced confidence sensitivity.
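The contrast between the two scoring rules can be illustrated with a small signal-detection simulation (all parameter values here are assumptions for illustration, not the paper's simulation): a strongly biased observer whose confidence tracks the distance of the evidence to its internal criterion shows a higher Type II area under the curve when trials are scored for self-consistency than when they are scored for correctness.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated observer with a large internal criterion shift (bias), so that for
# one stimulus category the majority response disagrees with the true category.
n = 100_000
stim = rng.choice([-1.0, 1.0], size=n)         # true category
evidence = stim + rng.normal(0.0, 1.0, size=n)  # noisy internal evidence
bias = 1.5                                      # internal sensory criterion
choice = np.where(evidence > bias, 1.0, -1.0)
conf = np.abs(evidence - bias)                  # confidence = distance to criterion

correct = choice == stim

# Experimenter's approximation of self-consistency: for each stimulus value,
# the self-consistent decision is the most frequent one.
consistent = np.zeros(n, dtype=bool)
for s in (-1.0, 1.0):
    m = stim == s
    majority = 1.0 if (choice[m] == 1.0).mean() > 0.5 else -1.0
    consistent[m] = choice[m] == majority

def type2_auc(conf, outcome):
    """Area under the Type II ROC via the rank-sum (Mann-Whitney) identity."""
    pos, neg = conf[outcome], conf[~outcome]
    ranks = np.empty(pos.size + neg.size)
    ranks[np.argsort(np.concatenate([pos, neg]))] = np.arange(1, ranks.size + 1)
    return (ranks[:pos.size].sum() - pos.size * (pos.size + 1) / 2) / (pos.size * neg.size)

auc_correct = type2_auc(conf, correct)
auc_consistent = type2_auc(conf, consistent)
```

With this biased observer, `auc_consistent` exceeds `auc_correct`, in line with the point above that scoring against correctness understates confidence sensitivity.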

Sixteen observers participated (mean ± *SD* age: 26.9 ± 4.2 years), including one author, with normal or corrected-to-normal vision. All were experienced observers, and all but the author were naive to the purpose of the experiment. Half of the observers ran the “after-effect” experiment first, and half started with the “prior” experiment. Stimuli were displayed on a Sony Trinitron CRT GDM-F520 (Sony Corporation, Tokyo, Japan) at 57.3 cm, with the three color channels linearized independently with a Minolta CS-100A Chroma Meter (Konica Minolta, Tokyo, Japan). The monitor midpoint level on all three guns (*xyY* = 0.278, 0.307, 60.3) was used as a neutral gray reference after 5 minutes of adaptation. Colors were generated from a relative Derrington–Krauskopf–Lennie (DKL) color space (Derrington, Krauskopf, & Lennie, 1984) and then converted to red, green, and blue (RGB) values using standard procedures (Zaidi & Halevy, 1993). Colors were modulated along the L–M axis (“red–green” axis) because, unlike along the other two color axes of the DKL color space, response times are symmetrical about the neutral point (Wool, Komban, Kremkow, Jansen, Li, Alonso, & Zaidi, 2015). Color units are fractions of the maximum saturation that could be displayed on the monitor along the L–M axis (*xyY* = 0.320, 0.287, 59.8 and *xyY* = 0.207, 0.325, 55.5) while remaining approximately on the same equiluminant plane as the neutral gray, with the sign indicating the direction of modulation (such that greens are negative). In other words, a color value of −0.1 means that the monitor displayed a green that had the same luminance as the background and 10% of the maximum saturation the monitor could display; similarly for a value of +0.1, but with a red hue.

The four combinations (O₁C₁, O₂C₁, O₁C₂, O₂C₂) were counterbalanced across observers using a Latin square design in which the first- and second-order probabilities were counterbalanced. For each combination, the observers ran two successive blocks (eight blocks in total) of 98 trials, with a different task order (orientation/color) in each block, counterbalanced across observers.
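A counterbalancing scheme of this kind can be sketched programmatically. The condition labels, task names, and the cyclic construction below are assumptions for illustration; the source specifies only a Latin square over four combinations and two blocks per combination with alternating task order.

```python
# Sketch of the block counterbalancing (labels are assumptions): a cyclic
# Latin square over the four combinations, and two successive blocks per
# combination whose orientation/color task order alternates.
conditions = ["O1C1", "O2C1", "O1C2", "O2C2"]

def latin_square(items):
    """Cyclic Latin square: each item appears once per row and per column."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]

def observer_blocks(condition_order, first_task="orientation"):
    """Eight (condition, task) blocks for one observer, 98 trials each."""
    other = {"orientation": "color", "color": "orientation"}
    blocks = []
    for i, cond in enumerate(condition_order):
        first = first_task if i % 2 == 0 else other[first_task]
        blocks += [(cond, first), (cond, other[first])]
    return blocks

square = latin_square(conditions)   # row i = condition order for observer group i
blocks = observer_blocks(square[0])
```

Each row of `square` gives one observer group's condition order, and `observer_blocks` expands a row into the eight blocks run by one observer.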

Psychometric functions were fitted with a general bias (\(\mu_p\); the value of the point of subjective equality [PSE] for the neutral adapter), the amplitude (\(\alpha_p\)) of the after-effect (how much the PSEs are shifted, symmetrically, by the adapter), and the standard deviation (\(\sigma_p\)) of sensory noise (reflecting the inverse of observer sensitivity):

\[ \hat P = F\!\left(\frac{S - \mu_p - \alpha_p A}{\sigma_p}\right) \]

where \(F\) is the cumulative normal function, \(S\) is the stimulus value, and \(A\) is the adapter value. To compare adaptation strengths and biases in the PSEs across observers, stimuli, and experiments, the amplitude of the after-effect was normalized by the observer's sensitivity for the perceptual decision, \(\alpha'_p = \alpha_p / \sigma_p\), and is thus expressed in units of sigma (\(\sigma_p\)).
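A minimal sketch of such a fit, assuming the cumulative-normal form with bias and adapter-dependent shift described above; the simulated data, ground-truth values, and optimizer choice are all assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)

# Ground-truth parameters for a simulated observer (values are assumptions).
mu_p, alpha_p, sigma_p = 0.02, 0.05, 0.10
S = np.tile(np.linspace(-0.3, 0.3, 13), 3)   # stimulus values
A = np.repeat([-1.0, 0.0, 1.0], 13)          # adapter condition per point
n_rep = 200                                  # trials per stimulus/adapter cell
p_true = norm.cdf((S - mu_p - alpha_p * A) / sigma_p)
k = rng.binomial(n_rep, p_true)              # simulated "rightward" counts

def nll(theta):
    """Negative log-likelihood of the binomial responses under the model."""
    mu, alpha, sigma = theta
    p = np.clip(norm.cdf((S - mu - alpha * A) / abs(sigma)), 1e-9, 1 - 1e-9)
    return -(k * np.log(p) + (n_rep - k) * np.log(1 - p)).sum()

fit = minimize(nll, x0=[0.0, 0.0, 0.2], method="Nelder-Mead")
mu_hat, alpha_hat, sigma_hat = fit.x
alpha_norm = alpha_hat / abs(sigma_hat)      # normalized after-effect, in units of sigma
```

Maximum likelihood on the binomial counts (rather than least squares on proportions) weights each point by how informative it is, which matters near the tails of the psychometric function.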

Response times were fitted with a baseline (\(R_0\); the baseline response time), an amplitude (\(R_a\); the difference between the peak response time and the baseline), and, similarly to the perceptual decisions analysis, a general bias (\(\mu_r\)), an adaptation strength (\(\alpha_r\)), and a standard deviation (\(\sigma_r\)) corresponding to the width of the function (how quickly response times decrease):

\[ \hat R = R_0 + R_a \, \frac{f\!\left(\frac{S - \mu_r - \alpha_r A}{\sigma_r}\right)}{f(0)} \]

where \(f\) is the normal probability density function, \(S\) is the stimulus value, and \(A\) is the adapter value. The denominator is simply a constant that restricts the range of the function to between \(R_0\) and \(R_0 + R_a\). The choice of this specific function to fit response times is arbitrary, but it captured the data well. To compare biases and after-effect amplitudes with those of the perceptual reports, we also normalized these parameters by the observers' sensitivity in the perceptual decision: \(\alpha'_r = \alpha_r / \sigma_p\).
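The shape of this response-time function, assuming the normalized Gaussian-bump form implied by the text (parameter values below are assumptions), can be sketched directly; dividing the density by \(f(0)\) is what pins the peak to exactly \(R_0 + R_a\).

```python
import numpy as np

def gauss_pdf(z):
    """Standard normal probability density."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def rt_model(S, A, R0, Ra, mu_r, alpha_r, sigma_r):
    """Baseline R0 plus a Gaussian bump of height Ra centred on the shifted
    PSE; dividing by f(0) makes the bump peak exactly at R0 + Ra."""
    z = (S - mu_r - alpha_r * A) / sigma_r
    return R0 + Ra * gauss_pdf(z) / gauss_pdf(0.0)

# Response times peak near the PSE (maximal uncertainty) and decay to the
# baseline far from it; sigma_r sets how quickly they decrease.
S = np.linspace(-0.3, 0.3, 601)
rt = rt_model(S, A=0.0, R0=0.45, Ra=0.25, mu_r=0.0, alpha_r=0.05, sigma_r=0.10)
```

The confidence function in the next section uses the same normalized-bump construction, inverted so that confidence is lowest at the PSE.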

Confidence reports were fitted with a maximum confidence (\(C_{max}\)) and a minimum confidence (\(C_{min}\)) to account for possible individual preferences for one modality over the other and, similarly to the perceptual decisions and response times analyses, a general bias (\(\mu_c\)), an adaptation strength (\(\alpha_c\)), and a standard deviation (\(\sigma_c\)) corresponding to the width of the function:

\[ \hat C = C_{max} - \left(C_{max} - C_{min}\right) \frac{f\!\left(\frac{S - \mu_c - \alpha_c A}{\sigma_c}\right)}{f(0)} \]

where \(f\) is the normal probability density function, \(S\) is the stimulus value, and \(A\) is the adapter value. This specific function is also arbitrary but captured the data well. Here again, we normalized the bias and adaptation parameters by the observer's sensitivity in the perceptual decision: \(\alpha'_c = \alpha_c / \sigma_p\).

We then compared the normalized biases and adaptation terms across the three metrics: perceptual decisions (\(P\)), response times (\(R\)), and confidence (\(C\)). For this analysis, we first collected the normalized biases and adaptation terms across the 16 observers, with \(A_x\) denoting the vector of normalized adaptation terms for one metric (and likewise for the biases). We then computed correlations between the normalized biases and adaptation terms across the 16 observers, where \(r(x, y)\) is the correlation between variables \(x\) and \(y\). We also performed a principal component analysis and extracted the principal components and their associated variance, where \(w_{(1)}\) is the first principal component and \(\Sigma\) is the variance explained by this component.
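This across-observer analysis can be sketched as follows; the data are hypothetical stand-ins for the 16 observers' normalized terms (a shared latent factor is assumed purely so that the three metrics correlate, as an illustration of the pipeline, not of the reported results).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stand-in for the 16 observers' normalized adaptation terms on
# the three metrics (P, R, C); a shared latent factor makes them correlate.
latent = rng.normal(size=16)
X = np.column_stack([latent + 0.3 * rng.normal(size=16) for _ in range(3)])

corr = np.corrcoef(X, rowvar=False)   # 3x3 correlation matrix across metrics

# PCA via SVD of the centred data: rows of Vt are the principal components,
# and the squared singular values give the variance explained by each.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
w1 = Vt[0]                            # first principal component
explained = s**2 / (s**2).sum()       # fraction of variance per component
```

A single dominant component (large `explained[0]`) would indicate that biases or adaptation strengths are shared across the three metrics.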

To compare models, we computed the overall log-likelihood (\(L\)) of each model. To do so, we computed the likelihood of each perceptual response, response time, and confidence report given the predicted \(\hat P\), \(\hat R\), and \(\hat C\) of each model as described above. The overall log-likelihood is the sum of the log-likelihoods of each metric on each trial. Because the models were nested, adding new parameters should only improve the fits (in practice, fitting errors occasionally caused a model with fewer parameters to fit better, although very rarely). The purpose of this analysis is to estimate whether the increase in model complexity is justified by the improvement in goodness of fit (estimated through the likelihood). We used the Akaike information criterion (AIC; Akaike, 1974), in which the likelihood of a model is penalized by its number of parameters:

\[ \mathrm{AIC} = 2k - 2L \]

where \(k\) is the number of parameters and \(L\) is the overall log-likelihood of the model. The model with the lowest AIC score is considered to explain the data best. The simplest model has seven parameters: the slope of the psychometric function, the width of the response times function, the response time baseline and peak, the width of the confidence function, and the confidence maximum and minimum values (\(\mu_p = \mu_r = \mu_c = 0\) and \(\alpha_p = \alpha_r = \alpha_c = 0\)). In this model, the observers are assumed to have no bias, and there is no effect of the adapter. The full model has 13 parameters and is essentially identical to the fitted functions described above: the biases and adaptation strengths are assumed to be independent for the three sets of curves (\(\mu_p \neq \mu_r \neq \mu_c\) and \(\alpha_p \neq \alpha_r \neq \alpha_c\)).

_{c}*d*

_{O}_{,}

*is the objective sensory distance in the color task, taking into account the stimulus value (*

_{c}*S*) normalized by the observer's sensitivity (σ

_{c}*). Likewise,*

_{c}*d*

_{S}_{,}

*is the subjective sensory distance, but this time taking into account the observer's initial bias (µ*

_{c}*) and adaptation strength (*

_{c}*a*). We then computed the difference in sensory distances for each trial pair:

_{c}*d*

_{O}_{,}

*and*

_{o}*d*

_{O}_{,}

*are the objective sensory distance in the orientation and color tasks, respectively; similarly for*

_{c}*d*

_{S}_{,}

*and*

_{o}*d*

_{S}_{,}

*for the subjective sensory distance. Finally, we fitted a multivariate probit model to the observer's confidence choices using maximum likelihood estimation:*

_{c}*p*(

*o*) is the probability to report higher confidence in the response to the orientation discrimination task (rather than the color task), and Φ is the probit linking function.
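A probit regression of this kind can be fitted by maximizing the Bernoulli likelihood directly. The sketch below uses simulated trial pairs in which, as in the reported results, only the subjective-distance difference drives confidence choices; all names, values, and the optimizer choice are assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)

# Simulated trial pairs (values are assumptions): dDO and dDS are the
# differences in objective and subjective sensory distance, and the simulated
# confidence choices depend only on the subjective term.
n = 4000
dDO = rng.normal(size=n)
dDS = 0.5 * dDO + rng.normal(size=n)       # partially correlated regressors
X = np.column_stack([np.ones(n), dDO, dDS])
true_b = np.array([0.0, 0.0, 1.2])         # intercept, objective, subjective
y = rng.random(n) < norm.cdf(X @ true_b)   # True = more confident in orientation

def nll(b):
    """Negative log-likelihood of the probit model p(o) = Phi(X @ b)."""
    q = np.clip(norm.cdf(X @ b), 1e-9, 1 - 1e-9)
    return -(y * np.log(q) + (~y) * np.log(1 - q)).sum()

b_hat = minimize(nll, x0=np.zeros(3), method="BFGS").x
```

The fitted weight on the subjective term recovers its generating value while the objective term stays near zero, mirroring the pattern of regression weights reported in the results.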

*t*-test (see Supplementary Material S3, Table S1) but were not different across metrics in a paired two-tailed *t*-test (see Supplementary Material S3, Tables S2 and S3).

*t*-test (see Supplementary Material S4, Table S6), and adaptation strength was not significantly different across metrics in a paired two-tailed *t*-test (see Supplementary Material S4, Tables S7 and S8).

*A* between confident and non-confident trials for both the after-effect and prior experiments (Figures 5D to 5F). Here, amplitude *A* describes how much the median response time increased relative to the baseline at the PSE. This log-ratio was smaller than 0, indicating faster responses in the confident trials. The effect was significant (see Supplementary Material S6, Table S11) for both tasks in the after-effect experiment and for the color task in the prior experiment, but not for the orientation task (although it did become significant when one outlier was removed).

*t*-test in both experiments, *t*(15) = 1.29 and 1.62, *p* = 0.22 and 0.13, indicating that observers did not tend to report higher confidence in either task. In both experiments, regression weights on the difference in subjective sensory distance were significantly greater than 0 in a one-tailed *t*-test, *t*(15) = 11.59 and 7.51, *p* < 0.001, and higher than the regression weights on the difference in objective sensory distance, *t*(15) = 3.75 and 4.97, *p* = 0.002 and *p* < 0.001. Importantly, however, the regression weights on the difference in objective sensory distance were not significantly different from 0, *t*(15) = 1.48 and 0.27, *p* = 0.21 and 0.79. In other words, the observers' confidence judgments were based entirely on the subjective sensory distance of the stimuli (the distance to the observer's own PSE, in units of the observer's sensitivity) and not on the objective sensory distance (the distance to the physical categorical boundary).

Akaike, H. (1974). A new look at the statistical model identification. *IEEE Transactions on Automatic Control*, 19(6), 716–723.
*PLoS One*, 6(5), e19551.
*PLoS Computational Biology*, 5(9), e1000504.
*Vision Research*, 11, 833–840.
*Journal of Vision*, 16(12), 537, https://doi.org/10.1167/16.12.537.
*The Journal of the Acoustical Society of America*, 31(5), 629–630.
*Psychological Science*, 25(6), 1286–1288.
*Nature Neuroscience*, 16(1), 105–110.
Derrington, A. M., Krauskopf, J., & Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. *Journal of Physiology*, 357, 241–265.
*PLoS One*, 9(5), e96511.
Efron, B. (1979). Bootstrap methods: Another look at the jackknife. *The Annals of Statistics*, 7(1), 1–26.
*Frontiers in Human Neuroscience*, 8, 443.
*Scientific Reports*, 9(1), 7124.
*Journal of Experimental Psychology*, 16(1), 1–31.
*Annual Review of Vision Science*, 3, 227–250.
*Neural Computation*, 28(9), 1840–1858.
*Journal of Philosophy, Psychology and Scientific Methods*, 7(17), 461–469.
*Journal of Experimental Psychology: General*, 129(2), 220–241.
*Neuron*, 84(6), 1329–1342.
Knill, D. C., & Richards, W. (Eds.). (1996). *Perception as Bayesian inference*. Cambridge, UK: Cambridge University Press.
*Psychological Review*, 119(1), 80–113.
*Attention, Perception, & Psychophysics*, 82(6), 3158–3175.
*Annual Review of Vision Science*, 2, 459–481.
*Psychological Review*, https://doi.org/10.1037/rev0000312.
*Probabilistic models of the brain: Perception and neural function* (pp. 13–36). Cambridge, MA: MIT Press.
*Neuron*, 88(1), 78–92.
*Journal of Vision*, 14(11):5, 1–15, https://doi.org/10.1167/14.11.5.
Peirce, C. S., & Jastrow, J. (1885). On small differences in sensation. *Memoirs of the National Academy of Sciences*, 3, 73–83.
*Nature Neuroscience*, 19(3), 366–374.
*Neuron*, 90(3), 499–506.
*Vision Research*, 14(1), 151–152.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. *Journal of Experimental Psychology*, 18, 643–662.
*The principles of psychophysiology: A survey of modern scientific psychology. Vol. 2, Sensation*. New York: Van Nostrand.
Vickers, D. (1979). *Decision processes in visual perception*. Cambridge, MA: Academic Press.
*Sources of color science* (pp. 109–126). Cambridge, MA: MIT Press.
*Nature Neuroscience*, 5(6), 598–604.
Wool, L. E., Komban, S. J., Kremkow, J., Jansen, M., Li, X., Alonso, J.-M., & Zaidi, Q. (2015). Salience of unique hues and implications for color theory. *Journal of Vision*, 15(2):10, 1–11, https://doi.org/10.1167/15.2.10.
Zaidi, Q., & Halevy, D. (1993). Visual mechanisms that signal the direction of color changes. *Vision Research*, 33(8), 1037–1051.