Even though the nature of confidence computations has been the topic of intense interest, little attention has been paid to what confidence response times (cRTs) reveal about the underlying confidence computations. Several previous studies found cRTs to be negatively correlated with confidence in the group as a whole and consequently hypothesized the existence of an intrinsic relationship of cRT with confidence for all subjects. This hypothesis was further used to support postdecisional models of confidence that predict that cRT and confidence should always be negatively correlated. Here we test the alternative hypothesis that cRT is driven by the frequency of confidence responses such that the most frequent confidence ratings are inherently made faster regardless of whether they are high or low. We examined cRTs in three large data sets from the Confidence Database and found that the lowest cRTs occurred for the most frequent confidence rating. In other words, subjects who gave high confidence ratings most frequently had negative confidence–cRT relationships, whereas subjects who gave low confidence ratings most frequently had positive confidence–cRT relationships. In addition, we found a strong across-subject correlation between response time and cRT, suggesting that response speed for both the decision and the confidence rating is influenced by a common factor. Our results show that cRT is not intrinsically linked to confidence and strongly challenge several postdecisional models of confidence.

In the Bang data set, subjects (*N* = 201) indicated whether a Gabor patch was tilted clockwise or counterclockwise from vertical (Figure 2A). The data set consists of two tasks. For the coarse discrimination task, the Gabor patches were embedded in noise and tilted 45 degrees away from vertical. For the fine discrimination task, the Gabor patches were tilted about 1 degree away from vertical. The contrast in the coarse discrimination task and the tilt in the fine discrimination task varied between subjects in order to match average performance across the two tasks. Each subject completed 100 trials for each of the two tasks. Here we combined the data from both tasks.

In the Haddara1 data set, subjects (*N* = 443) saw a 7 × 7 grid consisting of the letters X and O (Task 1; Figure 2B) or the colors red and blue (Task 2). Subjects indicated which letter or color occurred more frequently. In Task 1, approximately half of the subjects received trial-by-trial feedback about whether their judgment was correct, while the other half received no such feedback. No feedback was given in Task 2. The proportion of the dominant stimulus was 31 of 49 for Task 1 and 27 of 49 for Task 2. Each subject completed 330 trials for Task 1 and 150 trials for Task 2. Here we again combined the data from both tasks and analyzed together subjects who did and did not receive trial-by-trial feedback.

In the Haddara2 data set, subjects (*N* = 75) completed seven sessions over 7 different days. Each subject completed 500 trials per day and 3,500 trials in total. Approximately half of the subjects received trial-by-trial feedback about whether their judgment was correct, while the other half received no such feedback. We again analyzed together subjects who did and did not receive trial-by-trial feedback. Note that even though Haddara1 and Haddara2 used the same task, these data sets featured different distributions of confidence biases. Because Haddara2 includes 7 days of testing, it is possible that these differences are due to practice effects. To check for this possibility, we separately analyzed the data from day 1 of Haddara2 (Supplementary Figure S4).

We excluded trials with response times (RTs) outside mean ± 3 × *SD*s or cRTs outside mean ± 3 × *SD*s before conducting any data analyses. We coded confidence ratings as scalar variables with values 1–4 when we used them for analyses.
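The trial-exclusion rule above can be sketched as follows. This is a minimal illustration under assumed data structures (the array names and the per-subject application are assumptions, not the authors' code):

```python
import numpy as np

def keep_mask(rt, crt, n_sd=3.0):
    """Boolean mask of trials to retain: both the response time (rt) and the
    confidence response time (crt) must fall within mean +/- n_sd * SD,
    computed over the given subject's trials."""
    rt, crt = np.asarray(rt, dtype=float), np.asarray(crt, dtype=float)
    keep = np.ones(rt.size, dtype=bool)
    for x in (rt, crt):
        keep &= np.abs(x - x.mean()) <= n_sd * x.std()
    return keep
```

Confidence ratings on the retained trials would then be coded as scalars 1–4 for all subsequent analyses.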

We used the slope of the linear regression of cRT on confidence (β_{cRT∼Confidence}) as an indicator of the cRT–confidence relationship for each individual. We performed linear regressions on β_{cRT∼Confidence} as a function of groups to test the effects of groups on the cRT–confidence relationship. We separately computed β_{cRT∼Confidence} in odd and even trials for each subject and correlated these values across subjects to test whether the individual differences are stable and consistent. For robustness, we also bootstrapped 100 random split-half partitions of trials for each subject and tested whether β_{cRT∼Confidence} is correlated between the two halves. We transformed the *r* values of these correlations to *z* scores, averaged the *z* scores obtained from the 100 partitions, and reported the *r* values transformed back from the averaged *z* scores.
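The split-half procedure just described can be sketched as follows (a minimal reimplementation under assumed data structures, not the authors' code): each subject's β_{cRT∼Confidence} is a regression slope, each random partition yields two slopes per subject, and the across-subject correlations are averaged in Fisher-*z* space:

```python
import numpy as np

def beta_crt_confidence(confidence, crt):
    """Slope of the linear regression of cRT on confidence ratings (1-4)."""
    return np.polyfit(confidence, crt, 1)[0]

def split_half_reliability(conf_by_subj, crt_by_subj, n_partitions=100, seed=0):
    """Bootstrapped split-half reliability of per-subject slopes.

    For each random partition, each subject's slope is computed in both halves,
    the two sets of slopes are correlated across subjects, the resulting r is
    Fisher-z transformed, and the z scores are averaged over partitions and
    transformed back to an r."""
    rng = np.random.default_rng(seed)
    zs = []
    for _ in range(n_partitions):
        half1, half2 = [], []
        for conf, crt in zip(conf_by_subj, crt_by_subj):
            order = rng.permutation(len(conf))
            a, b = order[: len(order) // 2], order[len(order) // 2:]
            half1.append(beta_crt_confidence(conf[a], crt[a]))
            half2.append(beta_crt_confidence(conf[b], crt[b]))
        zs.append(np.arctanh(np.corrcoef(half1, half2)[0, 1]))
    return np.tanh(np.mean(zs))
```

The odd/even-trial analysis is the same computation with a fixed, deterministic partition instead of random ones.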

We computed the difference in mean cRT between correct and error trials (cRT_{correct} – cRT_{error}) for each subject. We performed linear regressions on cRT_{correct} – cRT_{error} as a function of groups to test the effects of groups on the cRT–accuracy relationship and examined the cRT–accuracy relationship across different data sets. To determine the effect of accuracy on cRT at the population level, we performed paired-sample *t*-tests comparing cRT for correct and error trials. We also tested the cRT–accuracy relationship at the individual level by separately computing cRT_{correct} – cRT_{error} in odd and even trials for each subject and correlating these values across subjects (Supplementary Figure S6). We also bootstrapped 100 random split-half partitions of trials for each subject for cRT_{correct} – cRT_{error}.
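A minimal sketch of this cRT–accuracy measure and the population-level test (variable names are assumptions; SciPy's paired *t*-test stands in for the authors' software):

```python
import numpy as np
from scipy import stats

def crt_accuracy_difference(crt, correct):
    """cRT_correct - cRT_error for one subject: the difference between the
    mean confidence RT on correct trials and on error trials."""
    crt = np.asarray(crt, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    return crt[correct].mean() - crt[~correct].mean()

def population_level_test(mean_crt_correct, mean_crt_error):
    """Paired-sample t-test comparing each subject's mean cRT on correct
    versus error trials (one value per subject in each array)."""
    return stats.ttest_rel(mean_crt_correct, mean_crt_error)
```

The split-half reliability of cRT_{correct} – cRT_{error} follows the same Fisher-*z* averaging scheme used for β_{cRT∼Confidence}.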

Subjects in Group 1 (who rated the lowest confidence level the most frequently) showed a positive cRT–confidence relationship (β_{cRT∼Confidence}) (Bang: *t*(49) = 5.94, *p* = 2.9 × 10^{−7}, Cohen's *d* = .84, BF_{10} = 5.2 × 10^{4}; Haddara1: *t*(31) = 2.87, *p* = 0.007, Cohen's *d* = .51, BF_{10} = 5.70; Haddara2: *t*(19) = 4.14, *p* = 5.6 × 10^{−4}, Cohen's *d* = .93, BF_{10} = 60.61; Figure 3B), while subjects in Group 4 (who rated the highest confidence level the most frequently) showed a negative cRT–confidence relationship (Bang: *t*(58) = −5.33, *p* = 1.7 × 10^{−6}, Cohen's *d* = −.69, BF_{10} = 9.8 × 10^{3}; Haddara1: *t*(190) = −14.75, *p* = 2.5 × 10^{−33}, Cohen's *d* = −1.07, BF_{10} = 1.1 × 10^{30}; Haddara2: *t*(8) = −1.90, *p* = 0.09, Cohen's *d* = −.63, BF_{10} = 1.14). Analyzing all groups together, we found that the slope of the cRT–confidence relationship (i.e., β_{cRT∼Confidence}) decreased for the groups in which the most frequent confidence rating was higher (Bang: slope = −34.66, *t*(199) = −9.12, *p* = 8.4 × 10^{−17}, Cohen's *d* = −.64; Haddara1: slope = −35.05, *t*(440) = −10.34, *p* = 1.3 × 10^{−22}, Cohen's *d* = −.49; Haddara2: slope = −34.22, *t*(73) = −4.52, *p* = 2.4 × 10^{−5}, Cohen's *d* = −.53; Figure 3B). These results strongly support Hypothesis 2 and demonstrate that the patterns in cRT are largely determined by the identity of the most frequently chosen confidence rating.
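The across-group analysis above can be sketched as follows (a hypothetical reconstruction: the group is taken to be each subject's most frequent, i.e., modal, confidence rating, and the per-subject slopes are regressed on it):

```python
import numpy as np

def modal_rating(confidence):
    """Most frequent confidence rating (1-4) for one subject."""
    values, counts = np.unique(np.asarray(confidence), return_counts=True)
    return values[np.argmax(counts)]

def slope_over_groups(betas, groups):
    """Slope of beta_cRT~Confidence regressed on group; a negative value
    means the cRT-confidence slope decreases as the modal rating rises."""
    return np.polyfit(groups, betas, 1)[0]
```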

Hypothesis 2 makes a further prediction: among subjects who rated confidence = 1 most frequently (i.e., Group 1), subjects with higher proportions of confidence = 1 responses should exhibit larger cRT–confidence slopes (β_{cRT∼Confidence}), which is exactly what we found (Bang: *r* = .48, *p* = 4.4 × 10^{−4}; Haddara1: *r* = .44, *p* = 0.01; Haddara2: *r* = .60, *p* = 0.005; Figure 3C). Conversely, among subjects who rated confidence = 4 most frequently (i.e., Group 4), subjects with higher proportions of confidence = 4 responses should exhibit smaller cRT–confidence slopes (β_{cRT∼Confidence}), which is again what we found (Bang: *r* = −.62, *p* = 1.2 × 10^{−7}; Haddara1: *r* = −.38, *p* = 6.7 × 10^{−8}; Haddara2: *r* = −.75, *p* = 0.02). Therefore, Hypothesis 2 is further supported by these within-group analyses (note that Hypothesis 1 predicts no such correlations for any group).

In Haddara1, where the subgroup sizes were heavily imbalanced toward high confidence (Figure 4A), we found a significant negative cRT–confidence relationship at the group level (*t*(417.24) = −7.16, *p* = 3.6 × 10^{−12}, Cohen's *d* = −.35, BF_{10} = 1.5 × 10^{9}; Figure 4B). However, the other two data sets (Haddara2 and Bang) featured relatively more balanced subgroup sizes (Figure 4A), which should result in much weaker overall relationships between cRT and confidence in the whole group. Indeed, we found no significant correlation between cRT and confidence at the population level in Haddara2 (slope = 6.20, 95% CI [−12.95, 25.36], *t*(74.52) = .64, *p* = 0.52, Cohen's *d* = .07, BF_{10} = .15; Figure 4B) and a slightly positive correlation in Bang (slope = 11.78, 95% CI [1.88, 21.68], *t*(186.92) = 2.34, *p* = 0.02, Cohen's *d* = .17, BF_{10} = 1.16). These results suggest that previous reports of a population-level negative cRT–confidence relationship were likely due to most subjects having high confidence in those data sets. Indeed, this type of bias is clearly present in the Moran et al. (2015) data set (see Figure 4 in that article) and in the Herregods et al. (2023) data set (see Figure 7 in that article). These results demonstrate that the group-level cRT–confidence relationship is not fixed and depends on the overall level of bias toward low- or high-confidence responses in each data set.

The difference in cRT between correct and error trials (cRT_{correct} – cRT_{error}) became smaller for the groups for which the most frequent confidence rating was higher (Bang: slope = −18.80, *t*(199) = −4.24, *p* = 3.4 × 10^{−5}, Cohen's *d* = −.30; Haddara1: slope = −11.89, *t*(441) = −4.30, *p* = 2.1 × 10^{−5}, Cohen's *d* = −.20; Haddara2: slope = −11.80, *t*(73) = −4.11, *p* = 1.0 × 10^{−4}, Cohen's *d* = −.48; Figure 5A). In addition, similarly to the group-level cRT–confidence relationship (Figure 4B), we found that cRT was lower for correct than for error trials in Haddara1 (*t*(442) = −11.67, *p* = 1.3 × 10^{−27}, Cohen's *d* = −.55, BF_{10} = 2.2 × 10^{24}; Figure 5B) but not in the other two data sets (Bang: *t*(200) = −.13, *p* = 0.90, Cohen's *d* = −.009, BF_{10} = .07; Haddara2: *t*(74) = −.62, *p* = 0.54, Cohen's *d* = −.07, BF_{10} = .15). These results show that, just as with the cRT–confidence relationship, the cRT–accuracy relationship is driven by each subject's confidence bias (i.e., the frequency with which they choose each confidence rating).

Finally, we found a strong positive across-subject correlation between RT and cRT (Bang: *r* = .69, *p* = 2.1 × 10^{−29}, BF_{10} = 1.5 × 10^{26}; Haddara1: *r* = .59, *p* = 2.8 × 10^{−43}, BF_{10} = 7.5 × 10^{39}; Haddara2: *r* = .41, *p* = 3.0 × 10^{−4}, BF_{10} = 76.57; Figure 6). These results suggest that the same factor contributes to response speed for both the decision and the confidence rating.

*Journal of Experimental Psychology: General*, 148(3), 437.
*Journal of Experimental Psychology: Human Perception and Performance*, 24(3), 929.
*Journal of Neuroscience*, 35(8), 3478–3484.
*Cognitive Psychology*, 57(3), 153–178.
*Psychological Science*, 29(5), 761–778.
*eLife*, 10, e67556.
*Journal of Neuroscience*, 33(4), 1400–1410.
*Behavior Research Methods, Instruments, & Computers*, 30(1), 146–156.
*Psychological Review*, 124(1), 91.
*Philosophical Transactions of the Royal Society B: Biological Sciences*, 367(1594), 1280–1286.
*Signal detection theory and psychophysics* (Vol. 1, pp. 1969–2012). New York, NY: Wiley.
*Psychological Science*, 33(2), 259–275.
*Psychological Review*, 119(1), 186.
*Quarterly Journal of Experimental Psychology*, 65(5), 865–886.
*Neuroscience of Consciousness*, 2016(1), niw002.
*Visual Cognition*, 9(45), 477–501.
*Metacognition: Knowing about knowing*. Cambridge, MA: MIT Press.
*Journal of Experimental Psychology: Human Perception and Performance*, 24(5), 1521.
*Cognitive Psychology*, 78, 99–147.
*Acta Psychologica*, 35(4), 316–327.
*Psychology of Learning and Motivation*, 26, 125–141.
*Proceedings of the National Academy of Sciences*, 117(15), 8382–8390.
*Psychological Review*, 117(3), 864.
*Perspectives on Psychological Science*, 17(6), 1746–1765.
*Nature Human Behaviour*, 4(3), 317–325.
*Journal of Neuroscience*, 31(29), 10741–10748.
*Neural Computation*, 20(4), 873–922.
*Psychological Review*, 111(2), 333.
*Attention, Perception, & Psychophysics*, 80(1), 134–154.
*Psychological Review*, 128(1), 45.
*Consciousness and Cognition*, 9(2), 313–323.
*Philosophical Transactions of the Royal Society B: Biological Sciences*, 367(1594), 1310–1321.
*Journal of Experimental Psychology: General*, 144(2), 489.