**Simple visual features, such as orientation, are thought to be represented in the spiking of visual neurons using population codes. I show that optimal decoding of such activity predicts characteristic deviations from the normal distribution of errors at low gains. Examining human perception of orientation stimuli, I show that these predicted deviations are present at near-threshold levels of contrast. The findings may provide a neural-level explanation for the appearance of a threshold in perceptual awareness whereby stimuli are categorized as seen or unseen. As well as varying in error magnitude, perceptual judgments differ in certainty about what was observed. I demonstrate that variations in the total spiking activity of a neural population can account for the empirical relationship between subjective confidence and precision. These results establish population coding and decoding as the neural basis of perception and perceptual confidence.**

*SD* of Gaussian envelope, 0.75°) presented at display center on a gray background. Stimuli were presented within an annulus (white, radius 4°), which was always present on the display.

*V* statistic for nonuniformity of circular data. Recall precision was defined as $1/\sigma^2$, where $\sigma^2$ is defined in terms of the resultant length $R$. Hypotheses regarding the effects of experimental parameters (contrast, subjective confidence rating) were tested with *t* tests.
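The precision measure can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes the common circular-statistics convention $\sigma^2 = -2 \ln R$, which the text here does not confirm, and the function names are hypothetical.

```python
import math

def resultant_length(angles):
    """Mean resultant length R of angles in radians (1 = identical, near 0 = spread out)."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    return math.hypot(c, s)

def circular_precision(angles):
    """Precision 1/sigma^2, ASSUMING sigma^2 = -2 ln R (an assumption, see lead-in)."""
    r = resultant_length(angles)
    if r <= 0.0:
        return 0.0        # no concentration at all
    if r >= 1.0:
        return math.inf   # all angles identical
    return 1.0 / (-2.0 * math.log(r))
```

Under this convention, tightly clustered errors give $R$ close to 1 and a large precision, whereas near-uniform errors give $R$ close to 0 and a precision approaching zero.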

$M$ idealized neurons with orientation tuning and contrast sensitivity. The average response of the $i$th neuron to visual input was defined following Albrecht & Hamilton (1982), Carandini & Heeger (2012), and Heeger (1992), where $\theta$ is the stimulus orientation, $c$ is the stimulus contrast, $f_i(\theta)$ is a Von Mises tuning function centered on $\varphi_i$, the neuron's preferred orientation, and $\gamma$ is the population gain. Preferred orientations were evenly distributed throughout the range of possible orientations. Spiking activity was modeled as a homogeneous Poisson process, so that the number of spikes $n$ generated by a neuron in time $T$ was Poisson distributed with mean equal to its average response multiplied by $T$.
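The encoding stage described above might be simulated as in the sketch below. Several choices are assumptions rather than details taken from the text: the contrast response is given a Naka-Rushton form with constants `sigma` and `alpha`, the tuning weights are normalized so that `gamma` equals the summed population rate at saturating contrast, orientations are mapped onto $[-\pi, \pi)$, and all function names are hypothetical.

```python
import math
import random

def contrast_response(c, sigma, alpha):
    """Saturating contrast response in [0, 1); the Naka-Rushton form is an ASSUMPTION."""
    if c <= 0.0:
        return 0.0
    return 1.0 / (1.0 + (sigma / c) ** alpha)

def mean_rates(theta, c, M, gamma, kappa, sigma, alpha):
    """Mean firing rate (Hz) of M neurons with evenly spaced preferred orientations."""
    prefs = [-math.pi + 2.0 * math.pi * i / M for i in range(M)]
    # Von Mises tuning, written relative to its peak for numerical stability.
    weights = [math.exp(kappa * (math.cos(theta - phi) - 1.0)) for phi in prefs]
    # ASSUMPTION: gamma is the summed population rate at saturating contrast.
    scale = gamma * contrast_response(c, sigma, alpha) / sum(weights)
    return prefs, [scale * w for w in weights]

def poisson_sample(lam, rng):
    """Knuth's multiplicative Poisson sampler; adequate for small means."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def spike_counts(rates, T, rng):
    """Independent Poisson spike counts over a window of T seconds."""
    return [poisson_sample(r * T, rng) for r in rates]
```

A call such as `spike_counts(mean_rates(0.0, 0.2, 100, 145.0, 2.4, 0.096, 2.0)[1], 0.1, random.Random(0))` returns one simulated population response for a 100 ms window; the parameter values in this call are illustrative only.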

$\hat{\theta} = \theta_{\text{MAP}} \oplus \beta$, where $\beta$ is a response bias term, and $\oplus$ indicates addition on the circle. Decoding time $T$ was fixed at 100 ms. I considered the limit $M \to \infty$. The model therefore has five free parameters: $\sigma$ and $\alpha$, constants of the contrast response function; $\gamma$, the population gain; $\kappa$, the tuning curve width; and $\beta$, the response bias.

$\varphi_{(1)}, \varphi_{(2)}, \ldots, \varphi_{(m)}$, where the notation $\varphi_{(i)}$ indicates the preferred orientation of the neuron that generated the $i$th of $m$ spikes. The error in the decoded orientation, $\Delta\theta_{\text{MAP}} = \theta_{\text{MAP}} \ominus \theta$, can then be written in terms of the offsets $\varepsilon_i = \varphi_{(i)} \ominus \theta$. The solution is obtained by setting the derivative of the term to be maximized to zero.
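The decoding step can be illustrated with a short sketch. It assumes that the term to be maximized reduces to $\sum_i \cos(\varphi_{(i)} - \theta')$, as would be expected for Von Mises tuning with densely and uniformly spaced preferred orientations and a flat prior; setting its derivative to zero then yields the resultant angle of the spike-labelled preferred orientations. The function names are hypothetical.

```python
import math

def circ_diff(a, b):
    """Circular subtraction a (-) b: signed difference wrapped to [-pi, pi)."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi

def map_estimate(spike_prefs):
    """Resultant angle of the preferred orientations attached to each spike.

    Returns None when there are no spikes, because the estimate is then undefined.
    """
    if not spike_prefs:
        return None
    s = sum(math.sin(phi) for phi in spike_prefs)
    c = sum(math.cos(phi) for phi in spike_prefs)
    return math.atan2(s, c)
```

For example, `circ_diff(map_estimate([0.3, 0.1, 0.25]), 0.2)` gives the decoding error for three spikes when the true orientation is 0.2 rad.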

$M$ neurons by a continuous uniform distribution, this probability is given by […]. $\theta_{\text{MAP}}$ is the resultant angle (Equation 11) of a Von Mises (circular normal) random walk (Equation 12) of $m$ steps. It follows that the error for a given resultant length $r$ is Von Mises distributed (Mardia & Jupp, 2009), with the distribution of $r$ for $m$ steps expressed in terms of $r\psi_m(r)$, the probability density function for resultant length $r$ of a uniform random walk of $m$ steps. The distribution of $m$, the total spike count during the decoding interval $T$, being a sum of $M$ independent Poisson distributions, is itself Poisson, $\Pr(m) = \xi^m e^{-\xi} / m!$, where $\xi$ is the expected total spike count.
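The generative process behind this derivation can be sketched by direct sampling: draw a total spike count from a Poisson distribution with mean $\xi$, take that many Von Mises distributed steps, and read off the resultant angle. This is an illustrative simulation rather than the analytic method described in the text. Treating zero-spike trials as uniform random guesses is an assumption, and the function names are hypothetical.

```python
import math
import random

def poisson_sample(lam, rng):
    """Knuth's multiplicative Poisson sampler; adequate for modest means."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def sample_decoding_error(xi, kappa, rng):
    """One simulated decoding error (radians) for expected spike count xi."""
    m = poisson_sample(xi, rng)
    if m == 0:
        # ASSUMPTION: with no spikes the estimate is a uniform random guess.
        return rng.uniform(-math.pi, math.pi)
    s = c = 0.0
    for _ in range(m):
        eps = rng.vonmisesvariate(0.0, kappa)  # offset of one spike's neuron
        s += math.sin(eps)
        c += math.cos(eps)
    return math.atan2(s, c)
```

Because the spike count varies from trial to trial, the pooled errors are a mixture of distributions with different widths, which is the source of the long tails at low $\xi$.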

$\theta_{\text{MAP}}$ and hence of the response error $\Delta\theta = \hat{\theta} \ominus \theta = \Delta\theta_{\text{MAP}} \oplus \beta$, for any values of the free parameters $\sigma$, $\alpha$, $\gamma$, $\kappa$, and $\beta$. For $m \le 100$, the density $\psi_m(r)$ was approximated by Monte Carlo simulation, discretizing over $10^3$ bins. For larger $m$, a Gaussian approximation to Equation 14 was used (Mardia & Jupp, 2009).
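A Monte Carlo approximation of this kind might look like the sketch below, which builds a histogram estimate of the probability density of the resultant length $r$ for a uniform random walk of $m$ steps. Since the text identifies $r\psi_m(r)$ as that density, $\psi_m(r)$ could then be recovered by dividing each bin by its central value of $r$. The function name, the number of simulated walks, and the binning over $[0, m]$ are assumptions for illustration.

```python
import math
import random

def resultant_length_density(m, n_walks, n_bins, rng):
    """Histogram estimate of the pdf of resultant length r after m uniform steps.

    Returns (bin_width, density), with bins spanning r in [0, m].
    """
    width = m / n_bins
    counts = [0] * n_bins
    for _ in range(n_walks):
        s = c = 0.0
        for _ in range(m):
            a = rng.uniform(-math.pi, math.pi)
            s += math.sin(a)
            c += math.cos(a)
        r = math.hypot(s, c)
        # Clamp so that r == m (or rounding just above it) lands in the last bin.
        counts[min(int(r / width), n_bins - 1)] += 1
    return width, [k / (n_walks * width) for k in counts]
```

For instance, `resultant_length_density(5, 100000, 1000, random.Random(0))` estimates the density for five spikes using the $10^3$ bins mentioned in the text.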

*fminsearch* in MATLAB). Note that, as a mixture of normal distributions of different widths, the distribution of error is not, in general, itself normal.

$M$ = 100 neurons, $10^5$ repetitions per subject and contrast) using parameters obtained by fitting the model to the experimental data. Note that previous work (Bays, 2014) has shown 100 neurons to be sufficient to approximate the large population limit $M \to \infty$; simulating larger numbers of neurons would not have changed the results. Simulated trials were split into two equal bins, according to either the precision of the posterior distribution $p(\theta \mid \mathbf{n})$ or the total spike count.

$T$ in response to a stimulus of contrast $c$ is […]

$p$, or not seen, with probability $(1 - p)$, where $p$ depends on stimulus contrast. Seen stimuli are reported with circular normal (Von Mises) distributed error with *SD* $\sigma_{\text{seen}}$ and bias $\beta$. When the stimulus is not seen, the response is random (i.e., drawn from a uniform distribution). The result is a mixture of a Von Mises and a uniform distribution, where $\mathrm{VM}(\theta, \mu, \sigma)$ denotes the Von Mises distribution evaluated at $\theta$ with mean $\mu$ and *SD* $\sigma$. This resulted in a model with six free parameters: $\sigma_{\text{seen}}$, $\beta$, $p_{50\%}$, $p_{100\%}$, $p_{200\%}$, and $p_{400\%}$. Models were compared using the Akaike information criterion with finite data correction (AICc) and the Bayesian information criterion (BIC).
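The threshold model's density and the two model selection criteria can be sketched as follows. Two simplifications are assumptions: the Von Mises component is parameterized directly by a concentration `kappa_seen` rather than by the *SD* $\sigma_{\text{seen}}$ used in the text, and the response space is taken to be a circle of circumference $2\pi$. The AICc and BIC expressions are the standard textbook forms, and the function names are hypothetical.

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    term = total = 1.0
    q = (x / 2.0) ** 2
    for k in range(1, 200):
        term *= q / (k * k)
        total += term
        if term < 1e-16 * total:
            break
    return total

def von_mises_pdf(theta, mu, kappa):
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def threshold_mixture_pdf(theta, p_seen, beta, kappa_seen):
    """Seen with probability p_seen (Von Mises error), otherwise a uniform guess."""
    return p_seen * von_mises_pdf(theta, beta, kappa_seen) + (1.0 - p_seen) / (2.0 * math.pi)

def aicc(log_lik, k, n):
    """Akaike information criterion with finite-sample correction (k parameters, n data points)."""
    return 2.0 * k - 2.0 * log_lik + 2.0 * k * (k + 1) / (n - k - 1)

def bic(log_lik, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2.0 * log_lik
```

Lower values of either criterion indicate the preferred model; BIC penalizes additional parameters more heavily than AICc once $n$ is moderately large.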

$\sigma$ of the initial normal representation. This model therefore had seven free parameters: $\sigma_{50\%}$, $\sigma_{100\%}$, $\sigma_{200\%}$, $\sigma_{400\%}$, $\beta$, $\kappa$, and $\xi$.

$\eta$. In this case, the response of the $i$th neuron is given by Equation 22, and Equations 2–4 hold as before. The model of detection is the same as above except that the no-stimulus epoch contained spikes generated at the baseline rate $\eta$, and activity in the stimulus epoch was given by Equation 22.

*V* > 6.9; *t*(7) > 2.6, *p* < 0.032. Significant deviations from circular normality were evident as long tails in the error distribution at detection threshold (100%): circular kurtosis was 2.7 greater than that of a circular normal distribution with matched variance, *t*(7) = 2.8, *p* = 0.026, and the deviation was also present in eight out of eight subjects considered individually. Figure 2b plots the discrepancy between the error distributions generated by observers and a circular normal distribution with the same variance.

$\beta$ = −0.050 rad ± 0.028 rad, tuning width $\kappa$ = 2.40 ± 0.58, population gain $\gamma$ = 145 Hz ± 92 Hz, contrast response parameters $\alpha$ = 48.2 ± 16.6, $\sigma$ = 0.096 ± 0.0081; goodness of fit: $r^2$ = 0.64 ± 0.14 *SD*). The model reproduced both the changes in distribution width with contrast and, importantly, the non-normality of errors around detection threshold. Figure 3 plots response precision as a function of contrast for experimental data (black symbols) and the fitted model (black line).

*t*(7) = 0.19, *p* = 0.86.

$\beta$ = −0.044 rad ± 0.027 rad, variability $\sigma_{\text{seen}}$ = 0.43 ± 0.044, probability seen $p_{50\%}$ = 0.048 ± 0.016, $p_{100\%}$ = 0.59 ± 0.11, $p_{200\%}$ = 0.89 ± 0.08, $p_{400\%}$ = 0.98 ± 0.013). The threshold model was a poorer fit to the experimental data according to model selection criteria (ΔAICc = 12.6; ΔBIC = 43.5).

$\beta$ = −0.062 rad ± 0.023, tuning width $\kappa$ = 17.2 ± 6.1, population activity $\xi$ = 13.5 ± 9.4, normal *SD* $\sigma_{50\%}$ = 4.0 ± 0.94, $\sigma_{100\%}$ = 1.7 ± 0.74, $\sigma_{200\%}$ = 0.52 ± 0.14, $\sigma_{400\%}$ = 0.48 ± 0.11). The two-stage model was a substantially poorer fit to the experimental data than the population coding model (ΔAICc = 327; ΔBIC = 390).

$r^2$ = 0.03, *t*(7) = 1.0, *p* = 0.34; 100% contrast, $r^2$ = 0.17, *t*(7) = 4.6, *p* = 0.003; 200% contrast, $r^2$ = 0.07, *t*(7) = 3.0, *p* = 0.020; 400% contrast, $r^2$ = 0.05, *t*(7) = 3.6, *p* = 0.009.

*MSE* 0.073 ± 0.023).

$r^2$ = 0.42). A median split based on total spike count (dashed lines in Figure 3) replicated the behavioral results with a quality statistically indistinguishable from that of the split based on posterior precision, *MSE* 0.062 ± 0.015, *t*(7) = 1.3, *p* = 0.23.
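A median split on total spike count can be sketched in simulation as below. The sketch is illustrative only: it samples decoding errors from a Von Mises random walk whose length is Poisson distributed, treats zero-spike trials as uniform guesses, and measures precision as $1/\sigma^2$ with $\sigma^2 = -2 \ln R$. All three choices, along with the function names and parameter values, are assumptions rather than the procedure used in the study.

```python
import math
import random

def poisson_sample(lam, rng):
    """Knuth's multiplicative Poisson sampler; adequate for modest means."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate_trial(xi, kappa, rng):
    """Return (total spike count, decoding error in radians) for one trial."""
    m = poisson_sample(xi, rng)
    if m == 0:
        return 0, rng.uniform(-math.pi, math.pi)  # ASSUMPTION: guess when silent
    s = c = 0.0
    for _ in range(m):
        eps = rng.vonmisesvariate(0.0, kappa)
        s += math.sin(eps)
        c += math.cos(eps)
    return m, math.atan2(s, c)

def precision(errors):
    """1/sigma^2 with sigma^2 = -2 ln R (assumed convention)."""
    n = len(errors)
    r = math.hypot(sum(math.cos(e) for e in errors) / n,
                   sum(math.sin(e) for e in errors) / n)
    return 1.0 / (-2.0 * math.log(r))

def median_split_precision(xi, kappa, n_trials, rng):
    """Precision of the low and high spike-count halves of the simulated trials."""
    trials = sorted((simulate_trial(xi, kappa, rng) for _ in range(n_trials)),
                    key=lambda t: t[0])
    half = n_trials // 2
    low = [e for _, e in trials[:half]]
    high = [e for _, e in trials[half:]]
    return precision(low), precision(high)
```

In such a simulation the high-count half is expected to show the higher precision, mirroring the relationship between confidence and precision that the spike-count split is intended to capture.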

$\eta$. This model is considerably less analytically tractable than the no-baseline ($\eta = 0$) model, and numerically fitting it to the experimental data is impractical. However, the predictions of the model share all the main characteristics of the no-baseline case. To illustrate the similarity, I considered the case $\eta$ = 1 Hz. Taking as a starting point the ML parameters of the no-baseline model for a representative observer, I used a grid search (10 × 10 parameter space, $10^5$ repetitions, $M$ = 100) to seek new values of $\kappa$ and $\gamma$ for which the baseline model approximated the predictions of the no-baseline model. As shown in Figure 6 and consistent with previous results (Bays, 2014), the baseline model generated predictions that were almost indistinguishable from those of the no-baseline model but at higher gain ($\gamma$ = 41.7 Hz, compared to 28.8 Hz in the no-baseline case) and based on broader tuning curves ($\kappa$ = 1.30, compared to 2.12).

$\eta$ = 1 Hz was negligible (*p* < 0.0001). This demonstrates that guessing is not critical to generating the non-normal distributions of error observed here but is rather an artifact of the simplified neuronal model lacking baseline activity.
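A back-of-envelope check suggests why silent trials become so rare once baseline activity is included. The sketch assumes that $\eta$ is a per-neuron rate, and takes $M$ = 100 neurons and $T$ = 100 ms from the simulation settings; the function name is hypothetical. Baseline firing alone then contributes an expected 10 spikes per decoding window.

```python
import math

def prob_no_spikes(rate_hz_per_neuron, n_neurons, window_s):
    """P(zero spikes) for independent Poisson neurons firing at a common rate."""
    expected = rate_hz_per_neuron * n_neurons * window_s
    return math.exp(-expected)

# eta = 1 Hz per neuron (ASSUMPTION), M = 100 neurons, T = 0.1 s
p_zero = prob_no_spikes(1.0, 100, 0.1)  # exp(-10), roughly 4.5e-5
```

This value is an upper bound on the zero-spike probability, because any stimulus-driven activity adds to the expected count.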

$\eta$. I used Monte Carlo simulation (discretizing contrast into 100 bins; $10^5$ repetitions, $M$ = 100) to estimate the threshold contrast, which again closely approximated the empirical threshold (101% of empirical value for the representative observer). Although in the no-baseline case all errors were due to guesses when no spikes occurred during the stimulus epoch, in the baseline model these trials occurred with negligible frequency (*p* < 0.0001), providing further evidence that guessing is not a critical element of the population coding model.

*appearance* of a threshold in human perception because the long-tailed error distribution observed at low contrasts resembles a mixture of guessing and accurate judgments.

*Journal of Neurophysiology*, 48(1), 217–237.

*Psychological Review*, 67(1), 1–15.

*Journal of Neuroscience*, 34(10), 3632–3645.

*Proceedings of the National Academy of Sciences, USA*, 108(11), 4423–4428.

*Neural Computation*, 10(7), 1731–1757.

*Nature Reviews Neuroscience*, 13(1), 51–62.

*Nature Neuroscience*, 2(8), 740–745.

*Statistical analysis of circular data*. Cambridge, UK: Cambridge University Press.

*Vision Research*, 47(1), 85–107.

*Nature Communications*, 3, 1229.

*Journal of Neuroscience*, 2(11), 1527–1537.

*Neuron*, 88(4), 819–831.

*Psychological Review*, 120(3), 472–469.

*Signal detection theory and psychophysics*. New York: Wiley.

*Visual Neuroscience*, 9(2), 181–197.

*Journal of Neurophysiology*, 104(1), 539–547.

*Journal of the Optical Society of America A*, 17(11), 1899–1917.

*Nature Neuroscience*, 9(5), 690–696.

*Nature*, 455(7210), 227–231.

*Science*, 324(5928), 759–764.

*Vision Research*, 39(16), 2729–2737.

*Psychological Review*, 70(1), 61–79.

*Nature Neuroscience*, 9(11), 1432–1438.

*Directional statistics* (Vol. 494). New York: John Wiley & Sons.

*JOSA A*, 2(9), 1508–1532.

*Nature Reviews Neuroscience*, 1(2), 125–132.

*Annual Review of Neuroscience*, 26(1), 381–410.

*Journal of Computational Neuroscience*, 1(1), 89–107.

*Psychological Science*, 15(11), 720–728.

*Proceedings of the National Academy of Sciences, USA*, 90(22), 10749–10753.

*Journal of Mathematical Psychology*, 32(2), 135–168.

*Nature Neuroscience*, 4(3), 304–310.

*Psychological Review*, 68(5), 301–340.

*Proceedings of the National Academy of Sciences, USA*, 109(22), 8780–8785.

*Acta Psychologica*, 50(2), 179–197.

*Biological Cybernetics*, 64(1), 25–31.

*Nature Neuroscience*, 18(10), 1509–1517.

*Network: Computation in Neural Systems*, 13(4), 447–456.