Neural population activity in sensory cortex informs our perceptual interpretation of the environment. Often, this population activity supports multiple alternative interpretations; the more broadly probability is spread over these alternatives, the more uncertain the selected interpretation. We tested the hypothesis that the reliability of perceptual interpretations can be read out through simple transformations of sensory population activity. We recorded V1 population activity in fixating macaques while presenting oriented stimuli under different levels of nuisance variability and signal strength. We developed a decoding procedure to infer from V1 activity the most likely stimulus orientation as well as the certainty of this estimate. Our analysis shows that response magnitude, response dispersion, and variability in response gain all offer useful proxies for orientation certainty. Of these three metrics, gain variability has the strongest association with the decoder’s uncertainty estimates. These results show that sensory population activity provides downstream circuits with multiple options to assess the reliability of perceptual interpretations.

We recorded from two rhesus macaques (*Macaca mulatta*, both 7 years old at the time of recording). Subjects were implanted with a titanium chamber (Adams, Economides, Jocson, Parker, & Horton, 2011), which enabled access to V1. All procedures were approved by the University of Texas Institutional Animal Care and Use Committee and conformed to National Institutes of Health standards. Extracellular recordings were made with one or two 32-channel S probes (Plexon), advanced mechanically into the brain with Thomas recording microdrives. Spikes were sorted with the offline spike-sorting algorithm Kilosort2 (Pachitariu, Steinmetz, Kadir, Carandini, & Harris, 2016), followed by manual curation with the “phy” user interface (https://github.com/kwikteam/phy). An example snippet of neural activity is shown in Figure 1B.

We quantified each unit’s orientation selectivity with an orientation selectivity index (OSI),

\[\mathrm{OSI} = \frac{\left| \sum_j R_j \, e^{2 i \theta_j} \right|}{\sum_j R_j},\]

where \(R_j\) is the mean firing rate and \(\theta_j\) the orientation of the *j*th stimulus. The OSI values shown in Figure 1D were calculated directly from the observed responses. We next studied the impact of our stimulus manipulations on the units’ orientation tuning. For this analysis, we fit four circular Gaussian functions (one per stimulus family) to the responses of each unit. The four Gaussians shared the same preferred orientation across stimulus families but could differ in amplitude and bandwidth. The changes in response amplitude and selectivity reported in the Results section were calculated from these fitted functions. Changes in gain variability with stimulus manipulations were assessed by fitting the modulated Poisson model (Goris, Movshon, & Simoncelli, 2014) separately for each unit and stimulus family. Finally, we asked how our stimulus manipulations affected statistical response dependencies among pairs of neurons. For each pair of simultaneously recorded neurons, we estimated their “noise correlation” as the Pearson correlation between their responses after removing the effects of stimulus condition on response mean and standard deviation by *z*-scoring the responses.
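The noise-correlation computation can be sketched as follows. This is a minimal reimplementation, not the authors’ code; it assumes a trials × neurons response matrix and a per-trial condition label.

```python
import numpy as np

def noise_correlations(responses, conditions):
    """Pairwise 'noise correlations': z-score responses within each
    stimulus condition to remove the condition's effect on response
    mean and standard deviation, then correlate the residuals."""
    z = np.zeros(responses.shape, dtype=float)
    for c in np.unique(conditions):
        idx = conditions == c
        r = responses[idx].astype(float)
        sd = r.std(axis=0, ddof=1)
        sd[sd == 0] = 1.0                      # guard against silent units
        z[idx] = (r - r.mean(axis=0)) / sd
    return np.corrcoef(z.T)                    # (n_neurons, n_neurons)
```

The within-condition z-scoring matters: two neurons with similar tuning show strongly correlated raw responses across conditions even when their trial-by-trial fluctuations are independent.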

This bias-corrected estimator is only valid when the number of trials per stimulus condition, *T*, exceeds roughly half the number of units, *N*: *T* > (*N* + 2)/2. For some of our populations, this requirement was not met. To circumvent this violation, we combined stimulus conditions whose orientations differed by 180°, since these differed only in their drift direction. For each population, we summarized the Fisher information (FI) per stimulus family by computing the median across stimulus orientations. Finally, to assess the impact of noise correlations, we compared FI with shuffled FI, calculated using the method proposed by Kanitscheider et al. (2015a).
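The shuffle control amounts to permuting trials within each stimulus condition independently per neuron, which preserves single-unit tuning but destroys across-neuron covariation. A minimal sketch of just that shuffling step (the full estimator follows Kanitscheider et al., 2015a):

```python
import numpy as np

def shuffle_within_condition(responses, conditions, rng):
    """Destroy noise correlations while preserving single-unit tuning:
    permute the trial order independently for each neuron, separately
    within every stimulus condition. Each unit's per-condition response
    statistics are unchanged; only across-neuron alignment is broken."""
    shuffled = responses.copy()
    for c in np.unique(conditions):
        idx = np.flatnonzero(conditions == c)
        for n in range(responses.shape[1]):
            shuffled[idx, n] = responses[rng.permutation(idx), n]
    return shuffled
```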

Each unit’s orientation tuning was generated by an oriented filter (Goris et al., 2015), characterized by its preferred orientation, \(\theta_o\); spatial aspect ratio, α; derivative order, *b*; and directional selectivity, *d* ∈ [0, 1]. The function sgn(·) computes the sign of its argument. Because spatial frequency was not systematically varied in our stimulus set, it is not possible to uniquely determine both α and *b* from the neural responses we observed (Goris et al., 2015). We therefore set the derivative order to 2 unless the best-fitting aspect ratio reached an upper limit of 5; more extreme values correspond to spatial receptive fields that are atypically elongated for V1 (Goris et al., 2015). The filter’s stimulus response, *f*(*S*), was computed as the dot product of the filter and stimulus profiles in the orientation domain.

We obtained the mean firing rate, μ(*S*) (Equation 4), by subjecting the filter output to divisive normalization and passing the resulting signal through a power-law nonlinearity. This step also includes two sources of spontaneous discharge (one simply adds to the stimulus drive; the other is suppressed by stimuli that fail to excite the neuron) and a scaling operation. Here, \(e_1\) and \(e_2\) control the spontaneous discharge, γ the response amplitude, and *q* the transduction nonlinearity, while a stimulus-independent constant, β, and the aggregated stimulus drive of a pool of neighboring neurons, \(\sum_j f_j(S)\), provide the normalization signal.

We assumed that the normalization signal fluctuates from trial to trial with standard deviation \(\sigma_{\epsilon}\), making the response gain a random variable (Goris et al., 2014; Hénaff et al., 2020). Under these assumptions, the spike count variance, \(\sigma_N^2\), is given by

\[\sigma_N^2 = \mu \Delta t + \sigma_G^2 \, (\mu \Delta t)^2,\]

where Δ*t* is the size of the counting window and \(\sigma_G\) is the standard deviation of the response gain (Equation 6), which is determined by the strength of the normalization noise.
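The modulated Poisson mean–variance relation, \(\sigma_N^2 = \mu \Delta t + \sigma_G^2 (\mu \Delta t)^2\), is easy to verify by simulating the model as a gamma–Poisson mixture, in which the trial’s gain is gamma distributed with unit mean and variance \(\sigma_G^2\). A sketch under that standard parameterization (not the authors’ fitting code; function name hypothetical):

```python
import numpy as np

def modulated_poisson_counts(mu_dt, sigma_g, n_trials, rng):
    """Sample spike counts from the modulated Poisson model: each trial's
    gain is gamma distributed with mean 1 and variance sigma_g**2, and the
    count is Poisson with rate gain * mu_dt. The resulting count variance
    is mu_dt + sigma_g**2 * mu_dt**2 (super-Poisson whenever sigma_g > 0)."""
    gains = rng.gamma(shape=1.0 / sigma_g**2, scale=sigma_g**2, size=n_trials)
    return rng.poisson(gains * mu_dt)
```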

In total, the model has 11 free parameters: 4 filter parameters (preferred orientation \(\theta_o\), spatial aspect ratio α, derivative order *b*, and directional selectivity *d*), 4 parameters controlling response range and amplitude (constant β, scalar γ, and maintained discharge \(e_1\) and \(e_2\)), 1 parameter for the nonlinearity (exponent *q*), 1 parameter for the normalization noise (\(\sigma_{\epsilon}\)), and 1 final parameter that controls the degree to which the normalization signal depends on stimulus dispersion. We computed the model prediction for every trial and used a Bayesian optimization algorithm to find the best-fitting parameters (Acerbi & Ma, 2017). An example model fit is shown in Figure 2A. To assess the goodness of fit, we computed the Pearson correlation between predicted and observed response mean and variance across all stimulus conditions (Figure 2B). Units were excluded from further analysis if the Pearson correlation fell below 0.5 for the response mean or below 0.2 for the response variance. In total, 352 of 378 candidate units (93.1%) met this criterion.

To decode stimulus orientation, we modeled the spike counts {\(K_i\)} realized during a window of length Δ*t* with a negative binomial distribution (Goris et al., 2014):

\[P(K_i = k) = \frac{\Gamma\!\left(k + \sigma_{G,i}^{-2}\right)}{\Gamma(k + 1)\,\Gamma\!\left(\sigma_{G,i}^{-2}\right)} \left(\frac{\sigma_{G,i}^2 \, \mu_i}{1 + \sigma_{G,i}^2 \, \mu_i}\right)^{k} \left(\frac{1}{1 + \sigma_{G,i}^2 \, \mu_i}\right)^{\sigma_{G,i}^{-2}},\]

where the mean \(\mu_i = \mu_i(S)\,\Delta t\) and the gain variability \(\sigma_{G,i}\) are given by the stochastic normalization model (Equations 4 and 6).

Each stimulus *S* is characterized by its orientation, \(\theta_S\), its spatial contrast, \(c_S\), and its orientation dispersion, \(\sigma_S\). To obtain the orientation likelihood function for a given trial, we first calculated the likelihood of each possible parameter combination on a finely sampled 3D grid (orientation: [0:0.5:360]°; contrast: [0.07:0.05:1.4]; dispersion: [1:3.4:99]°). We then marginalized across the contrast and dispersion dimensions, yielding the orientation likelihood function (examples shown in Figure 2D). This function typically appeared Gaussian in shape (the Pearson correlation coefficient between the likelihood function and the best-fitting Gaussian was on average *r* = 0.985). We therefore used the peak of the best-fitting Gaussian as the point estimate of stimulus orientation; the width of the Gaussian defines the uncertainty of this estimate. Alternative ways of quantifying both statistics that did not involve fitting a Gaussian function yielded highly similar values.
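The marginalization and summary steps can be sketched as follows. This is a minimal reimplementation with hypothetical names; it uses moment-based estimates rather than an explicit Gaussian fit (which, per the text, yields highly similar values) and ignores circular wraparound for simplicity.

```python
import numpy as np

def orientation_likelihood_summary(like_3d, ori_grid):
    """Marginalize a (orientation x contrast x dispersion) likelihood grid
    over the two nuisance dimensions, then summarize the resulting
    orientation likelihood by its peak (the point estimate) and its
    standard deviation (the uncertainty of that estimate)."""
    like_ori = like_3d.sum(axis=(1, 2))        # sum out contrast, dispersion
    like_ori = like_ori / like_ori.sum()       # normalize to a distribution
    peak = ori_grid[np.argmax(like_ori)]
    mean = np.sum(ori_grid * like_ori)
    width = np.sqrt(np.sum((ori_grid - mean) ** 2 * like_ori))
    return peak, width
```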

We considered three metrics that downstream circuits could compute directly from population activity: response magnitude, \(R_M\) (which indicates certainty, the inverse of uncertainty); response dispersion, \(R_D\); and cross-neuron gain variability, \(R_G\). In the expressions for these metrics, *M* is the population size; \(\overline{\sigma_{\epsilon}}\), \(\overline{q}\), and \(\overline{\beta}\) are the mean values of the corresponding parameter estimates across all units (Equations 4 and 6); and \(s_n\) is a fixed scalar used to relate the observed response magnitude to the unobserved stimulus-driven component of the normalization signal (its value was estimated through simulation of the stochastic normalization model).

Here, \(\sigma_{G, i}\) and \(\mu_i\) are the expected gain variability and response average of neuron *i*, and μ is the expected magnitude of the population response.

We chose the preferred orientations \(\theta_o\) to evenly tile the orientation domain and randomly drew γ for each simulated unit from an exponential distribution (Baddeley et al., 1997). We simulated responses for Experiment 1 with two levels of contrast, two levels of orientation dispersion, 16 orientations, and 35 repeated trials per stimulus condition. Each direct and inferred metric was computed from these population responses as in our experimental data and then compared to the ground-truth values from the simulations (results summarized in Figure 3A).

To isolate each metric’s unique association with likelihood width, we *z*-scored each metric and then rank-ordered the trials as a function of the metric we sought to control for. We considered nonoverlapping bins of 50 consecutive trials and computed the variance of the “frozen” metric within each bin. If this value did not exceed a threshold of \(\sigma^2 = 0.005\), we calculated the Spearman correlation between the “test” metric and the likelihood width for that set of trials. This analysis yielded a distribution of partial correlation values, shown in Figure 4. We averaged each distribution on the Fisher *z*-transformed values and then applied the inverse transformation to map the result back onto the scale [–1, 1] (triangles in Figures 4B, C).
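The freezing procedure can be sketched as follows (a minimal reimplementation with assumed variable names; SciPy supplies the Spearman correlation):

```python
import numpy as np
from scipy import stats

def frozen_correlations(test, control, width, bin_size=50, var_max=0.005):
    """'Freeze' the control metric: z-score the metrics, sort trials by the
    control metric, and within each nonoverlapping bin of consecutive
    trials whose control-metric variance stays below var_max, compute the
    Spearman correlation between the test metric and likelihood width."""
    z = lambda x: (np.asarray(x, float) - np.mean(x)) / np.std(x, ddof=1)
    test_z, control_z, width = z(test), z(control), np.asarray(width, float)
    order = np.argsort(control_z)
    rs = []
    for start in range(0, len(order) - bin_size + 1, bin_size):
        idx = order[start:start + bin_size]
        if control_z[idx].var(ddof=1) <= var_max:      # control metric is "frozen"
            rs.append(stats.spearmanr(test_z[idx], width[idx])[0])
    return np.asarray(rs)

def fisher_average(rs):
    """Average correlations on the Fisher z scale, then map back to [-1, 1]."""
    return np.tanh(np.mean(np.arctanh(rs)))
```

When the test metric relates to likelihood width only through the control metric, the frozen correlations collapse toward zero; a direct relation survives the freeze.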

**Phase 1:** For each dataset, we trained 360 unique models that varied in the number of hidden layers (1, 2, or 3), number of hidden units per layer (10, 20, 30, 40, or 50), dropout rate between layers (0.05, 0.1, 0.15, or 0.2), weight decay (0.0, 0.001, or 0.001), and the learning rate (0.001 or 0.0001). For each configuration of hyper-parameters, we trained five networks on 80% of trials (training/validation set) and obtained a cross-validated prediction on the held-out 20% of trials, rotating trials between training and held-out sets such that each trial had a cross-validated prediction. With this grid search, we determined the set of hyper-parameters that minimized each dataset’s held-out loss.
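For concreteness, the Phase 1 grid can be enumerated directly; it multiplies out to the 360 configurations reported above (an enumeration sketch; hyper-parameter values taken verbatim from the text, including the repeated weight-decay entry as printed):

```python
from itertools import product

# Phase 1 hyper-parameter grid, values as reported in the text.
grid = {
    "hidden_layers": [1, 2, 3],
    "hidden_units": [10, 20, 30, 40, 50],
    "dropout_rate": [0.05, 0.1, 0.15, 0.2],
    "weight_decay": [0.0, 0.001, 0.001],   # three entries, as printed
    "learning_rate": [0.001, 0.0001],
}
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # 3 * 5 * 4 * 3 * 2 = 360 models per dataset
```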

**Phase 2:** Optimal hyper-parameters in Phase 1 differed across datasets. We used this range of optimal hyper-parameters to train 96 networks on each dataset (each one using a different hyper-parameter configuration; hidden layers: 2 or 3; hidden units: 20, 30, 40, or 50; dropout rate: 0.05, 0.1, 0.15, or 0.2; weight decay: 0.0, 0.001, or 0.001; and learning rate: 0.001) and computed their held-out loss as before. These held-out losses represent an ensemble-based estimate (Lakshminarayanan, Pritzel, & Blundell, 2017) of the MLP model class’s predictive accuracy, which we report in our results.

*p* < 0.001, Wilcoxon signed rank test; median OSI reduction: 18.9%, *p* < 0.001). Increasing stimulus dispersion also reduced the example unit’s response amplitude. In addition, this manipulation substantially broadened the tuning function (Figure 1E, red vs. blue lines). Again, these effects were exhibited by most units (median amplitude reduction: 31.6%, *p* < 0.001; median OSI reduction: 64.2%, *p* < 0.001).

The larger \(\sigma_G\), the larger the excess spike count variance. Our recent work suggests that the strength of gain fluctuations increases with stimulus manipulations that increase orientation uncertainty (Hénaff et al., 2020). Accordingly, we fit the modulated Poisson model to each unit separately per stimulus family (i.e., a specific combination of stimulus contrast and orientation dispersion). Lowering contrast and increasing dispersion both increased gain variability (median increase: 27.1%, *p* < 0.001 for contrast, and 12.8%, *p* < 0.001 for dispersion, Wilcoxon signed rank test), consistent with our previous observations in anesthetized animals (Hénaff et al., 2020).

We quantified the quality of orientation coding with Fisher information (\(I_{\theta}\); see Methods). This statistic expresses how well stimulus orientation can be estimated from population activity by an optimal decoder and is inversely related to the uncertainty of this estimate (Paradiso, 1988). Consider the Fisher information profiles of an example population. Reducing stimulus contrast and increasing orientation dispersion both lowered \(I_{\theta}\), as is evident from the vertical separation of the colored lines (Figure 1G, left). These effects were present in all of our recordings (Figure 1G, right), though their exact magnitude differed somewhat across populations. Trial-to-trial response fluctuations are often correlated among neurons (Cohen & Kohn, 2011). These so-called noise correlations can affect the coding capacity of neural populations (Moreno-Bote et al., 2014; Kanitscheider, Coen-Cagli, & Pouget, 2015b). However, we found no systematic effect of our stimulus manipulations on the average strength of pairwise response dependencies (*p* = 0.61 for contrast and *p* = 0.10 for dispersion, Wilcoxon rank-sum test; Supplementary Figure S1A). Moreover, a shuffling analysis revealed that noise correlations had minimal impact on Fisher information estimates (see Methods; Supplementary Figures S1B, C). This result may in part be due to the relatively small sizes of our populations (Moreno-Bote et al., 2014). Nevertheless, we conclude that under these experimental conditions, our stimulus reliability manipulations substantially impact orientation coding in V1 because they alter the relation between stimulus orientation on the one hand and response mean and variance on the other. More generally, these observations indicate that, as expected, the perceptual uncertainty of an observer inferring stimulus orientation from these V1 population responses would increase as contrast decreases (Tolhurst et al., 1983; Mareschal & Shapley, 2004) and as orientation dispersion increases (Goris et al., 2015; Beaudot & Mullen, 2006).

The decoder’s point estimate generally tracked stimulus orientation well (*r* = 0.76 on average, ranging from 0.96 for high-contrast, low-dispersion stimuli to 0.43 for low-contrast, high-dispersion stimuli; Supplementary Figures S3A, B). However, it did not track stimulus orientation perfectly. On some trials, the orientation estimation error could be substantial, especially when stimulus dispersion was high (Figure 2E, left). This pattern reflected an underlying structure in the distribution of estimation errors. Specifically, as expected from a well-calibrated model, the spread of the estimation errors approximately matched the width of the likelihood function (Figure 2E, right). Consequently, the average width of the likelihood function was strongly associated with the average size of the estimation error (Spearman’s rank correlation coefficient: *r* = 0.91, *p* < 0.001; Figure 2F) and with population Fisher information (Spearman’s rank correlation coefficient: *r* = −0.81, *p* < 0.001; Figure 2G). Together, these results establish the width of the likelihood function calculated under the stochastic normalization model as a principled metric of coding quality for our dataset. In the following analyses, we use it as the “ground-truth” estimate of stimulus uncertainty on a trial-by-trial basis.

Directly estimated response magnitude, response dispersion, and gain variability each correlated modestly with Fisher information (*r* = 0.52, 0.51, and 0.54, with 95% confidence intervals of 0.31–0.69, 0.30–0.69, and 0.35–0.70; Figure 3B, left) but fell short of the predictive power of the width of the likelihood function (*r* = 0.80, 95% confidence interval 0.68–0.88). By comparison, relative dispersion had a substantially weaker correlation with Fisher information (*r* = 0.26, 95% confidence interval 0.02–0.47; Figure 3B, left). This pattern may in part reflect the limited size of our recorded populations. For each of these metrics, simulated direct estimates better approximate the ground truth as population size grows (Figure 3A). However, the speed of improvement differs across metrics, implying that some will be more hampered by our recording conditions than others. We therefore complemented this analysis with one in which we attempted to obtain better estimates of each metric by inferring them from the observed population activity while taking into account knowledge of each unit’s tuning properties (see Methods). In simulation, this procedure improves estimation accuracy for each metric (Figure 3A, full vs. dotted lines). Inferred response magnitude and gain variability each exhibited a modest association with Fisher information (*r* = 0.55 and 0.63, with 95% confidence intervals 0.35–0.71 and 0.45–0.77), while inferred response dispersion and relative dispersion exhibited weaker associations (*r* = 0.36 and 0.37, with 95% confidence intervals 0.12–0.57 and 0.13–0.56).

(*r* = 0.68, 95% confidence interval of the mean 0.54–0.83), response dispersion (*r* = 0.54, 0.36–0.72), and gain variability (*r* = 0.81, 0.74–0.84), but not for relative dispersion (*r* = 0.46, 0.27–0.64; Figure 3C, left).

(*r* = 0.63, 95% confidence interval 0.49–0.77 for response magnitude; 0.54, 0.30–0.77 for response dispersion; 0.69, 0.58–0.79 for gain variability; and 0.55, 0.25–0.82 for relative dispersion; Figure 3C, right).

Freezing response magnitude barely affected the association between gain variability and likelihood width (*r* = 0.53 for Experiment 1 and 0.32 for Experiment 2; Figures 4B, C, left). Conversely, freezing gain variability all but nullified the association between response magnitude and likelihood width. This was true of Experiment 1 (*r* = 0.04; Figure 4B, left) and Experiment 2 (*r* = 0.11; Figure 4C, left). We found a similar asymmetric pattern for gain variability and response dispersion. Freezing response dispersion did little to the predictive power of gain variability (*r* = 0.63 for Experiment 1 and 0.40 for Experiment 2; Figures 4B, C, middle), but freezing gain variability removed most of the association between response dispersion and likelihood width (*r* = 0.04 for Experiment 1, difference with gain variability: *p* < 0.001; *r* = 0.10 for Experiment 2, *p* < 0.001; Figures 4B, C, middle). For all comparisons, the association between gain variability and stimulus uncertainty when other metrics were held frozen was greater than the association between the other metrics and stimulus uncertainty when gain variability was held frozen (*p* < 0.001, Wilcoxon rank-sum test; Figures 4B, C, left and middle). This approach also showed that response magnitude was more strongly associated with stimulus uncertainty than response dispersion (Figures 4B, C, right; *p* < 0.001, Wilcoxon rank-sum test). Overall, this pattern suggests that, of these three candidate metrics, gain variability has the most direct association with stimulus uncertainty. A complementary analysis that examined how the correlation of each candidate metric with likelihood width depended on the intermetric correlation further corroborated this conclusion (Supplementary Figure S5).

(*r* = 0.61 vs. 0.63, with 95% confidence intervals of 0.59–0.62 and 0.45–0.77). We next examined how well the ANNs predicted likelihood width on a trial-by-trial basis. For Experiment 1, likelihood width was on average not significantly better predicted by the family of ANNs than by inferred gain variability (mean *r* = 0.79, 95% confidence interval 0.75–0.83, for ANN-predicted likelihood width; mean *r* = 0.81, 0.69–0.90, for inferred gain variability; Figure 5B, left). For Experiment 2, likelihood width was somewhat better predicted by the ANNs than by inferred gain variability (mean *r* = 0.77, 95% confidence interval 0.71–0.83, for ANN-predicted likelihood width; mean *r* = 0.69, 0.58–0.81, for inferred gain variability; Figure 5B, right). We conclude that gain variability captures much of the variance in perceptual uncertainty that can be captured by a simple transformation of sensory population activity.

*ought to* believe about stimulus orientation. We found that response magnitude, response dispersion, and variability in response gain all offer useful proxies for the certainty of stimulus orientation estimates. This was also true when fluctuations in uncertainty were not due to external factors but instead arose from internal sources. Of the metrics we considered, gain variability exhibited the most direct association with stimulus uncertainty.

*Advances in Neural Information Processing Systems*, 30, 1834–1844.

*Journal of Neurophysiology*, 106(3), 1581–1590.

*Journal of Neuroscience*, 22(19), 8633–8646.

*Neuron*, 89, 1305–1316.

*Proceedings of the Royal Society of London. Series B: Biological Sciences*, 264(1389), 1775–1783.

*Sensory Communication*, 1(1), 217–233.

*Vision Research*, 46(1–2), 26–46.

*Neuron*, 60(6), 1142–1152.

*Neuron*, 74(1), 30–39.

*Journal of Neuroscience*, 32(31), 10618–10626.

*Nature Human Behaviour*, 7(1), 142–154.

*Nature Reviews Neuroscience*, 13(1), 51–62.

*PLoS Computational Biology*, 19(6), e1011104.

*Journal of Neuroscience*, 39(37), 7344–7356.

*Nature Neuroscience*, 14(7), 811–819.

*PLoS Computational Biology*, 4(1), e27.

*Theoretical neuroscience: Computational and mathematical modeling of neural systems*. Cambridge, MA: The MIT Press.

*Frontiers in Neuroinformatics*, 1, 6.

*Nature*, 415(6870), 429–433.

*Nature Reviews Neuroscience*, 9(4), 292–303.

*Nature Communications*, 12(1), 3635.

*Nature Neuroscience*, 1, 146–154.

*Journal of Comparative Neurology*, 201(4), 519–539.

*Vision Research*, 35(19), 2723–2730.

*Annual Review of Psychology*, 49, 585–612.

*Neuron*, 88(4), 819–831.

*Nature Reviews Neuroscience*, 25(4), 237–252.

*Nature Neuroscience*, 17(6), 858–865.

*Journal of Vision*, 18(8), 1–13, https://doi.org/10.1167/18.8.8.

*Journal of Neuroscience*, 37(20), 5195–5203.

*Nature Neuroscience*, 14(2), 239–245.

*Visual Neuroscience*, 9(2), 181–197.

*Advances in Neural Information Processing Systems*, 17(1), 293–300.

*The Journal of Physiology*, 148(3), 574–591.

*Nature Communications*, 11(1), 2513.

*Nature Neuroscience*, 9(5), 690–696.

*PLoS Computational Biology*, 11(6), e1004218.

*Proceedings of the National Academy of Sciences*, 112(50), E6973–E6982.

*Nature*, 455(7210), 227–231.

*Science*, 324(5928), 759–764.

*Perception as Bayesian inference*. Cambridge, UK: Cambridge University Press.

*Advances in Neural Information Processing Systems*, 30.

*PLoS Computational Biology*, 17(11), e1009517.

*Nature Neuroscience*, 26, 2063–2072.

*Journal of Neuroscience*, 15(3), 1808–1818.

*Nature Neuroscience*, 9(11), 1432–1438.

*Vision Research*, 44(1), 57–67.

*Nature Neuroscience*, 17(10), 1410–1417.

*Nature*, 434, 387–391.

*Journal of Neuroscience*, 34(10), 3579–3585.

*Nature*, 381(6583), 607–609.

*Neuron*, 92(2), 530–543.

*Advances in Neural Information Processing Systems* (Vol. 29).

*Journal of Vision*, 5(5), 1.

*Biological Cybernetics*, 58(1), 35–49.

*Neural Computation*, 15(10), 2255–2279.

*IEEE International Symposium on Information Theory (ISIT)*, 2463–2468.

*Advances in Neural Information Processing Systems*, 27.

*Annual Review of Vision Science*, 4, 287–310.

*Journal of Neuroscience*, 18(10), 3870–3896.

*The Bell System Technical Journal*, 27(3), 379–423.

*Vision Research*, 114, 56–67.

*Annual Review of Neuroscience*, 24(1), 1192–1216.

*Nature Neuroscience*, 8(2), 220–228.

*Vision Research*, 23(8), 775–785.

*Nature Neuroscience*, 18(12), 1728–1730.

*Nature Neuroscience*, 23(1), 122–129.

*arXiv preprint*, arXiv:2202.04324.

*Nature Neuroscience*, 5(6), 598–604.

*PLoS Computational Biology*, 19(7), e1011245.

*Neural Computation*, 10(2), 403–430.