**One of the major goals of sensory neuroscience is to understand how an organism's perceptual abilities relate to the underlying physiology. To this end, we derived equations to estimate the best possible psychophysical discrimination performance, given the properties of the neurons carrying the sensory code. We set up a generic sensory coding model with neurons characterized by their tuning function to the stimulus and the random process that generates spikes. The tuning function was a Gaussian function or a sigmoid (Naka-Rushton) function. Spikes were generated using Poisson spiking processes whose rates were modulated by a multiplicative, gamma-distributed gain signal that was shared between neurons. This doubly stochastic process generates realistic levels of neuronal variability and a realistic correlation structure within the population. Using Fisher information as a close approximation of the model's decoding precision, we derived equations to predict the model's discrimination performance from the neuronal parameters. We then verified the accuracy of our equations using Monte Carlo simulations. Our work has two major benefits. Firstly, we can quickly calculate the performance of physiologically plausible population-coding models by evaluating simple equations, which makes it easy to fit the model to psychophysical data. Secondly, the equations revealed some remarkably straightforward relationships between psychophysical discrimination performance and the parameters of the neuronal population, giving deep insights into the relationships between an organism's perceptual abilities and the properties of the neurons on which those abilities depend.**

*precision*. The precision determines the expected performance on perceptual tasks, and so, if we can estimate the precision from the neuronal parameters, we can estimate task performance.

*Fisher information*. This limit is known as the Cramér-Rao bound (Rao, 1945; Cramér, 1946; see Dayan & Abbott, 2001, pp. 120–121). For a reasonable spike rate or number of neurons, the precision of the generic sensory coding model that we describe in this article is actually very close to the Cramér-Rao bound, so we can use the Fisher information as an approximation of the decoding precision. The Fisher information is calculated from the properties of the neurons, and therefore forms a bridge between the neuronal properties and perceptual performance.

*sequential ideal-observer analysis*. This approach, developed by Geisler (1989), calculates the efficiency with which the information at each stage is processed. Roughly speaking, efficiency is the proportion of the available information that the observer appears to use. If we have a sufficiently good model of processing up to a particular point in the processing stream, we can construct an ideal observer for processing the information known to exist at that point and then compare the ideal observer's performance against that of a real observer. An efficiency of 1 would mean that the real observer's performance matched that of the ideal observer, so no further information was lost beyond that point in the processing stream. We can proceed like this from very early stages, e.g., the optics of the eye, through to later stages, seeing at each stage what proportion of the available information is lost by later stages of processing. To perform this kind of analysis, we need to be able to calculate the best possible performance allowed by the information at each stage.

*proportions* of neurons of different types at a particular point in the processing stream, we can calculate the decoding precision up to a multiplicative factor; assuming constant efficiency of processing beyond that point, this gives us the expected variance of the decoded stimulus values (and hence discrimination thresholds) up to a multiplicative factor. As we will show, the neuronal parameters (or functions of the neuronal parameters) have multiplicative effects on the decoding precision and thus will have the same effect on the ideal decoder's performance as they will on the performance of the decoder with constant inefficiency.

^{1} using corresponding lowercase letters. Thus *X* is a random variable representing the stimulus level, and its value is *x*. *R* is a random variable representing a neuron's mean number of spikes, and its value is *r*(*x*), the output of the neuron's tuning function, which gives the neuron's mean spike count for a given stimulus *x* (note that the tuning function's output is often measured in spikes per second, but we find it more convenient to use units of spikes, without making assumptions about the time period over which the neuron's output is integrated; this is equivalent to measuring the output in spikes per unit of time using units of time scaled so that the spike integration period is one unit). *N* is a random variable representing the spike count of a neuron, i.e., the number of spikes produced by the neuron, and *n* is its value on a particular trial. Because we will often be dealing with populations of neurons, we use bold letters **N** and **n** to represent vectors holding the spike counts of all the neurons in the population. **N** is a random variable representing the population response, and **n** is its value on a particular trial.

*r*(*x*) for each neuron, which specifies the neuron's mean spike count for a given stimulus *x*; (b) a random process that generates spikes at the given rate; and (c) a method of decoding the spike counts of the neurons to give an estimate of the stimulus. We now describe each of these processes in turn.

*r*_{0}, *r*_{max}, *z*, *q*, and *b*, which serve the same or analogous purposes in the two functions.

*contrast-response function*. It usually has a sigmoidal shape that is well described by the Naka-Rushton function, also known as the hyperbolic ratio function (Naka & Rushton, 1966; Albrecht & Hamilton, 1982): *r*(*c*) = *r*_{0} + *r*_{max}*c*^{q}/(*c*^{q} + *z*^{q}) (Equation 1). The variable *c* is the contrast in linear (e.g., Michelson) units, *r*_{0} is the spontaneous firing rate in response to zero contrast, and *r*_{max} is the asymptotic increment from *r*_{0} as contrast increases. The parameter *z*, known as the *semisaturation contrast*, is the contrast for which the mean response exceeds *r*_{0} by *r*_{max}/2. We use a left subscript on *r*(·) to indicate the form of the tuning function, in this case "N-R" for Naka-Rushton.
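The Naka-Rushton contrast-response function can be sketched in a few lines of Python (the function name is ours; the parameters *r*_{0}, *r*_{max}, *z*, and *q* are those defined above):

```python
def naka_rushton(c, r0, rmax, z, q):
    """Mean spike count for linear contrast c (Equation 1): the spontaneous
    rate r0 plus a saturating increment of up to rmax, with semisaturation
    contrast z and exponent q."""
    return r0 + rmax * c**q / (c**q + z**q)

# At the semisaturation contrast c = z, the increment above r0 is rmax / 2.
r = naka_rushton(0.2, r0=0.5, rmax=10.0, z=0.2, q=2.0)
```

Setting *c* = *z* returns *r*_{0} + *r*_{max}/2, which is the defining property of the semisaturation contrast.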

**Figure 1**

*x* is given by *x* = log_{b}(*c*) (Equation 2), which gives *c* = *b*^{x} (Equation 3). Using Equation 3 to substitute for *c* in Equation 1, we can re-express the Naka-Rushton function as a function of log contrast *x*: _{N-R}*r*(*x*) = *r*_{0} + *r*_{max}/(1 + *b*^{−q(x − x_z)}) (Equation 4), where *x*_{z} = log_{b}(*z*). In what follows, whenever we use the term "contrast" without specifying the units, we mean log Michelson contrast. In all our simulations, we used log to base 10 (i.e., *b* = 10), but our equations are derived for any *b*.
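The change of variables can be checked numerically: the log-contrast form below is algebraically identical to the linear-contrast form of Equation 1 (a sketch of the reparameterization described above; the function name is ours):

```python
import math

def naka_rushton_log(x, r0, rmax, xz, q, b=10.0):
    """Naka-Rushton response as a function of log contrast x = log_b(c),
    with xz = log_b(z). Substituting c = b**x into the linear-contrast form
    gives this logistic-shaped function of x (Equations 2-4)."""
    return r0 + rmax / (1.0 + b ** (-q * (x - xz)))

# Same stimulus expressed in linear and in log units gives the same response.
c, z = 0.05, 0.2
resp_log = naka_rushton_log(math.log10(c), 0.5, 10.0, math.log10(z), 3.0)
resp_lin = 0.5 + 10.0 * c**3 / (c**3 + z**3)
```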

*z* to represent the stimulus value corresponding to the center (peak) of the Gaussian tuning function. This is analogous to the semisaturation contrast *z*, which is at the center (peak of gradient) of the Naka-Rushton contrast-response function when expressed in terms of log contrast. The Gaussian tuning function is a Gaussian function, centered on *z*, of the value *x* along the (unspecified) stimulus dimension; *r*_{max} is the maximum increment from the spontaneous firing rate *r*_{0}, and *q* is a tuning sharpness parameter, analogous to the exponent *q* of the Naka-Rushton function. As before, the left subscript on *r*(*x*)—in this case "Gauss" for Gaussian—indicates the form of the tuning function. Examples of the Gaussian tuning function are plotted in Figure 1C.

*σ*_{tuning} is the standard deviation of the Gaussian tuning function, then its bandwidth (full width at half height) *w* is given by *w*^{2} = (8 ln 2)*σ*_{tuning}^{2}. If *x* is measured in units that are the log to base *b* of the physical stimulus units, then *w* will also be in log_{b} units. We can convert *w* to octaves (i.e., log_{2} units) by multiplying by log_{2} *b* (this is because log_{b} *y* × log_{2} *b* = log_{2} *y*). If *ω* is the bandwidth in octaves, then *ω* = *w* log_{2} *b* (Equation 8). Using Equation 8 to substitute for *w* in Equation 7 gives the tuning sharpness *q* in terms of the octave bandwidth *ω* (Equation 9).
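The bandwidth conversions described above amount to two one-line helpers (a sketch; the function names are ours):

```python
import math

def fwhm_from_sigma(sigma_tuning):
    """Full width at half height of a Gaussian tuning function:
    w**2 = (8 ln 2) * sigma_tuning**2."""
    return math.sqrt(8.0 * math.log(2.0)) * sigma_tuning

def logb_to_octaves(w, b=10.0):
    """Convert a bandwidth from log_b units to octaves (log_2 units)
    by multiplying by log_2(b)."""
    return w * math.log2(b)
```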

*r*(*x*) multiplied by a gain value that changes randomly from trial to trial. We let *G* represent the gain random variable, and *g* its value on a particular trial. The spiking distribution conditioned on the gain is an ordinary Poisson distribution: *P*(*N* = *n* | *X* = *x*, *G* = *g*) = [*gr*(*x*)]^{n}e^{−gr(x)}/*n*! (Equation 10). The gain has a gamma distribution with a mean of 1 and a standard deviation *σ*_{G} that is a free parameter. The gamma gain distribution and Poisson spiking distribution combine to produce a gamma-Poisson mixture distribution that has the form of a negative binomial distribution of spike counts, given by Equation 11, where Γ(·) is the gamma function. The distribution in Equation 11 fits well to the spike distributions obtained in physiological recordings (see figure 1c and 1d of Goris et al., 2014). Because the mean gain is 1, the mean of the spike distribution in Equation 11 is *r*(*x*).
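The doubly stochastic process can be sketched as follows: a gamma-distributed gain with mean 1 and standard deviation σ_{G} multiplies the Poisson rate, which inflates the variance from *r*(*x*) to *r*(*x*) + *σ*_{G}^{2}*r*(*x*)^{2} (the function name and parameter values here are illustrative):

```python
import numpy as np

def gamma_poisson_spikes(rate, sigma_g, n_trials, rng):
    """Doubly stochastic spike counts: on each trial a gain g is drawn from a
    gamma distribution with mean 1 and standard deviation sigma_g, and the
    spike count is Poisson with mean g * rate (Equations 10-11)."""
    shape = 1.0 / sigma_g**2      # gamma shape k
    scale = sigma_g**2            # gamma scale theta, so mean k * theta = 1
    g = rng.gamma(shape, scale, size=n_trials)
    return rng.poisson(g * rate)

rng = np.random.default_rng(0)
counts = gamma_poisson_spikes(rate=10.0, sigma_g=0.3, n_trials=200_000, rng=rng)
# Law of total variance gives a Fano factor of 1 + sigma_g**2 * rate = 1.9 here,
# i.e., substantially super-Poisson variability.
fano = counts.var() / counts.mean()
```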

*i* and *j* is then given by cov(*N*_{i}, *N*_{j}) = *σ*_{G}^{2}*r*_{i}(*x*)*r*_{j}(*x*) + *δ*_{ij}*r*_{j}(*x*) (Equation 12), where *r*_{i}(*x*) and *r*_{j}(*x*) are the outputs of the tuning functions of neurons *i* and *j* given a stimulus *x*, and *δ*_{ij} is the Kronecker delta (1 when *i* = *j*, and 0 otherwise). If *i* = *j* in Equation 12, then we obtain the variance of neuron *j*: var(*N*_{j}) = *r*_{j}(*x*) + *σ*_{G}^{2}*r*_{j}(*x*)^{2}.

*σ*_{G}. The covariance matrix itself still has a complicated structure, because each term in the covariance matrix depends on *σ*_{G} and the sensitivities of the two neurons to the stimulus, but all of this complexity can be reduced to a single scalar random variable *G* with a single parameter *σ*_{G}. Secondly, as explained earlier, we aim to calculate the performance of the decoder that makes best use of the encoded signals; and the best possible decoder will have access to the shared gain signal. If the gain signal is known, the *conditional* spiking probabilities (conditioned upon the gain signal) are independent. Thus, a decoder that knows the gain signal can express the spike distributions as independent Poisson distributions, which considerably simplifies both the maximum-likelihood decoding algorithm and our mathematical analysis of its performance. We also investigated the performance of two suboptimal decoders that did not know the gain signal. Loss of knowledge of the gain signal greatly impaired performance with the Naka-Rushton tuning function, but with the Gaussian tuning function, decoding performance was only slightly affected.

*σ*_{G}, both of these characteristics emerge naturally from Equation 17. The dependence of *ρ*_{ij} on tuning similarity is particularly noteworthy, since in our parameterizations of the model we have forced the Poisson and gain correlations

**n** in the neurons being monitored by the observer: The vector **n** holds the spike counts of all the different neurons. We investigated three methods for decoding these spike counts.

*x* that had the highest probability of giving rise to the obtained set of spike counts, i.e., the value of *x* that maximizes the probability *P*(**N** = **n** | *X* = *x*, *G* = *g*). Because of the statistical independence of the spiking distributions conditioned on the gain signal, we can write this probability as a product of single-neuron probabilities (Equations 18 and 19), where *K* is the number of neurons and the neurons are indexed by *j*. The second equality (Equation 19) arises because each *r*_{j}(*x*) is a deterministic function of the stimulus value *x*. For large populations, the product in Equation 19 can be too small to represent using floating-point values on a standard computer, so instead we calculated the logarithm of this value, which peaks at the same value of *x* and is given by the sum of the log probabilities across neurons (Equation 20). The probability in each term in Equation 20 is evaluated using Equation 10, and the decoder uses the simplex search method (Nelder & Mead, 1965) to find the stimulus value *x* that maximizes this sum.
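For illustration, the known-gain maximum-likelihood decoder can be sketched with a dense grid standing in for the Nelder-Mead simplex search (the helper name and the toy tuning curves below are ours, not the paper's):

```python
import numpy as np

def ml_decode_known_gain(counts, tuning, grid, g):
    """Maximum-likelihood stimulus estimate when the gain g is known.
    Conditioned on the gain, spiking is independent Poisson, so the log
    likelihood is a sum of per-neuron terms (Equation 20); terms that do not
    depend on x (the log n! and n log g terms) are dropped. A dense grid
    stands in for the simplex search used in the paper."""
    R = np.array([tuning(x) for x in grid])          # (grid, K) mean counts at gain 1
    loglik = counts @ np.log(R).T - g * R.sum(axis=1)
    return grid[np.argmax(loglik)]

# Hypothetical population: Gaussian tuning curves with a small spontaneous rate.
centers = np.linspace(0.0, 2.0, 9)
tuning = lambda x: 0.5 + 10.0 * np.exp(-(x - centers) ** 2 / (2 * 0.15 ** 2))
grid = np.linspace(0.0, 2.0, 201)

# With noise-free "counts" equal to the expected counts at x = 1, the
# likelihood peaks at the true stimulus value.
x_hat = ml_decode_known_gain(tuning(1.0), tuning, grid, g=1.0)
```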

*x* that maximized the sum of log probabilities

*x* was that which maximized the sum of log likelihoods across all pairs of neurons. Thus, we decoded the population as if each pair of neurons were statistically independent from each other pair. As noted earlier, this decoding method takes account of pair-wise statistical dependencies but not higher order dependencies.

*r*_{max}, *r*_{0}, *q*, and *σ*_{G} are each constant across different neurons, and the tuning curve centers *z* are equally spaced along the *x*-axis, with constant spacing *δz*, between values of *z*_{min} and *z*_{max}. We define a *density* parameter *h* equal to 1/*δz*. The model thus has seven parameters: *r*_{max}, *r*_{0}, *q*, *σ*_{G}, *h*, *z*_{min}, and *z*_{max}. Since each of these parameters is assumed to be constant across different neurons, we call this the Constant parameterization.

*x* is the logarithm of the physical stimulus level *ξ*, then the model will behave according to Weber's law. The reason for this is that with all the parameters constant, if we move from a low stimulus value to a high stimulus value, we are faced with an identical decoding situation, just shifted along the *x*-axis: The tuning functions have the same shape and density, and the noise properties are the same. Thus the standard deviation of the decoded value of *x* will be constant across stimulus levels. This means that Δ*x*_{θ}, the just-noticeable difference in *x*, will be constant. If *ξ*_{p} is the pedestal stimulus (i.e., the lower of the two stimuli to be discriminated) expressed in physical units, and Δ*ξ*_{θ} is the just-noticeable difference in physical units for a pedestal of *ξ*_{p}, then Δ*x*_{θ} = log_{b}(*ξ*_{p} + Δ*ξ*_{θ}) − log_{b}(*ξ*_{p}). This can be rearranged to give Δ*ξ*_{θ}/*ξ*_{p} = *b*^{Δx_θ} − 1. Thus, the Weber fraction Δ*ξ*_{θ}/*ξ*_{p} is constant, which is the definition of Weber's law.
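The rearrangement above can be checked numerically: a constant threshold dx in log units yields the same Weber fraction, b**dx − 1, at every pedestal (parameter values here are illustrative):

```python
import math

# If the just-noticeable difference in log contrast, dx, is the same at every
# pedestal, the physical threshold grows in proportion to the pedestal, so the
# Weber fraction is b**dx - 1 regardless of the pedestal level.
b, dx = 10.0, 0.05
pedestals = (0.1, 1.0, 10.0, 100.0)
webers = [(b ** (math.log(p, b) + dx) - p) / p for p in pedestals]
```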

*h*, *r*_{max}, and *q* can increase with the tuning function's position along the *x*-axis. Specifically, we let *h*, *r*_{max}, and *q* vary as exponential functions of the tuning function position *z*: *h* = *k*_{h} exp(*m*_{h}*z*), *r*_{max} = *k*_{r} exp(*m*_{r}*z*), and *q* = *k*_{q} exp(*m*_{q}*z*) (Equations 23 through 25). The parameters *k*_{h}, *k*_{r}, and *k*_{q} give the values of *h*, *r*_{max}, and *q* when *z* = 0; *m*_{h}, *m*_{r}, and *m*_{q} are parameters that determine how rapidly *h*, *r*_{max}, and *q* change as a function of *z*. We call this the Exponential parameterization. It is a generalization of the Constant parameterization: The Constant parameterization is the Exponential parameterization with *m*_{h} = *m*_{r} = *m*_{q} = 0. For simplicity, we will assume that *r*_{0}/*r*_{max} is constant in the Exponential parameterization. As before, *z* falls between *z*_{min} and *z*_{max}. Supplementary Appendix C shows how to generate a set of *z* values when *h* varies exponentially across the stimulus axis.
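The idea can be sketched as follows. Since Supplementary Appendix C is not reproduced here, the transformation below (ζ = exp(*m*_{h}·*z*)/*m*_{h}, sampled at equal steps of 1/*k*_{h}) is our inference from the properties stated in the text; the appendix's exact construction may differ, but this version reproduces the stated density *h*(*z*) = *k*_{h} exp(*m*_{h}*z*):

```python
import math

def z_values_exponential_density(k_h, m_h, z_min, z_max):
    """Place tuning-curve centers so that their local density follows
    h(z) = k_h * exp(m_h * z). The transformed variable
    zeta = exp(m_h * z) / m_h (which tends to 0 as z -> -infinity) is sampled
    at equal steps of 1/k_h and mapped back via z = ln(m_h * zeta) / m_h.
    This is a sketch of the Supplementary Appendix C idea, not its exact text."""
    zs = []
    zeta = math.exp(m_h * z_min) / m_h
    step = 1.0 / k_h
    while True:
        z = math.log(m_h * zeta) / m_h
        if z > z_max:
            break
        zs.append(z)
        zeta += step
    return zs

zs = z_values_exponential_density(k_h=50.0, m_h=0.5, z_min=0.0, z_max=2.0)
```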

*X̂*) is closely approximated by the Fisher information. In this section, we derive expressions for the Fisher information for decoding the neurons when the gain is known. On each trial, the tuning function of each neuron in our model is multiplied by the gain signal *g*, so the effective tuning function for neuron *j* on that trial is *gr*_{j}(*x*). If the decoder knows the gain signal, then it knows the effective tuning functions *gr*_{j}(*x*) and it can express the spiking distributions as independent Poisson distributions. For a set of independent Poisson-spiking neurons with tuning functions *gr*_{j}(*x*), the Fisher information is given by *J*(*x*, *g*) = Σ_{j}[*gr*′_{j}(*x*)]^{2}/[*gr*_{j}(*x*)] = *g*Σ_{j}*r*′_{j}(*x*)^{2}/*r*_{j}(*x*) (Equation 26), where *r*′_{j}(*x*) is the first derivative with respect to *x* of neuron *j*'s tuning function (see Dayan & Abbott, 2001, chapter 3). Thus the variance of the estimated stimulus value *X̂* on trials with stimulus *x* and gain *g* will be approximated by 1/*J*(*x*, *g*). Over all trials with stimulus *x*, the variance will be given by the expectation of 1/*J*(*x*, *G*) over the gain distribution. Thus, the precision *τ*(*x*) for decoding a stimulus with value *x* is given by the reciprocal of that expectation. Since E[1/*G*] = 1/(1 − *σ*_{G}^{2}) for our gamma-distributed gain with mean 1 and standard deviation *σ*_{G}, we have *τ*(*x*) ≈ (1 − *σ*_{G}^{2})*J*(*x*, 1) = *J*(*x*, 1 − *σ*_{G}^{2}) (Relation 31). Because the mode of the gain distribution is 1 − *σ*_{G}^{2}, this precision is the *modal* value of the Fisher information across trials, not the mean.

*x* is the log to base *b* of the physical stimulus value *ξ*. If the pedestal stimulus value is *ξ*_{p} and the difference in physical stimulus values at threshold is Δ*ξ*_{θ}, then the Weber fraction *W* is defined as *W* = Δ*ξ*_{θ}/*ξ*_{p}. Relation 33 then expresses the Weber fraction in terms of the decoding precision and the threshold criterion, where *P*_{θ} is the proportion of correct responses that defines the threshold level. We can then calculate the Weber fraction by using Relation 31 to substitute for *τ*(*x*) in Relation 33.

*J*(*x*, 1), the Fisher information that would occur when the gain is 1. Then we multiply this by 1 − *σ*_{G}^{2} to obtain *τ*(*x*), the overall precision when the gain varies with standard deviation *σ*_{G} (Relation 31). We can then calculate the Weber fraction from the precision using Relation 33. The next two sections derive various expressions for *J*(*x*, 1) for different model parameterizations. The first of these two sections derives exact expressions; the second derives compact approximations of these expressions, which can often give a better insight into the way that the different neuronal parameters are related to perceptual performance in a population-coding model.
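The whole prediction pipeline is then only a few lines. The exact form of Relation 33 is not reproduced here, so the sketch below assumes the standard 2AFC link between precision and threshold (d′ = Δ*x*·√τ, with threshold at d′ = √2·z(*P*_{θ})) as a stand-in for it; the function name is ours:

```python
import math
from statistics import NormalDist

def weber_fraction(J1, sigma_g, p_theta=0.75, b=10.0):
    """Sketch of the prediction pipeline: attenuate the unit-gain Fisher
    information J(x, 1) by (1 - sigma_g**2) to get the precision tau
    (Relation 31), convert tau to a threshold difference dx in log contrast
    assuming the standard 2AFC link (our assumption, standing in for
    Relation 33), then convert to a Weber fraction via W = b**dx - 1."""
    tau = (1.0 - sigma_g**2) * J1
    dx = NormalDist().inv_cdf(p_theta) * math.sqrt(2.0 / tau)
    return b ** dx - 1.0
```

As expected, the predicted Weber fraction falls as the Fisher information rises, and grows as the gain variability *σ*_{G} increases.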

*r*_{j}(*x*) in Equation 26, we get Equation 34, where the parameters on the right-hand side are the neuronal parameters of the Naka-Rushton function from Equation 4 and can vary from neuron to neuron (strictly speaking, each parameter should be indexed by the neuron number *j*, but we omit these indices for clarity). The left subscript on _{N-R}*J*(*x*, 1) indicates the form of the tuning function, in this case "N-R" for Naka-Rushton. When *r*_{0} = 0, Equation 34 reduces to Equation 35.

*I* (for "integral") to represent them rather than *J*.

*J*(*x*, 1) in Relation 31 to predict the decoding precision.

*z* are constant, and *δz* = 1/*h*, we can rearrange the right-hand side of Equation 34 to give Equation 38. As *δz* approaches zero, the right-hand side of Equation 38 can be approximated by an integral, which we call _{N-R}*I*(*x*, 1). As long as the stimulus value *x* is sufficiently far from the edges of the range of *z* values, we can take the limits of *z* to be ±∞, giving Equation 40.

*Q*(*r*_{0}/*r*_{max}) for 0 ≤ *r*_{0}/*r*_{max} ≤ 1, and it can be seen that *Q*(*r*_{0}/*r*_{max}) smoothly decreases with increasing *r*_{0}/*r*_{max}. Thus, as *r*_{0} increases from 0, _{N-R}*I*(*x*, 1) undergoes a multiplicative attenuation that is a function only of the ratio *r*_{0}/*r*_{max}. Because the attenuation is a function of the ratio *r*_{0}/*r*_{max} rather than of *r*_{0} alone, we will often take this ratio, rather than *r*_{0}, to be a model parameter. We refer to this ratio as the relative spontaneous firing rate. When *r*_{0} = 0, we have *Q*(*r*_{0}/*r*_{max}) = 1, and Equation 40 reduces to _{N-R}*I*(*x*, 1) = *r*_{max}*qh* ln(*b*)/2 (Equation 42). The ln(*b*)/2 part of this expression is just a constant that depends on the arbitrary choice of base of logarithm that we use to represent contrast (and reduces to 1 for *b* = *e*^{2}). The interesting part is *r*_{max}*qh*: This is the simplest expression that we could possibly imagine, given that the Fisher information has to increase with increasing *r*_{max}, *q*, and *h*. Equation 42 therefore reveals a remarkably straightforward relationship between the Fisher information and the neuronal parameters.
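Equation 42 is easy to check numerically: summing *r*′_{j}(*x*)^{2}/*r*_{j}(*x*) over a dense population of Naka-Rushton neurons with *r*_{0} = 0 reproduces *r*_{max}*qh* ln(*b*)/2 (a sketch with illustrative parameter values):

```python
import math

# Numerical check of Equation 42: for Naka-Rushton neurons with r0 = 0 and
# constant rmax, q, and density h, the unit-gain Fisher information
# sum_j r_j'(x)**2 / r_j(x) comes out at rmax * q * h * ln(b) / 2
# (for a stimulus far from the ends of the range of tuning positions z).
rmax, q, h, b, x = 4.0, 2.0, 10.0, 10.0, 1.0
lnb = math.log(b)
total = 0.0
for i in range(121):                          # centers z from -5 to 7 at spacing 1/h
    z = -5.0 + i / h
    s = b ** (-q * (x - z))                   # r(x) = rmax / (1 + s), Equation 4 with r0 = 0
    r = rmax / (1.0 + s)
    dr = rmax * q * lnb * s / (1.0 + s) ** 2  # dr/dx
    total += dr * dr / r
predicted = rmax * q * h * lnb / 2.0          # Equation 42
```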

**Figure 2**

The expressions for _{N-R}*I*(*x*, 1) (Equations 40 and 42) are independent of the stimulus value *x*. From Relation 33, this leads to a constant Weber fraction, i.e., Weber's law.

*r*_{0} = 0. Since all the parameters except *z* are constant, and *δz* = 1/*h*, we can rearrange the right-hand side of Equation 37 to get Equation 43. When *δz* approaches zero, the right-hand side of Equation 43 can be approximated by an integral, which we call _{Gauss}*I*(*x*, 1). The integral in Equation 44 is a standard definite integral, and Equation 44 reduces to Equation 45. This expression applies only when *r*_{0} = 0, because we started with Equation 37. If instead we start with Equation 36, we obtain an expression that applies to all *r*_{0} (Equation 46).

*δz* approaches zero, the right-hand side of Equation 46 can be approximated by an integral (Relation 47), in which the remaining definite integral is denoted *S*(*q*). Substituting for *S*(*q*) in Relation 47, we obtain Relation 50. We can write *S*(1) as a definite integral over *z*. Note that we were able to drop the *x* that appears in the function being integrated, because this just shifts the function horizontally by a finite amount *x* along the *z*-axis but does not change its integral between infinite limits; so *S*(1) is a function of *r*_{0}/*r*_{max} only. Unfortunately, for *r*_{0} > 0, we cannot find a closed-form expression for *S*(1). However, for the range of relative spontaneous firing rates likely to occur, we have found that it can be very closely approximated by the right-hand side of Relation 52, where the function *Q* is defined in Equation 41. Supplementary Appendix F shows that, for 0 < *r*_{0}/*r*_{max} < 0.119, the approximation on the right-hand side of Relation 52 slightly overestimates *S*(1), by a factor that never exceeds 0.7% of the true value; for *r*_{0}/*r*_{max} > 0.120, the right-hand side of Relation 52 underestimates *S*(1), but not by much: Even for *r*_{0}/*r*_{max} as high as 1, the underestimation is only about 3%, and it is always less than 6% of *S*(1). Using Relation 52 to substitute for *S*(1) in Relation 50, we obtain our integral approximation of the Fisher information for the Gaussian tuning function, valid for any *r*_{0} (Equation 53).

Apart from the factor ln(*b*)/2, Equation 53 is identical to Equation 40, which gives the corresponding expression for the Naka-Rushton tuning function. Similarly, _{Gauss}*I*(*x*, 1) is independent of *x*, which leads to Weber's law.

*r*_{0} = 0, so the Fisher information is given by Equation 35. For now, let us also assume that *q* is constant, while *h* and *r*_{max} vary with *z* according to Equations 23 and 24. Then we can rearrange the right-hand side of Equation 35 to get Equation 54. Using Equation 24 to substitute for *r*_{max} in Equation 54, we have Equation 55.

*z* values so that the transformed values *ζ* are equally spaced. Supplementary Appendix C shows that an appropriate transformation is given by Equation 56, giving Equation 57 for *z* in terms of *ζ*. Using Equation 57 to substitute for *z* in Equation 55, we obtain Equation 58. As shown in Supplementary Appendix C, the definition of *ζ* in Equation 56 causes the neurons to be separated in equal steps of size *δζ* = 1/*k*_{h} along the *ζ*-axis when *h* varies exponentially with *z* according to Equation 23. Thus we have Equation 59. As *δζ* approaches zero, the right-hand side of Equation 59 can be approximated by an integral (Equation 60). The limits of 0 and ∞ on the integral arise because, as before, we assume that the stimulus value *x* is far from the ends of the range of *z* values, and so the limits of *z* are effectively ±∞; from Equation 56, as *z* approaches −∞, *ζ* approaches zero; and as *z* approaches ∞, *ζ* approaches ∞. In Supplementary Appendix G, we derive an expression for the integral in Equation 60. This integral has a finite solution if *q* ln *b* > *m*_{h} + *m*_{r}. The resulting integral approximation approaches the Constant-parameterization expression _{N-R}*I*(*x*, 1) as the *m* parameters approach zero, which is essential because, if the *m* parameters are all zero, then the Exponential parameterization reduces to the Constant parameterization.

*x*: Specifically, when *h* ∝ exp(*m*_{h}*z*) and *r*_{max} ∝ exp(*m*_{r}*z*), the Fisher information is proportional to exp[(*m*_{h} + *m*_{r})*x*]. Secondly, the integral approximation depends on *m*_{h} and *m*_{r} only through their sum: separate changes in *m*_{h} and *m*_{r} have the same effect on the Fisher information as long as *m*_{h} + *m*_{r} is unchanged.

So far, we have assumed that *q* is constant (i.e., *m*_{q} = 0). Allowing *q* to vary with *z* greatly complicates the integral, and we were unable to solve it, so instead we used an approximation. First, we extend the definition of *γ* (Equation 63) to give Equation 65. Here, *q* is treated as a function of *x*, but using the parameters that define how it varies with *z* (Equation 25); however, the approximation is good enough, because the performance for a log contrast of *x* will be dominated by the neurons with *z* close to *x*. If we then use Equation 65 instead of Equation 63 to substitute for *γ* in Equation 61, we obtain a very good approximation of the Fisher information when *q* varies as an exponential function of *z*.

*r*_{0} = 0. We found that increasing *r*_{0} causes an approximately multiplicative attenuation that is close to the factor *Q*(*r*_{0}/*r*_{max}) (Equation 41) derived for the Constant parameterization. Thus, even though the *Q* function was not derived for the Exponential parameterization, we can borrow it to approximate the effect of nonzero *r*_{0} for this parameterization (Equation 66), where *m* is given by Equation 62 and *γ* is given by Equation 65.

*τ*(*x*) using Relation 31. The precision was then used to calculate the Weber fraction *W*, using Relation 33 with *P*_{θ} set to the threshold performance level that had been used in the psychophysical study. The predicted Weber fraction was multiplied by the pedestal level to give the predicted threshold (Δ*f*_{θ} in Figure 3, or Δ*c*_{θ} in Figures 6 and 7). We took the logarithms of these predicted thresholds and found the sum of squared differences between the predicted log thresholds and the log thresholds from the psychophysical data. The simplex algorithm adjusted the model parameters to minimize this sum.

**Figure 3**

*f*_{p} and the frequency difference at threshold is Δ*f*_{θ}, then the plot of Δ*f*_{θ} against *f*_{p} on log-log axes will be a straight line of gradient 1. There is only one degree of freedom in this plot—the vertical position—so we needed only one free parameter to fit the data. Our approach was to hold all the model parameters constant except for the density *h*, and to fit *h* to the psychophysical data.

*ω* to 1.5 octaves, which is close to the median value found physiologically (De Valois et al., 1982); *q* was then calculated from *ω* using Equation 9. We set *r*_{0}/*r*_{max} to 0.03 for all the modeling; the rationale for this choice was that Geisler and Albrecht (1997) found that the median *r*_{0} for a 200-ms stimulus was 0.17 for monkey V1 neurons, which is 0.03 when expressed as a proportion of the median *r*_{max} = 5.7 spikes for neurons tuned to the stimulus. Different columns of panels in Figure 3 show results for different combinations of *σ*_{G} and *r*_{max} (as indicated above each panel in the top row). The values of *σ*_{G} = 0.2 and 0.4 are close to the mean values obtained by Goris et al. (2014) for awake and anaesthetized monkeys, respectively. The lower value of *r*_{max} = 4 was close to the median value (5.7 spikes) reported by Geisler and Albrecht (1997) for a 200-ms stimulus, and the purpose of the higher value (*r*_{max} = 16) was to show how this affects the size of the neuronal correlations and Fano factors.

*h*, we constructed a set of model neurons with *z* equally spaced along the log spatial frequency axis between *z*_{min} = −0.3 and *z*_{max} = 1.7, at spacing *δz* = 1/*h*. Then we performed Monte Carlo simulations. The full details of the Monte Carlo simulations are given in Supplementary Appendix H, but briefly, they were carried out as follows. First, we sampled a large number of points along the stimulus (*x*) axis. For each stimulus value *x*, we used the stochastic spiking model to generate 10,000 sets of spike counts and decoded each set of spike counts to give 10,000 estimated stimulus levels. The decoding precision was then calculated as the reciprocal of the variance of the stimulus estimates (decoding precision for the Known Gain decoder is plotted as blue circles in the top row of panels in Figure 3). The stimulus estimates were also used to simulate a 2AFC discrimination task (described in detail in Supplementary Appendix H). 2AFC trials consisted of two stimulus presentations, each with a different randomly generated gain value *g*. The lower stimulus value was the pedestal, and the higher value was the target. On each 2AFC trial, the model selected as the target the stimulus with the highest estimated value, and we found the proportion of correct responses for each combination of pedestal and target. For each pedestal, we fitted a Weibull psychometric function (May & Solomon, 2013) to the model's proportion-correct data and obtained a discrimination threshold from the fitted function, as described in Supplementary Appendix H. The discrimination thresholds for the Known Gain decoder are plotted as blue circles in the bottom row of panels in Figure 3.
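A scaled-down version of this Monte Carlo check can be sketched as follows (illustrative parameter values and a sigmoidal Naka-Rushton population rather than the paper's exact setup; a grid search stands in for the simplex search):

```python
import numpy as np

# Simulate the doubly stochastic population, decode each trial with the gain
# known, and compare the precision of the estimates with the Fisher-information
# prediction (1 - sigma_g**2) * J(x, 1), using Equation 42 for J(x, 1).
rng = np.random.default_rng(3)
rmax, q, h, b, sigma_g, x_true = 10.0, 2.0, 5.0, 10.0, 0.3, 1.0
lnb = np.log(b)
z = np.arange(-1.0, 3.0 + 1e-9, 1.0 / h)              # tuning-curve centers
grid = np.linspace(0.4, 1.6, 601)
R = rmax / (1.0 + b ** (-q * (grid[:, None] - z)))    # (grid, K) rates at gain 1
R_true = rmax / (1.0 + b ** (-q * (x_true - z)))

n_trials = 1500
g = rng.gamma(1 / sigma_g**2, sigma_g**2, n_trials)   # shared gain, mean 1
counts = rng.poisson(g[:, None] * R_true)             # (trials, K) spike counts
# Known-gain ML: maximize sum_j [n_j log r_j(x) - g r_j(x)] over the grid.
loglik = counts @ np.log(R).T - g[:, None] * R.sum(axis=1)
estimates = grid[np.argmax(loglik, axis=1)]

tau_mc = 1.0 / estimates.var()                        # Monte Carlo precision
tau_pred = (1.0 - sigma_g**2) * rmax * q * h * lnb / 2.0
```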

We predicted the decoding precision from the Fisher information *J*(*x*, 1), using Relation 31 and Equation 36. Table 1 shows that the true decoding precision obtained from the Monte Carlo simulations with known gain differs from the predicted value by less than 0.5%. This close match confirms that the Fisher information gives a sufficiently close approximation of decoding precision to allow us to calculate model performance and gain insights into the relationships between physiology and behavior.

**Table 1**

Thresholds were predicted from the Fisher information *J*(*x*, 1), using Relation 33 with *P*_{θ} = 0.75. It was these predicted thresholds that were used to fit the model to Mayer and Kim's (1986) data in the first place, so it is no surprise that they fit well to Mayer and Kim's data. What is more important is how accurately the thresholds predicted from the Fisher information match those obtained from the Monte Carlo simulations with the Known Gain decoder (compare the red lines against the blue circles in the bottom row of Figure 6).

_{Gauss}*I*(*x*, 1), as defined in Equation 53. The thick gray lines in the bottom row plot the thresholds predicted from _{Gauss}*I*(*x*, 1). The integral approximation of the Fisher information differs from the true Fisher information by less than 1% (see Table 1) and provides a better insight into the relationship between psychophysical performance and the neuronal parameters than we get from *J*(*x*, 1), which is a sum with one term for each neuron.

*σ*_{G} = 0.2, *r*_{max} = 4), the Bivariate decoder's performance was almost identical to the predicted value; even in the Bivariate decoder's worst condition (*σ*_{G} = 0.4, *r*_{max} = 16), its precision was within 6% of the value predicted from the Fisher information. This is remarkably close, considering that the estimate based on the Fisher information assumes that the decoder knows the gain signal on each trial and is performing maximum-likelihood estimation; in reality, the Bivariate decoder does not know the gain signal and is not a maximum-likelihood decoder, as it can only take account of pair-wise statistical dependencies. The true maximum-likelihood decoder for the Unknown Gain situation would almost certainly yield a precision even closer to the predicted value.

**Figure 4**

**Figure 5**

**Figure 6**

**Table 2**

*k*_{h}, which gives the density for a stimulus value of *x* = 0. Each column of Figure 7 uses a different choice of which other parameter(s) to fit. In column A, we fitted *m*_{h}, so that *h* varied with *z*, and set the other *m* parameters (*m*_{r} and *m*_{q}) to zero, so that *r*_{max} and *q* were constant across the neurons (equal to *k*_{r} and *k*_{q}, respectively). In column B, we fitted *m*_{r}, so that *r*_{max} varied with *z*, and set the other *m* parameters (*m*_{h} and *m*_{q}) to zero. In column C, we fitted *m*_{q}, so that *q* varied with *z*, and set the other *m* parameters (*m*_{h} and *m*_{r}) to zero. In column D, we fitted all three *m* parameters, subject to the constraint that *m*_{h} = *m*_{r} = *m*_{q}; thus, there were still only two degrees of freedom in the fit, but *h*, *r*_{max}, and *q* all varied with *z*. In each parameterization, all the parameters that were not fitted were set to reasonable values (see the caption of Figure 7 for these values).

**Figure 7**

*m* parameter values in column D are approximately one third of the values obtained when only one of the three parameters was fitted, indicating that the contributions of all three of these parameters to the Fisher information are additive. The additivity of *m*_{h} and *m*_{r} is exact (they enter the Fisher information only through their sum); the fits suggest that the same holds for *m*_{q}, at least approximately.

_{q}_{N-R}

*J*(*x*, 1) (Equation 34); thick gray curves show the precision and thresholds predicted from the integral approximation of the Fisher information, *I*(*x*, 1) (Equation 66). Thresholds were predicted from the Fisher information using Relation 33 with *P*_{θ} = 1 − 0.5/*e* = 0.816…, which was the performance level that defined the threshold in the study by Meese et al. (2006). These threshold predictions map out almost straight lines on the log-log plots, which fit well to the data from Meese et al.

The integral approximation of the Fisher information was exactly correct only when *r*_{0} = *m*_{q} = 0 (i.e., zero spontaneous firing rate, and *q* constant with respect to *z*). When either *r*_{0} ≠ 0 or *m*_{q} ≠ 0 (as is the case in each panel of Figure 7), the integral was intractable, so we used work-arounds to give an approximate expression. Nevertheless, Table 3 shows that the precision predicted from the integral approximations of the Fisher information never differed by more than about 6% from that predicted from the true Fisher information. Meanwhile, the actual precision from the Known Gain decoder was never more than 6% lower than that predicted from the Fisher information. As with our other simulations of contrast discrimination, the performance of the Unknown Gain decoders was much worse than that of the Known Gain decoder.

**Table 3**

Each model neuron was characterized by the parameters *r*_{0} (spontaneous spike rate), *r*_{max} (maximum increment in spike rate from *r*_{0}), *q* (tuning sharpness), and *z* (position along the stimulus axis). Tuning functions could be sigmoidal (Naka-Rushton) or Gaussian. The population of neurons was characterized by the density *h* of neurons along the stimulus axis. The gain fluctuations (parameterized by the standard deviation *σ*_{G}) had both neuron-specific and population-wide effects. The neuron-specific effects of the gain fluctuations were the Fano factors (Figure 4), and the population-wide effects were the spike-count correlations that resulted from having the neurons share the same gain signal (Figure 5); both effects arise from Equation 12, which describes the covariance matrix for a population of model neurons of this kind.
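A minimal simulation can illustrate both effects. The parameter values below are arbitrary illustrations, and the formulas checked at the end (Fano factor 1 + *σ*_{G}²*r*_{i}, off-diagonal covariance *σ*_{G}²*r*_{i}*r*_{j}) are our restatement of the covariance structure described above, not a quotation of Equation 12:

```python
import numpy as np

rng = np.random.default_rng(0)
rates = np.array([5.0, 20.0])   # mean spike counts of two neurons (arbitrary values)
sigma_g = 0.25                  # SD of the gamma-distributed gain (mean 1)
n_trials = 200_000

# Gamma gain with mean 1 and variance sigma_g**2: shape = 1/sigma_g**2, scale = sigma_g**2
g = rng.gamma(shape=1.0 / sigma_g**2, scale=sigma_g**2, size=n_trials)

# Doubly stochastic spike counts: Poisson rates multiplied by a gain shared across neurons
counts = rng.poisson(g[:, None] * rates[None, :])

# Neuron-specific effect: Fano factor grows with rate as 1 + sigma_g**2 * rate (cf. Figure 4)
fano = counts.var(axis=0) / counts.mean(axis=0)
print(fano, 1 + sigma_g**2 * rates)

# Population-wide effect: shared gain induces covariance sigma_g**2 * r_i * r_j (cf. Figure 5)
cov = np.cov(counts.T)
print(cov[0, 1], sigma_g**2 * rates[0] * rates[1])
```

Conditioned on the gain *g*, each count is Poisson with mean *g·r*, so the law of total variance gives Var = *r* + *σ*_{G}²*r*², hence the rate-dependent Fano factors and positive pairwise correlations.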

When the decoder knows the gain signal *g*, we can derive the Fisher information as if the neurons were statistically independent: we take the Fisher information for the case of *g* = 1 and apply a multiplicative correction. We derived two kinds of expression for the Fisher information. One kind (represented by the letter *J*) consists of a sum with one term for each neuron. The other kind (represented by the letter *I*) approximates this sum using an integral. These integral approximations are much more compact and can help to shed light on the relationships between psychophysical performance and the neuronal parameters.
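The sum-versus-integral distinction can be illustrated with a deliberately simplified case: independent Poisson neurons with Gaussian tuning and zero spontaneous rate, where the integral approximation even has a closed form. The values of `rmax`, `w`, and `h` below are arbitrary illustrations, not the article's.

```python
import math

# Independent Poisson neurons with Gaussian tuning r(x; z) = rmax * exp(-(x - z)^2 / (2 w^2));
# with zero spontaneous rate, each neuron's Fisher information is r'(x)^2 / r(x)
rmax, w, h = 25.0, 1.5, 4.0   # peak rate, tuning width, neurons per unit of z (arbitrary)
x = 0.0

def j_neuron(x, z):
    u = x - z
    return rmax * (u * u / w**4) * math.exp(-u * u / (2 * w * w))

# Exact-style expression J: one term per neuron, neurons spaced 1/h apart along z
J = sum(j_neuron(x, i / h) for i in range(-200, 201))

# Integral approximation I: replace the sum by h * integral of j dz, which here
# evaluates in closed form to h * rmax * sqrt(2*pi) / w
I = h * rmax * math.sqrt(2 * math.pi) / w

print(J, I)
```

With several neurons per tuning width, the sum and the integral agree to many decimal places, which is why the compact integral expressions can stand in for the exact sums.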

In the Constant parameterization, the neuronal parameters do not vary with *z*, and *z* is distributed with constant density along the stimulus axis. Our integral approximations revealed some particularly simple relationships between Fisher information and the neuronal parameters for the Constant parameterization. For both Naka-Rushton and Gaussian tuning functions, if *r*_{0}/*r*_{max} is held constant, the Fisher information is proportional to *r*_{max}*qh* (see Equation 40 for the Naka-Rushton function and Relation 50 for the Gaussian tuning function). For the Naka-Rushton tuning function, the Fisher information is proportional to a decreasing function of the relative spontaneous firing rate *r*_{0}/*r*_{max}, which we call *Q* (see Equation 41 and Figure 2); as *r*_{0} increases, the Fisher information undergoes a multiplicative attenuation that is a function *only* of the ratio *r*_{0}/*r*_{max}. The same effect holds to a very close approximation for the Gaussian tuning function (see Supplementary Appendix F). Another feature of the Constant parameterization is that the Fisher information is approximately constant across the stimulus axis (i.e., it is independent of *x*). Because of this, if *x* is the logarithm of the physical stimulus value, then the performance of the Constant parameterization will obey Weber's law (discrimination threshold proportional to pedestal). We used the exact Fisher information expressions to fit the Constant parameterizations of the model to real psychophysical data that conformed to Weber's law: Mayer and Kim's (1986) spatial frequency discrimination data (Figure 3) and the suprathreshold contrast discrimination data (Figure 6) of Bird et al. (2002). In all cases, the thresholds predicted from the Fisher information expressions gave excellent matches to the thresholds obtained from Monte Carlo simulations.
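The Weber's-law behavior of the Constant parameterization can be checked numerically. The sketch below uses independent Poisson neurons (no gain fluctuations, for simplicity) with a logistic sigmoid in log-stimulus coordinates standing in for the Naka-Rushton function; all parameter values are arbitrary illustrations, not the article's.

```python
import math

# Constant parameterization: every neuron shares the same r0, rmax, and q, and the
# semisaturation positions z are evenly spaced along the (log) stimulus axis
r0, rmax, q, h = 1.0, 30.0, 2.0, 5.0   # spontaneous rate, rate increment, sharpness, density

def logistic(u):
    # numerically stable logistic function
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def rate(x, z):
    # sigmoidal tuning in log-stimulus coordinates x (stand-in for Naka-Rushton)
    return r0 + rmax * logistic(q * (x - z))

def drate(x, z):
    s = logistic(q * (x - z))
    return rmax * q * s * (1.0 - s)

def fisher(x, z_lo=-30.0, z_hi=30.0):
    # Fisher information of independent Poisson neurons: sum of r'(x)^2 / r(x),
    # one term per neuron, neurons spaced at density h along the axis
    spacing = 1.0 / h
    n = int(round((z_hi - z_lo) / spacing)) + 1
    return sum(drate(x, z_lo + i * spacing) ** 2 / rate(x, z_lo + i * spacing)
               for i in range(n))

# J is (nearly) independent of x, so thresholds are constant in log units:
# Weber's law for the physical stimulus value
Js = [fisher(x) for x in (-2.0, 0.0, 1.1)]
print(Js)
```

The three Fisher-information values agree to well within 1%, so the threshold 1/√*J* is constant in log units, i.e., proportional to the pedestal in physical units.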

To accommodate deviations from Weber's law, we allowed *r*_{max}, *q*, or *h* to vary exponentially with *z*. The rates of increase were determined by the parameters *m*_{rmax}, *m*_{q}, and *m*_{h}, respectively. We call this the Exponential parameterization; it is a generalization of the Constant parameterization (the Constant parameterization is the Exponential parameterization with *m*_{rmax} = *m*_{q} = *m*_{h} = 0). The integral approximation of the Fisher information (Equation 66) revealed two features that were not explicit in the exact expressions. Firstly, the Fisher information for the Exponential parameterization is an exponential function of the stimulus level: It is proportional to exp(*mx*), where *m* is the sum of *m*_{rmax}, *m*_{q}, and *m*_{h}. Secondly, the Fisher information depends on these three parameters only through their sum, regardless of their individual values. We also found that, as with the Constant parameterizations, we could closely model the effect of *r*_{0} by multiplying the Fisher information by *Q*(*r*_{0}/*r*_{max}) as defined in Equation 41. We used the exact Fisher information expressions to fit the Exponential Naka-Rushton parameterization to the contrast discrimination data of Meese et al. (2006). The thresholds predicted from the Fisher information expressions gave a good match to the thresholds obtained from Monte Carlo simulations.
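The exp(*mx*) behavior can likewise be checked numerically. The sketch below uses independent Poisson neurons with *r*_{0} = 0 and arbitrary illustrative values of the *k* and *m* parameters; it is our re-derivation of the claimed proportionality, not a transcription of Equation 66.

```python
import math

# Exponential parameterization with r0 = 0: rmax, q, and h each grow exponentially
# with the neuron's position z. All parameter values are arbitrary illustrations.
k_rmax, k_q, k_h = 20.0, 2.0, 1.0
m_rmax, m_q, m_h = 0.15, 0.05, 0.10

def logistic(u):
    # numerically stable logistic function
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def fisher(x, z_lo=-40.0, z_hi=40.0, dz=0.01):
    # J(x) as an integral over the population: h(z) * r'(x;z)^2 / r(x;z) for Poisson
    # neurons with sigmoidal tuning r(x;z) = rmax(z) * s, s = logistic(q(z) * (x - z))
    total = 0.0
    n = int((z_hi - z_lo) / dz)
    for i in range(n):
        z = z_lo + (i + 0.5) * dz
        rmax = k_rmax * math.exp(m_rmax * z)
        q = k_q * math.exp(m_q * z)
        h = k_h * math.exp(m_h * z)
        s = logistic(q * (x - z))
        total += h * rmax * q * q * s * (1.0 - s) ** 2 * dz   # = h * r'^2 / r
    return total

# Prediction: J(x) is proportional to exp(m * x) with m = m_rmax + m_q + m_h,
# so the log-slope of J should match the sum of the three m parameters
slope = (math.log(fisher(2.0)) - math.log(fisher(0.0))) / 2.0
print(slope, m_rmax + m_q + m_h)
```

The measured log-slope matches the sum of the three *m* parameters to within a few percent, regardless of how that sum is split between them.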

The Fisher information for *x* in this case is given by Equation 67, where *r*_{1}(*x*) and *r*_{2}(*x*) are the tuning functions of the two neurons, and *r*′_{1}(*x*) and *r*′_{2}(*x*) are their first derivatives. If the tuning functions are Gaussian, then the slopes *r*′_{1}(*x*) and *r*′_{2}(*x*) will often be opposite in sign and will partially or completely cancel out in the third (subtractive) term of Equation 67. When *x* falls exactly midway between the peaks of two identically shaped Gaussian tuning functions, the third term is zero and the Fisher information is the same as for two independent Poisson-spiking neurons (i.e., equivalent to *σ*_{G} = 0); this gives an insight into why gain fluctuations have so little effect on decoding Gaussian-tuned neurons, even when the gain signal is unknown. On the other hand, the slopes of the Naka-Rushton functions are always positive, so they never cancel out in this way; the third term of Equation 67 therefore generally subtracts more from the Fisher information for Naka-Rushton-tuned neurons than it does for Gaussian-tuned neurons.

In Chirimuuta and Tolhurst's model, *q* = 2, and there was additionally a threshold applied to the output of the Naka-Rushton function. For each of the eight neurons, *r*_{max} and *z* were free parameters, and Chirimuuta and Tolhurst adjusted these 16 parameters by hand to fit their psychophysical data. Our analytical approach allows us to reduce the parameter set and, more importantly, to understand exactly what contribution each parameter makes to the decoding precision. We can then fix most parameters to physiologically plausible values and adjust no more parameters than we need to fit the data (i.e., one parameter for fitting Weber's law and two parameters for fitting the near-miss to Weber's law).

The lower limit of the distribution of *z* (*z*_{min}) would vary greatly between different subpopulations tuned to different spatiotemporal frequency combinations. Even if the distribution of *z* were flat, say, between *z*_{min} and *z*_{max} for each subpopulation, by pooling the subpopulations we would be adding together distributions with different lower limits, creating the graded drop-off that we see in the full population. A similar argument could be made regarding the upper end of the distribution.

*Journal of Neurophysiology*, 48, 217–237.

*Biometrics*, 7, 340–432.

*Network: Computation in Neural Systems*, 3, 213–251.

*Neural Computation*, 4, 559–572.

*Vision Research*, 33, 123–129.

*Neural Computation*, 2, 308–320.

*Neural Computation*, 4, 196–210.

*Nature Reviews Neuroscience*, 7, 358–366.

*Vision Research*, 27, 1915–1924.

*Neuron*, 74, 30–39.

*Journal of the Optical Society of America A*, 19, 1267–1273.

*Journal of the Optical Society of America A*, 20, 1253–1260.

*Vision Research*, 45, 2943–2959.

*Vision Research*, 43, 1983–2001.

*Nature Neuroscience*, 14, 811–819.

*Mathematical methods of statistics*. Princeton, NJ: Princeton University Press.

*Journal of the Optical Society of America A*, 18, 1016–1026.

*Theoretical neuroscience: Computational and mathematical modeling of neural systems*. Cambridge, MA: MIT Press.

*Nature*, 448, 802–806.

*Vision Research*, 22, 545–559.

*Neuron*, 82, 235–248.

*Science*, 327, 584–587.

*Psychological Review*, 96, 267–314.

*Visual Neuroscience*, 14, 897–919.

*Nature Neuroscience*, 17, 858–865.

*Psychological Review*, 120, 472–496.

*Visual pattern analyzers*. New York: Oxford University Press.

*Journal of the Optical Society of America A*, 17, 1899–1917.

*Zeitschrift für Naturforschung C*, 36, 910–912.

*Vision Research*, 21, 457–467.

*Journal of the Optical Society of America*, 70, 1458–1471.

*PLoS ONE*, 8 (10), e74815.

*Journal of Vision*, 15 (6): 10, 1–21, http://www.journalofvision.org/content/15/6/10, doi:10.1167/15.6.10.

*Journal of the Optical Society of America A*, 3, 1957–1969.

*Journal of Vision*, 6 (11): 7, 1224–1243, http://www.journalofvision.org/content/6/11/7, doi:10.1167/6.11.7.

*Journal of Vision*, 8 (11): 9, 1–8, http://www.journalofvision.org/content/8/11/9, doi:10.1167/8.11.9.

*Journal of Physiology*, 185, 536–555.

*The Computer Journal*, 7, 308–313.

*Vision: Coding and efficiency*(pp. 3–24). Cambridge, UK: Cambridge University Press.

*Vision Research*, 46, 4646–4674.

*Journal of the Optical Society of America A*, 16, 647–653.

*Bulletin of the Calcutta Mathematical Society*, 37, 81–89.

*Journal of the Optical Society of America*, 56, 1141–1142.

*Journal of Vision*, 11 (14): 9, 1–13, http://www.journalofvision.org/content/11/14/9, doi:10.1167/11.14.9.

*Journal of Neuroscience*, 28, 12591–12603.

*Journal of Vision*, 10 (14): 19, 1–16, http://www.journalofvision.org/content/10/14/19, doi:10.1167/10.14.19.

*Proceedings of the Royal Society of London B*, 216, 427–459.

*Vision Research*, 23, 495–505.

*Vision Research*, 40, 3145–3157.

*Proceedings of SPIE*, 2179, 127–141.

*Experimental Brain Research*, 60, 559–563.

*Biological Cybernetics*, 38, 171–178.

*Network: Computation in Neural Systems*, 13, 447–456.

*Understanding vision: Theory, models, and data*. Oxford, UK: Oxford University Press.

*PLoS ONE*, 6 (5), e19248.

*Nature*, 370, 140–143.

^{1}In this article, we use the word “trial” in two ways. Firstly, we use it as a physiologist would, to mean a single stimulus presentation. Secondly, we use it to mean a trial in a two-alternative forced-choice (2AFC) psychophysical experiment, in which the observer is presented with two stimuli and has to make a response. To distinguish these two meanings, we always refer to the latter type of trial as a “2AFC trial.”