**Sensitivity to luminance contrast is a prerequisite for all but the simplest visual systems. To examine contrast increment detection performance in a way that approximates the natural environmental input of the human visual system, we presented contrast increments gaze-contingently within naturalistic video freely viewed by observers. A band-limited contrast increment was applied to a local region of the video relative to the observer's current gaze point, and the observer made a forced-choice response to the location of the target (≈25,000 trials across five observers). We present exploratory analyses showing that performance improved as a function of the magnitude of the increment and depended on the direction of eye movements relative to the target location, the timing of eye movements relative to target presentation, and the spatiotemporal image structure at the target location. Contrast discrimination performance can be modeled by assuming that the underlying contrast response is an accelerating nonlinearity (arising from a nonlinear transducer or gain control). We implemented one such model and examined the posterior over model parameters, estimated using Markov-chain Monte Carlo methods. The parameters were poorly constrained by our data; parameters constrained using strong priors taken from previous research showed poor cross-validated prediction performance. Atheoretical logistic regression models were better constrained and provided similar prediction performance to the nonlinear transducer model. Finally, we explored the properties of an extended logistic regression that incorporates both eye movement and image content features. Models of contrast transduction may be better constrained by incorporating data from both artificial and natural contrast perception settings.**

*f* amplitude spectrum slope in natural images. Bex et al. (2007) found that the visual system's contrast response at a given spatial scale is moderated by spectral power at remote spatial scales (cross-scale gain control) in broadband natural scenes, and that the inhibitory influence of the gain pool critically depends on spatial alignment over frequencies (phase coherence). Bex, Solomon, and Dakin (2009) showed that when adapted to natural viewing conditions, sensitivity to contrast at low spatial frequencies is lower than when adapted to a homogeneous field. This suggests that simple stimuli interleaved with blank screens overestimate contrast sensitivity for natural conditions. Furthermore, that study found that detection thresholds were relatively uncorrelated with local root-mean-square contrast; rather, higher local edge density was associated with threshold increases. Sinz and Bethge (2013) showed that including a temporal adaptation component, by which the contrast response normalization depended on the recent ambient contrast level, provided a more efficient code for the contrasts in natural images (in the redundancy reduction sense). They tested this model by simulating eye movements on a natural image database, producing a distribution of both similar ambient contrasts (caused by microsaccades) and large changes in ambient contrast (caused by saccades). Finally, Alam et al. (2014) showed that a contrast gain control model similar to that of Watson and Solomon (1997) predicted detection thresholds for a vertical log-Gabor target embedded in a natural-image patch remarkably well. Interestingly, the model's predictions were poor in image regions containing identifiable objects (including faces and body parts), suggesting the involvement of additional processing relying on feedback from recognition mechanisms. There is therefore scope for continued improvement of contrast-encoding models by the use of more natural stimuli and tasks.

**Figure 1**

**Table 1**

*p*(*θ*|*D*), where *θ* are the parameters in the model and *D* is the data. The posterior distributions also depend on the prior of the parameters *p*(*θ*) and more generally the selected model architecture. Compared to maximum-likelihood estimation, assessing quantitative models in a Bayesian framework holds a number of advantages, several of which we discuss briefly here. First, estimating the parameters of a model from data using maximum-likelihood methods produces a point estimate of the model parameters, which can be complemented with confidence regions and correlations based on distributional assumptions or resampling procedures. In contrast, the posterior distribution over model parameters provides rich information about the structure of the model and how the parameters relate to one another. Second, it allows uncertainty in the parameters to be preserved in all inferences by using multilevel (hierarchical) models, which are easier to estimate within a Bayesian framework than using maximum likelihood. Third, approximating the posterior distribution using sampling methods (see later) is an extremely general approach that offers great flexibility in the types of models that can be estimated efficiently. We include further discussion of our model fitting approach later in Model estimation and comparison.

*R̄* value (the ratio of between- to within-chain variance; see Gelman & Rubin, 1992) as well as by inspection of graphical estimates of convergence (Xavier Fernández i Marín, 2014). More details of the model specifications, including priors and sampling parameters, are provided in the Appendix.
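As a rough illustration of the convergence check described above, the following sketch computes the potential scale reduction factor of Gelman and Rubin (1992) for a single scalar parameter. It is not the authors' code: the function name `gelman_rubin_rhat` and the list-of-lists input format are assumptions made for this example, and the formula is the standard textbook version, which may differ in detail from what the fitting software reports.

```python
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Potential scale reduction factor for one scalar parameter.

    chains: list of equal-length lists of MCMC samples (hypothetical format).
    """
    n = len(chains[0])  # samples per chain
    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances
    w = mean([variance(c) for c in chains])
    # B: between-chain variance, scaled by the chain length
    b = n * variance(chain_means)
    # Pooled estimate of the posterior variance
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5
```

Values close to 1 indicate that the chains are sampling from similar distributions; values well above 1 indicate that the between-chain spread is large relative to the within-chain spread.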

^{1} these distributions and the correlations between them are shown in Figure 2. We consider the relationship between pixel intensity and performance in the Appendix.

**Figure 2**

**Figure 3**

**Figure 4**

*towards* the target (defined here as ±22.5° from the target direction) separately from *other* directions. Figure 5A shows performance as a function of the time from the offset of the previous saccade until the onset of the target. There is little evidence for any relationship between these variables. Conversely, performance appears to depend quite strongly on how soon after the target onset observers made their next saccade. Figure 5B shows performance as a function of the time from the onset of the target until the onset of the next saccade. For saccades towards the target location, trials on which the saccade was initiated around 300 ms after target onset were associated with better performance. Performance then drops before improving again for trials where the next saccade started 1000 ms after the target onset. This may represent trials in which the eye remained relatively still (see Figure 4C). For saccades in other directions, performance seems to slowly decline as the time to saccade initiation increases, reaching its lowest point around 600 ms, before reversing to join with the “towards” condition again by 1500 ms. The relative dip in performance around 600 ms for saccades in both directions may reflect the influence of geotopic mislocalization errors on performance (Dorr & Bex, 2013). Note that the timing data in Figure 5B correspond to the same variable used to split the direction plots in Figure 4D (columns).

**Figure 5**

*intrinsic dimensionality* of the video signal: the number of dimensions over which the signal changes in a certain spatial or temporal scale. For example, a zero-dimensional signal is one with no change in any dimension. A one-dimensional change could refer to a stationary edge or a spatially distributed luminance change over time. A two-dimensional change could refer to a corner: The intersection of two edges causes a change in both the x- and y-dimensions. A three-dimensional change refers to a transient corner (one that appears and then disappears) and can only be inferred from a 3-D (spatiotemporal) signal.

*H* has a value greater than 0, the signal is *at least* intrinsically one-dimensional; if invariant *S* is greater than 0, the signal is at least intrinsically two-dimensional; and if invariant *K* is greater than 0, the signal is intrinsically three-dimensional. We average responses of these geometric invariants over the target patch; loosely, this can be thought of as an index of feature density (where features are edges in space and time). In order to capture features of different sizes, we computed the invariants on an anisotropic spatiotemporal pyramid with six spatial and three temporal scales (i.e., over the spatiotemporal movie signal). As for the contrast statistics, the invariants were computed on the unmodified video signal within a 2° × 2° region centered on the target location, linearly averaged over 14 frames of the video signal centered on the target's contrast peak. We additionally computed 2-D invariants (i.e., for static movie frames) and also invariants at the nontarget locations, but we do not present them in this article. They are provided in the data set online.

*K* computed over the three-dimensional video signal at three spatial scales and over a moderate temporal window (160 ms). The relationship between (log) feature intensity and performance for these invariants is shown in Figure 6. At all spatial scales, performance increases as the feature value increases above very low values. For invariant *K* at coarse (0.375–0.75 cpd) and fine (12–24 cpd) spatial scales, the relationship between the feature value and performance then plateaus or even begins to decrease over the range of the bulk of the data. For three-dimensional change over a moderate spatial scale (1.5–3 cpd), the relationship between performance and log feature intensity continues to increase approximately linearly over the range of the data.

**Figure 6**

**Figure 7**

*ϕ* is the unit normal density function, Φ is the cumulative normal distribution function, and *m* is the number of alternatives in the forced-choice task (Hacker & Ratcliff, 1979). To avoid needing to evaluate this integral at every step of the MCMC process, we use a Weibull approximation to this function (see Appendix).
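For readers who want to see the integral itself, the sketch below evaluates the standard signal-detection expression for proportion correct in an *m*-alternative forced-choice task, P(correct) = ∫ ϕ(x − d′) Φ(x)^(m−1) dx, by simple trapezoidal integration. We assume this is the form referred to above; the function names, integration limits, and step count are choices made for this example and are not taken from the article.

```python
import math


def norm_pdf(x):
    """Unit normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Cumulative unit normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pc_mafc(d_prime, m=4, steps=4000):
    """Proportion correct for m-AFC given d', via the trapezoid rule."""
    # Integrate over a range that covers both the signal and noise densities.
    lo = min(0.0, d_prime) - 8.0
    hi = max(0.0, d_prime) + 8.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        f = norm_pdf(x - d_prime) * norm_cdf(x) ** (m - 1)
        total += f * (0.5 if k in (0, steps) else 1.0)
    return total * h
```

With `d_prime = 0` the function returns chance performance (1/*m*), which is a quick sanity check on any approximation substituted for it.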

*c* is the contrast at the target location and spatial band and *R* is the output response. The function has four parameters. The parameter *z* is the semisaturation point, in effect determining the horizontal position of the curve with respect to contrast. If *z* is 0, then there is no accelerating portion of the nonlinearity. The parameter *rmax* is a scaling factor that determines the maximum absolute response. The shape of the curve is determined by *p* and *q* (accelerating and decelerating portions). If *q* is 0, the function is a familiar sigmoidal psychometric function with a 50% point of *z* and a maximum asymptotic value of *rmax*; *p* determines the steepness of the function. If *q* is greater than 0, the response continues to increase as a function of contrast with an exponent of *q* once contrast is greater than the threshold (*z*). The four parameters are related to one another in nontrivial ways (shown analytically by Haun, 2009; Yu, Klein, & Levi, 2003).
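The verbal description of the four parameters can be made concrete with a short sketch. Equation 4 is not reproduced in this text, so the functional form used here, R(c) = rmax · c^(p+q) / (c^p + z^p), is an assumption: it is the commonly used form that matches every property listed above (a sigmoid with half-maximum at *z* when *q* = 0, a pure power law with exponent *q* when *z* = 0). Taking d′ as the difference between the pedestal-plus-increment response and the pedestal response is likewise a standard assumption, and the function names are hypothetical.

```python
def transducer_response(c, p, q, z, rmax):
    """Assumed nonlinear transducer: rmax * c^(p+q) / (c^p + z^p).

    Requires c > 0 or z > 0 so that the denominator is nonzero.
    """
    return rmax * c ** (p + q) / (c ** p + z ** p)


def d_prime(c_ped, alpha, p, q, z, rmax):
    """Assumed decision variable: response difference between the
    incremented contrast (pedestal contrast times alpha) and the pedestal."""
    r_ped = transducer_response(c_ped, p, q, z, rmax)
    r_ped_inc = transducer_response(c_ped * alpha, p, q, z, rmax)
    return r_ped_inc - r_ped
```

For example, with `q = 0` and `c == z` the response is exactly half of `rmax`, and with `alpha = 1` (no increment) `d_prime` is 0.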

*p* and *q* are often set to be around 2 and 0.4, respectively (e.g., Alam et al., 2014; Haun & Peli, 2013). These values give the dipper shape (see Solomon, 2009, for a broad tutorial) that provides a good fit to the facilitation effect found robustly in classical contrast increment detection tasks using sinusoidal gratings, as well as for broadband noise stimuli (Henning & Wichmann, 2007), 1/*f* noise patterns (Haun & Essock, 2010), and static natural images (Bex et al., 2007). Note that the exponent values depend on task conditions (Haun & Essock, 2010; Kwon et al., 2008; Meese & Holmes, 2002; Wichmann, 1999; Yu et al., 2003). For example, Wichmann (1999) found that the exponents depended strongly on the exposure duration for grating stimuli. We do not consider these effects further here.

*i*: *p*_{i}, *q*_{i}, *z*_{i}, and *rmax*_{i}. The outcome of each trial (correct or incorrect) was assumed to be a Bernoulli random variable with probability given by our Weibull link function (Equation A2). The contrast *c* in Equation 4 used to calculate *R*_{ped} (the pedestal response) was the band energy at the target location in the target spatial frequency band, and the contrast for *R*_{ped+inc} was the *R*_{ped} contrast multiplied by the alpha level from the trial.

*q* for S1 and S2 (the subjects with the most data) have modes located at approximately 0.3 and 0.5; similarly, the posteriors for *rmax* appear strongly bimodal. In addition, posterior distributions for *p* are highly skewed, with the bulk of the distribution tending towards 0 but with a large range. These bimodalities are driven by correlations between the parameters of the model, a point that we return to later (Figure 9).

**Figure 8**

**Figure 9**

*rmax* was a uniform distribution ranging 0–100), in Transducer B the priors for *p*, *q*, and *rmax* are extremely tightly centered over values of 2, 0.4, and 10, respectively. Correspondingly, in this model the posterior distributions for these three parameters remain similar to the priors (i.e., the data do not have great influence on the strong priors). Only the prior on *z* (uniform 0–1) is the same as in Transducer A. Compared to the posterior for *z* in Transducer A, the posterior for Transducer B is very tightly constrained; this is driven by the data and the fact that the other parameters are also relatively fixed.

*d*′ values of 1, 1.5, and 2.5. These curves are equivalent to threshold-versus-contrast (TvC) functions. For Transducer A it can be seen that the TvC functions rise smoothly as a function of the pedestal contrast. This is a simple masking function: As the pedestal contrast increases, a larger increment is required to reach the same level of performance. In contrast, Transducer B exhibits the characteristic dipper shape, in which performance first *improves* relative to very low pedestal contrasts (i.e., detection) before rising sharply into a masking curve. The TvC functions for Transducer B are shaped this way because our strong priors force them to be. Interestingly, the dipper occurs at a high pedestal contrast relative to the range of the data, and the model over the bulk of the data (indicated by the red and blue density contours) shows a very different shape to Transducer A.

*z* and *rmax*, which do show evidence of a moderate positive correlation. This is because the *z* parameter was the only one given a broad prior, and its influence on the contrast response profile is traded off against *rmax*.

*η* is given by a weighted sum of the predictors, where the *β* values are weights and the *x*_{n} values are the individual predictors. This can also be described in matrix notation as *η* = *β*^{T}**X**, where rows in **X** are trials. The linear predictor is then passed through a modified inverse logistic link function in which *γ* is a lower bound of performance, in this case 0.25. The key difference between the GLM considered here and the earlier nonlinear transducer models is that the calculation of the internal response *R*, involving several exponents and a division (Equation 4), has been replaced with the simpler linear combination of predictor variables.
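The two steps of the GLM (a linear combination of predictors followed by a lower-bounded logistic link) can be sketched as follows. The display equations are not reproduced in this text, so the exact expressions are assumptions: the link is written in the usual form γ + (1 − γ) / (1 + exp(−η)), and the function names are hypothetical.

```python
import math


def linear_predictor(betas, xs):
    """Weighted sum of predictor values for a single trial.

    Include a leading 1.0 in xs if betas contains an intercept term.
    """
    return sum(b * x for b, x in zip(betas, xs))


def p_correct(eta, gamma=0.25):
    """Assumed modified inverse logistic link with lower bound gamma."""
    return gamma + (1.0 - gamma) / (1.0 + math.exp(-eta))
```

With `gamma = 0.25`, very negative values of `eta` map to chance performance in a four-alternative task, `eta = 0` maps to 0.625, and large positive values approach 1.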

*single-level GLM*). Any pedestal or increment contrast values of 0 were set to the minimum nonzero value, and the log of the contrast was then taken. The model fitted to the trials of each subject *i* is then given by *η*_{i} = *β*_{i,0} + *β*_{i,1} ped + *β*_{i,2} inc, where ped is a vector of the log pedestal contrast on all the trials from subject *i*, inc is the log increment contrast on each trial, the first coefficient *β*_{i,0} is the intercept, *β*_{i,1} and *β*_{i,2} are the slopes of ped and inc, respectively, and *η*_{i} is a vector of linear predictor values for each trial. We then normalized the design matrix by subtracting the mean and dividing by the standard deviation (i.e., z-scores were computed). Normalizing the predictors makes the model easier to interpret because the intercept represents the level of performance when pedestal and increment contrast were at their mean in the data, and the magnitudes of the coefficients are comparable since they are based on z-scores.^{2}
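The preprocessing of the contrast predictors described above (replace zeros with the smallest nonzero value, take the log, convert to z-scores) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the function name is hypothetical, the natural log is used, and the sample standard deviation is used for normalization, neither of which is specified in the text.

```python
import math
from statistics import mean, stdev


def log_zscore(values):
    """Log-transform nonnegative contrasts and normalize to z-scores."""
    # Zeros cannot be logged, so replace them with the minimum nonzero value.
    floor = min(v for v in values if v > 0)
    logged = [math.log(max(v, floor)) for v in values]
    mu = mean(logged)
    sd = stdev(logged)  # assumption: sample (n - 1) standard deviation
    return [(x - mu) / sd for x in logged]
```

After this transformation a value of 0 corresponds to the mean log contrast in the data, which is why the intercept can be read as performance at the mean pedestal and increment.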

*multilevel* extension of this model, such that each subject is considered as part of a population and the individual-subject *β* coefficients are estimated concurrently with the mean and variance of the coefficients at the population level. Multilevel (hierarchical) models have the desirable property of creating *shrinkage* (also called *partial pooling*) between parameter estimates: Each subject's coefficient estimates are influenced by those of the other subjects to the extent suggested by the data, via the population variance term (Gelman, 2006; Gelman & Hill, 2007; Kruschke, 2011). Observer-level parameters with higher variance (greater uncertainty) will be pulled more strongly towards the center of the population distribution than observer-level parameters with low variance. This is a conservative influence on inference, since it can reduce false alarms. The multilevel models considered here are a more general version of mixed models, which have recently been advocated for analyzing psychophysical data (Knoblauch & Maloney, 2012; Moscatelli, Mezzetti, & Lacquaniti, 2012). The multilevel model is somewhat like estimating separate regressions for each subject, then conducting another regression on the individual-subject coefficients to derive population estimates, except that here all these parameters are estimated concurrently. In addition to the individual-subject *β*_{i,j} terms in Equation 7, the multilevel model has a population mean *μ*_{j} and standard deviation *σ*_{j} for each predictor variable *j*, whose posterior distributions are also estimated.
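The shrinkage behavior described above can be illustrated with the textbook normal-normal case, in which the partially pooled estimate is a precision-weighted average of a subject's own estimate and the population mean. This is a simplification for intuition only: in the multilevel model the population mean and standard deviation are themselves estimated from the data by MCMC, whereas here they are treated as known, and the function name and arguments are hypothetical.

```python
def partially_pooled(estimate, se, pop_mean, pop_sd):
    """Precision-weighted compromise between a subject-level estimate
    (with standard error se) and the population mean (with spread pop_sd)."""
    precision_subject = 1.0 / se ** 2
    precision_population = 1.0 / pop_sd ** 2
    weight = precision_subject / (precision_subject + precision_population)
    return weight * estimate + (1.0 - weight) * pop_mean
```

A subject whose coefficient is estimated precisely (small `se`) keeps an estimate close to its own data, whereas an uncertain estimate (large `se`) is pulled towards `pop_mean`.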

*β* is a normal distribution with mean 0 and standard deviation 2. Since the predictors are in standard units, this represents a broad a priori range of slope values. The two GLMs are developed in more detail in the Appendix.

**Figure 10**

^{3} The predictions of these atheoretical models do not make much sense outside the range of the data. For example, they predict that as the pedestal contrast approaches zero, the increment contrast required for a given level of performance also approaches zero. This is not realistic, since we know that the contrast increment detection threshold is appreciably above zero; that is, it must saturate at some lower bound. Nevertheless, note that within the range of the data of the present experiment (density contours in Figure 10C), the iso-performance contours are similar to those of Transducer A in Figure 8C.

**Figure 11**

^{4} The predicted proportion correct was the posterior mean, calculated by generating a prediction for each MCMC sample and then taking the mean over samples for each trial.

^{1}= 2). The average log likelihood is a more continuous measure of prediction performance than the area under the ROC, since it measures the distance of the data from the prediction rather than just the sign. Note that the baseline model performs at 0.5 for the area under the ROC curve metric because it provides no information on whether individual trials lie above or below the mean.
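The two prediction metrics discussed here can be sketched as follows, given a list of binary outcomes and a list of predicted probabilities of a correct response. The function names are hypothetical, and the use of base-2 logarithms (so that the likelihood is expressed in bits) is an assumption made for this example. The area under the ROC curve is computed from its rank interpretation: the probability that a randomly chosen correct trial received a higher prediction than a randomly chosen incorrect trial, with ties counted as one half.

```python
import math


def mean_log_likelihood(outcomes, predictions):
    """Average Bernoulli log likelihood (base 2) of the observed outcomes."""
    total = 0.0
    for y, p in zip(outcomes, predictions):
        total += math.log2(p if y else 1.0 - p)
    return total / len(outcomes)


def roc_auc(outcomes, predictions):
    """Area under the ROC curve via pairwise comparison of predictions."""
    pos = [p for y, p in zip(outcomes, predictions) if y]
    neg = [p for y, p in zip(outcomes, predictions) if not y]
    wins = 0.0
    for a in pos:
        for b in neg:
            wins += 1.0 if a > b else 0.5 if a == b else 0.0
    return wins / (len(pos) * len(neg))
```

A model that predicts the same probability on every trial, such as a baseline that always predicts the mean, produces only ties and therefore an area of exactly 0.5, whereas its average log likelihood still depends on how close that constant is to the observed proportion correct.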

**Figure 12**

*K* calculated at a midrange spatial scale and across a midrange temporal window (the 1.5–3 cpd condition in Figure 6), the cumulative eye-movement distance during target presentation (Figure 4C), and the spatial band of the target (treated here as a factor, which therefore has five levels). In addition, the model contains an intercept term (which, when the predictors are normalized, is the intercept for the first level of target spatial band). These features yield 10 subject-level regression coefficients, of which four are the slopes of continuous covariates (pedestal, increment, invariant *K*, and cumulative eye-movement distance). These continuous predictors are first converted to log units as before (replacing any 0 values with the minimum nonzero value), then normalized into z-scores to make the coefficients more interpretable. The remaining coefficients are offsets that cause the regression surface to shift without changing its shape.

*K* is negative, indicating that as invariant *K* increases (as the target location grows in edge density), the probability of success decreases. This relationship has the opposite sign to the positive relationship observed for the same data (the 1.5–3 cpd condition) in Figure 6. This is because here the pedestal contrast is also included in the model, whereas Figure 6 considers only the univariate relationship between invariant *K* and the response. We return to this point in the Discussion.

**Figure 13**

*K*). Performance decreases as pedestal increases, and decreases as edge density increases. This provides another view onto the model surface, which we return to in the Discussion.

**Figure 14**

*p* is strongly constrained to be around 2, then the *rmax* parameter will not be much greater than 9 (see Figure 9, top right panel). This result could be anticipated from the dependencies between parameters in the model (Haun, 2009; Yu et al., 2003). In additional testing using simulated data and maximum-likelihood fitting (available upon request), we found that wildly different combinations of these four parameters can lead to similar likelihood estimates. Unless parameters are regularized and constrained using prior information, model parameters could be unstable but produce little variation in overall predictive performance: The unconstrained models are not uniquely identifiable from our data.

*c*; Nachmias & Sansbury, 1974; see also Foley & Legge, 1981; Wichmann, 1999), meaning that the shape of the TvC function depends on the performance level defined as threshold (see also García-Pérez & Alcalá-Quintana, 2007). Wichmann (1999) suggests that differentiating the models depends on using information over the entire psychometric function (see also Green, 1960). This result implies that researchers seeking to compare models of contrast processing should collect full psychometric functions for each pedestal contrast, rather than relying on adaptive methods that seek to estimate only the threshold (and do not constrain the slope).

*K* (loosely, a measure of edge density) was positively correlated with task performance: As edge density increased, so did performance (Figure 6). However, when this predictor was included in the expanded GLM, its coefficient became reliably negative. That is, as edge density increased, performance decreased.

*at a given level of pedestal contrast*. This result corroborates that shown by Bex et al. (2009), who found that threshold contrast for increment detection in static natural images was higher (i.e., sensitivity decreased) as the local edge density increased. Note, however, that this comparison must be treated cautiously, since here the pedestal contrast and the local edge density are correlated. Trials in which the local edge density was low also tended to have lower pedestal contrasts. Therefore, differences in these estimates may be driven in part by a lack of data at high pedestal contrasts when edge density is low, and vice versa. Similarly, it is possible that edge density is related to the correlation in contrast across spatial scales, in that image patches with more edges show higher cross-scale contrast correlations. In a gain-control model, higher cross-scale contrast correlations would produce stronger masking, potentially accounting for the effect of edge density here.

^{5}

*does* change across spatial frequencies. A more complete model would account for this. The GLM also does not include many of the predictors we found to influence performance, such as the direction and timing of saccades relative to the target (Figures 4D and 5). More fundamentally, the GLM is an atheoretical model that makes only indirect assumptions about mechanisms of the early visual system and takes preprocessed feature vectors rather than image sequences as its input. On the other hand, GLMs can be extended to a multilevel framework (with a population level over subjects, as we have done here) relatively uncontroversially, whereas it is less clear how to specify population-level hyperpriors over the parameters of the nonlinear transducer. Overall, we believe it is useful to present the GLM here to encourage further exploration and model comparison for this and similar data sets.

^{6}

*f* (i.e., closer on average to the natural scenes used in our experiment)? While Goris et al. do not examine responses of their model to 1/*f* noise, the fact that the dipper returns in low-pass noise (Henning & Wichmann, 2007) suggests that 1/*f* noise might also be expected to show a dipper effect, because the responses of channels sensitive to high frequencies are less masked in 1/*f* noise relative to low-frequency channels and so could still be informative about the presence of the signal. This is indeed what Haun and Essock (2010) report: The dipper effect remains in 1/*f* noise, albeit its depth is reduced and the masking part of the function is shallower relative to narrowband noise.

*The model*, for a Bayesian, is the combination of the prior distribution and the likelihood, each of which represents some compromise among scientific knowledge, mathematical convenience and computational tractability. … We make some assumptions, state them clearly, see what they imply, and check the implications” (2012, p. 20). We treat prior distributions not as subjective measures of personal beliefs but as a way to condition and restrict maximum-likelihood estimates similar to, for example, regularized regression (Hastie, Tibshirani, & Friedman, 2009). This approach allowed us to demonstrate the interdependence of the parameters in the models and test the effect of imposing stronger constraints on the parameters within the same modeling framework (see, e.g., Figure 9). We believe the flexibility and computational power of this approach offer many opportunities for useful application in vision science.

*fit* with *parsimony* and by extension generalizability. A Bayes factor is the ratio of the marginal likelihoods of two models and includes a natural preference for parsimony, since more complex models are penalized for spreading their prior mass over more parameters (dimensions of the parameter space).

*m*-alternative forced-choice case) pieces of sensory evidence. Bex et al. (2007) provide a direct analogue of this scenario for detection in (static) natural scenes. In our analysis, we define the pedestal as the contrast in the unmodified video sequence at the target location. In contrast to the classical case, in our experiment the observer is never shown the pedestal-only interval, and the pedestals in the nontarget locations are not the same as that in the target location. Observers' judgments in our task could be conceived as a comparison to an internal standard for “naturalness.” The observer sees four intervals (possible target locations) and compares each one to what he or she expects the natural contrast in that location to be. One decision strategy is to pick the location that most violates the observer's expectation. Since contrast in natural scenes is correlated over space (Simoncelli, 1997; Tkačik et al., 2011; Zetzsche et al., 1993), the expectation could be based on the scene information surrounding the patch: The surrounding image structure may act as a spatially extended pedestal. If this is the case, we think it is reasonable to employ the analyses for classical contrast discrimination experiments as we have done here, even though the observer models are not the same.

*Proceedings of the Conference on Pattern Recognition and Image Processing* (pp. 218–223). Los Angeles: IEEE Computer Society Press.

*IEEE Transactions on Automatic Control*, 19 (6), 716–723.

*Vision Research*, 40 (19), 2661–2669.

*Vision Research*, 43 (24), 2527–2537.

*Optics Express*, 7 (4), 155–165.

*Journal of the Optical Society of America A*, 19 (6), 1096–1106.

*The Journal of Physiology*, 203, 237–260.

*Perception & Psychophysics*, 39 (2), 87–95.

*The Journal of Physiology*, 197 (3), 551–566.

*Vision Research*, 31 (6), 983–998.

*EURASIP Journal on Image and Video Processing*, 2009, 1–22.

*Journal of Neuroscience*, 24 (31), 6991–7006.

*Vision Research*, 36 (12), 1827–1837.

*On modeling data from visual psychophysics: A Bayesian graphical model approach*(Doctoral dissertation). Technische Universitaet Berlin.

*Journal of Neuroscience*, 33 (3), 1211–1217.

*Journal of the Optical Society of America A*, 14 (9), 2406–2419.

*Journal of the Optical Society of America A*, 4 (12), 2379–2394.

*Journal of the Optical Society of America A*, 11 (6), 1710–1719.

*Vision Research*, 21 (7), 1041–1053.

*Vision Research*, 46 (10), 1585–1598.

*Nature Neuroscience*, 14 (9), 1195–1201.

*Spatial Vision*, 20 (1–2), 5–43.

*Annual Review of Psychology*, 59 (1), 167–192.

*Vision Research*, 32 (8), 1409–1410.

*Visual Neuroscience*, 26 (1), 109–121.

*Technometrics*, 48 (3), 432–435.

*Data analysis using regression and multilevel/hierarchical models*. New York: Cambridge University Press.

*Statistics and Computing*, 24 (6), 997–1016.

*Statistical Science*, 7 (4), 457–472.

*British Journal of Mathematical and Statistical Psychology*, 66 (1), 8–38.

*Psychological Review*, 120 (3), 472–496.

*Vision Research*, 11 (3), 251–259.

*Vision Research*, 18 (7), 815–825.

*The Journal of the Acoustical Society of America*, 32 (10), 1189–1203.

*d*′ for *M*-alternative forced choice. *Attention, Perception & Psychophysics*, 26 (2), 168–170.

*Vision Research*, 60, 101–113.

*The elements of statistical learning*. New York: Springer.

*Contrast sensitivity in 1/f noise*(Doctoral dissertation). University of Louisville, Kentucky.

*Visual Neuroscience*, 9 (2), 181–197.

*Sitzberichte der kaiserlichen Akademie der Wissenschaften in Wien*. *Mathematisch-naturwissenschaftliche Klasse*, 79, 137–154.

*Journal of Machine Learning Research*, 15 (Apr), 1593–1623.

*Journal of the American Statistical Association*, 90 (430), 773–795.

*Journal of the Optical Society of America A*, 1 (1), 107–113.

*Modeling psychophysical data in R*. New York: Springer.

*Doing Bayesian data analysis*. Burlington, MA: Academic Press/Elsevier.

*International Journal of Computer Vision*, 41 (1–2), 35–59.

*Bayesian cognitive modeling: A practical course*. Cambridge, UK: Cambridge University Press.

*Journal of the Optical Society of America A*, 70 (12), 1458–1471.

*Journal of the Optical Society of America A*, 22 (10), 2013–2033.

*Nature Neuroscience*, 8 (12), 1690–1697.

*ACM Transactions on Graphics*, 22, 896–907.

*Advances in Psychology Research, Vol. 34* (pp. 51–88). New York: Nova Science Publishers.

*Vision Research*, 42 (9), 1113–1125.

*Proceedings of the Royal Society B: Biological Sciences*, 274 (1606), 127–136.

*Proceedings of the Royal Society of London: B*, 216 (1204), 335–354.

*Vision Research*, 14 (10), 1039–1042.

*Proceedings of the IEEE*, 69, 529–541.

*Statistical Science*, 22 (1), 59–73, http://doi.org/10.1214/088342307000000014.

*Journal of the Optical Society of America A*, 7 (10), 2032–2040.

*Digital video and HDTV*. San Francisco: Morgan Kaufmann.

*Sociological methodology*(pp. 111–196). Cambridge, MA: Blackwell.

*Network: Computation in Neural Systems*, 10, 341–350.

*Nature*, 387 (6630), 281–284.

*Journal of Neurophysiology*, 108 (1), 324–333.

*Cognitive Science*, 28, 259–87.

*The Annals of Statistics*, 6 (2), 461–464.

*Proceedings of the 31st Asilomar Conference on Signals, Systems and Computers, Vol. 1*( pp. 673–678). Location: IEEE Computer Press.

*Bioinformatics*, 21 (20), 3940–3941.

*PLoS Computational Biology*, 9 (1), e1002889.

*Attention, Perception & Psychophysics*, 71 (3), 435–443.

*Vision Research*, 14 (12), 1409–1420.

*Network: Computation in Neural Systems*, 10 (2), 123–132.

*PLoS ONE*, 6 (6), e20409.

*Vision Research*, 37 (23), 3203–3215.

*Proceedings of the Royal Society of London*.

*Series B: Biological Sciences*, 265 (1412), 2315–2320.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 34 (6), 1080–1091.

*NeuroReport*, 10 (9), 1811–1816.

*Journal of the Optical Society of America A*, 14 (9), 2379–2391.

*Some aspects of modelling human spatial vision: Contrast discrimination*(Doctoral dissertation). University of Oxford.

*Vision Research*, 46 (8), 1520–1529.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*, 73 (1), 3–36.

*Vision Research*, 30 (7), 1111–1117.

*Digital images and human vision*(pp. 109–138). Cambridge, MA: MIT Press.

^{1}Note that these are standard deviations, so in the contrast-processing model presented in Equation 4, *p* = 2 is a type of energy detector.

^{2}Note that this means we are estimating more parameters (the mean and variance for each predictor), so it is not the case that this model has only three parameters where the nonlinear transducers have four. Nevertheless, we believe the gain in interpretability is worth the extra complexity, which is standard practice when employing GLMs.

*p*, *q*, *z*, and *rmax* (see Equation 4). *z* was given a uniform prior bounded [0, 1], and *rmax* was given a uniform prior bounded [0, 100]. Parameter *p* was given a Gaussian (normal) prior with mean 2 and standard deviation 1, and parameter *q* was given a normal prior with mean 0.4 and standard deviation 0.2. That is, the prior standard deviations were set to the mean divided by 2. The means of these parameters were picked to conform to the values commonly used in the literature to produce the dipper effect, whereas the standard deviations were picked to ensure that the priors were only weakly centered over the means, giving the data the opportunity to influence the model. The lower asymptote *γ* was fixed at chance performance, 0.25.

*z*, which retained the same [0, 1] uniform distribution. The parameters were bounded as in Transducer A. The prior over *rmax* was a normal distribution with mean 10 and standard deviation 0.1, the prior over *p* was normal with mean 2 and standard deviation 0.02, and the prior over *q* was normal with mean 0.4 and standard deviation 0.004. That is, the prior standard deviations were set to the mean divided by 100, which we intended to strongly constrain the posterior distribution around the mean values. While the value of 100 was picked arbitrarily, the results show that we achieved our intended target.
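To make the contrast between the weak and strong prior settings concrete, the following is a minimal Python sketch of the two log-prior densities as described in the text. The function names are hypothetical, any bounds on *p* and *q* are not stated in this section and are therefore omitted, and the truncation of the normal prior on *rmax* is applied without renormalizing. It is an illustration of the stated prior settings, not the authors' sampling code.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and SD."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)


def uniform_logpdf(x, lo, hi):
    """Log density of a uniform distribution on [lo, hi]."""
    return -math.log(hi - lo) if lo <= x <= hi else -math.inf


def log_prior_a(p, q, z, rmax):
    """Weak priors (Transducer A): SD = mean / 2 for p and q."""
    return (normal_logpdf(p, 2.0, 1.0)
            + normal_logpdf(q, 0.4, 0.2)
            + uniform_logpdf(z, 0.0, 1.0)
            + uniform_logpdf(rmax, 0.0, 100.0))


def log_prior_b(p, q, z, rmax):
    """Strong priors (Transducer B): SD = mean / 100 for rmax, p and q."""
    if not 0.0 <= rmax <= 100.0:  # same bounds as Transducer A
        return -math.inf
    # Truncation of the rmax prior is not renormalized (negligible here).
    return (normal_logpdf(p, 2.0, 0.02)
            + normal_logpdf(q, 0.4, 0.004)
            + uniform_logpdf(z, 0.0, 1.0)
            + normal_logpdf(rmax, 10.0, 0.1))
```

Evaluating both functions at the prior means and then at a value slightly away from them (for example *p* = 2.5) shows how sharply the strong priors penalize departures that the weak priors barely register.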

*β*_{i,j} for subject *i* and predictor *j* was bounded to [−5, 5] and given a normal distribution prior with a mean of 0 and a standard deviation of 2. These values were chosen to represent quite weak priors centered over 0 (i.e., no effect). Note that the predictors were standardized (z-transformed) after taking their log, so these coefficients represent the weighting of standardized log units.
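The standardization step can be sketched as follows. This is a minimal illustration under assumptions: the function name is hypothetical, the natural log is used, and the sample standard deviation is used for scaling, neither of which is specified in the text.

```python
import math
import statistics


def standardize_log(values):
    """Log-transform positive predictor values, then z-score the logs."""
    logs = [math.log(v) for v in values]
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)  # sample SD: an assumption, not stated in the text
    return [(x - mean) / sd for x in logs]
```

After this transformation, a coefficient of 1 corresponds to a change of one standard deviation in the log predictor, which is what "standardized log units" refers to.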

*β*_{i,j} was unbounded. The population-level mean for each predictor variable *j* is denoted *μ*_{j}, and the population standard deviation for each predictor variable is denoted *σ*_{j}. These hyperparameters were bounded between [−5, 5] and [0, 10], respectively. The population means *μ*_{j} were given normal distribution priors with a mean of 0 and a standard deviation of 1. The standard deviation *σ*_{j} priors were half-Cauchy distributions with location 0 and scale 1. These values were picked to make the subject-level estimates close to one another, since the bulk of the prior density is centered over 0 (i.e., no variance between subjects). This represents a conservative assumption, common in psychophysics, that the subjects do not differ greatly. We feel this is appropriate here, since we have only five subjects and the number of trials from each subject varies.

*β*_{i,j} is given by *β*_{i,j} = *μ*_{j} + *σ*_{j}*ε*_{i,j}, where *ε*_{i,j} is the offset for each subject and each coefficient. Each *ε*_{i,j} was given a unit normal prior distribution.
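As an illustration of how these hierarchical priors combine, the sketch below draws one predictor's subject-level coefficients from the prior. It assumes the usual non-centered form, in which each coefficient is the population mean plus the population standard deviation times a unit-normal offset. The function names are hypothetical, and the bounds are enforced by simple rejection sampling rather than by whatever the original sampler did.

```python
import math
import random


def sample_truncated(draw, lo, hi):
    """Rejection-sample from draw() until the value falls in [lo, hi]."""
    while True:
        x = draw()
        if lo <= x <= hi:
            return x


def sample_prior_coefficients(n_subjects, rng):
    """Draw one predictor's subject-level coefficients from the prior."""
    # Population mean: normal(0, 1), bounded to [-5, 5].
    mu = sample_truncated(lambda: rng.gauss(0.0, 1.0), -5.0, 5.0)
    # Population SD: half-Cauchy(location 0, scale 1), bounded to [0, 10].
    # The absolute value of a standard Cauchy draw is half-Cauchy.
    sigma = sample_truncated(
        lambda: abs(math.tan(math.pi * (rng.random() - 0.5))), 0.0, 10.0)
    # Unit-normal offsets, one per subject.
    eps = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
    # Assumed non-centered form: beta = mu + sigma * eps.
    beta = [mu + sigma * e for e in eps]
    return mu, sigma, eps, beta
```

Calling `sample_prior_coefficients(5, random.Random(1))` gives one prior draw for five subjects. Because the half-Cauchy places much of its mass near zero, many draws yield coefficients that cluster tightly around the population mean.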

*λ* and shape *k*. These parameters were set to 1.545903 and 1.481270, respectively, by minimizing the squared difference between the Weibull curve and Equation 2. The Weibull curve offers a reasonable approximation over the range of informative *d*′ values (see Figure A1).
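The quality of this approximation can be checked numerically. The sketch below rests on two assumptions that are not confirmed in this section: that the Weibull is a cumulative Weibull rising from the chance level of 0.25 to 1, and that Equation 2 is the standard equal-variance relation between *d*′ and proportion correct in a four-alternative forced-choice task. All function names are hypothetical.

```python
import math

LAMBDA, K, GAMMA = 1.545903, 1.481270, 0.25


def weibull_pc(d_prime):
    """Weibull approximation to proportion correct, rising from GAMMA to 1."""
    return GAMMA + (1.0 - GAMMA) * (1.0 - math.exp(-((d_prime / LAMBDA) ** K)))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mafc_pc(d_prime, m=4, lo=-8.0, hi=12.0, n=4000):
    """Equal-variance m-AFC proportion correct (assumed form of Equation 2).

    Integrates pdf(x - d') * cdf(x)^(m - 1) with the trapezoidal rule.
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * norm_pdf(x - d_prime) * norm_cdf(x) ** (m - 1)
    return total * h
```

Comparing `weibull_pc(d)` with `mafc_pc(d)` over a grid of *d*′ values gives a direct view of where the approximation is closest and where it starts to diverge.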

**Figure A1**


**Figure A2**


*difference* in contrast between the absolute contrast of the target and the pedestal.

**Figure A3**
