The standard psychophysical model of our early visual system consists of a linear filter stage, followed by a nonlinearity and an internal noise source. If a rectification mechanism is introduced at the output of the linear filter stage, as has been suggested on some occasions, this model actually predicts that human performance in a classical contrast detection task might benefit from the addition of weak levels of noise. Here, this prediction was tested and confirmed in two contrast detection tasks. In Experiment 1, observers had to discriminate a low-contrast Gabor pattern from a blank. In Experiment 2, observers had to discriminate two low-contrast Gabor patterns identical on all dimensions except orientation (−45° vs. +45°). In both experiments, weak-to-modest levels of 2-D white noise were added to the stimuli. Detection thresholds varied nonmonotonically with noise power, i.e., some noise levels improved contrast detection performance. Both simple uncertainty reduction and an energy discrimination strategy can be excluded as possible explanations for this effect. We present a quantitative model consistent with the effects and discuss the implications.

Viewing distance was 120 cm, leading to a pixel size of 0.009° of visual angle.

× 10^{−7} deg^{2} noise power spectral density. Noise power spectral density is defined as the variance in luminance (relative to the space-average luminance) multiplied by the pixel area, expressed in squared degrees of visual angle. It represents the average power at the different frequencies present in the noise. The maximum amount of clipping (i.e., pixels set to the minimum or maximum luminance value because of the limited dynamic range of the monitor and video card) at the highest noise level was below 2%. Estimates of the effective images ensured that the nonlinear monitor operations (i.e., power-saving functions, gamma correction, luminance rounding, and the gamma function) did not distort the spectral properties of the noise stimuli.

First, a *stimulus theory* describes how a transduction mechanism maps physical stimuli to internal states; second, a probabilistic *theory of internal states* describes the probability distribution of the internal states that results from repeated presentation of the same stimulus; and finally, a deterministic *response theory* describes a decision rule that maps internal states to a response.

*σ*_{add}^{2}, image sampling or calculation efficiency, *k*, and template matching (Lu & Dosher, 2008). The parameter *k* expresses the proportion of available information used by the observer and ranges between 0 and 1. Cross-correlating the noisy, sampled input image, *I*_{sampled}, with an optimal signal template, *T*_{signal}, transforms the 2-D input stimuli to 1-D responses, *R*_{k}, as given by Equation 1.
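Equation 1 itself is not reproduced in this excerpt. As a minimal sketch of the assumed template-matching step, the following Python code cross-correlates a sampled image with a signal template and normalizes by the response to a full-contrast signal; the function names, the Gabor parameters, and the 5% contrast value are our own illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def gabor(size=64, freq=4.0, sigma=0.15):
    """Vertical Gabor patch on a unit square (illustrative parameters;
    freq is in cycles per image width)."""
    x = np.linspace(-0.5, 0.5, size)
    X, Y = np.meshgrid(x, x)
    return np.cos(2.0 * np.pi * freq * X) * np.exp(-(X**2 + Y**2) / (2 * sigma**2))

def filter_response(image, template, k, rng):
    """Cross-correlate a randomly sampled image with the signal template;
    k is the proportion of pixels used (assumed form of Equation 1)."""
    mask = rng.random(image.shape) < k          # random pixel sampling
    return np.sum(image * mask * template)

T = gabor()
norm = filter_response(T, T, 1.0, rng)          # response to a full-contrast signal
C = 0.05                                        # Michelson contrast of the test signal
R = filter_response(C * T, T, 1.0, rng) / norm
# With full sampling and no noise, R equals the signal contrast C.
```

After this normalization, responses to noiseless, unsampled signals coincide with their Michelson contrast, matching the convention described below.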

Filter responses, ∣*R*_{k}∣, were estimated via simulations with the noise and signal contrast levels used in our experiments as input. The scale of these responses depends on the image size used; therefore, the responses were normalized by the filter response to a full-contrast signal, so that filter responses to a noiseless, unsampled signal became identical to the Michelson contrast of that signal. As explained above and illustrated in Figure 2, the aforementioned model components give rise to a linear relationship between image contrast and internal contrast representation. To describe the nonlinear mapping of stimulus contrast to internal contrast representation, *R*(*C*), the second part of the transduction mechanism consisted of the three-parameter Naka–Rushton function (free parameters *α*, *β*, and *p*), which is illustrated in Figure 2v and given by Equation 2.
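Equation 2 is not shown in this excerpt. A common three-parameter form of the Naka–Rushton function, consistent with the roles assigned to *α*, *β*, and *p* in the text, is R(C) = αC^p / (C^p + β^p); the sketch below assumes that form, with parameter values chosen for illustration only.

```python
import numpy as np

def naka_rushton(C, alpha, beta, p):
    """Three-parameter Naka-Rushton function, R(C) = alpha*C^p / (C^p + beta^p):
    alpha sets the response range, beta is the semi-saturation contrast,
    and p is the response exponent (assumed standard form)."""
    Cp = np.power(C, p)
    return alpha * Cp / (Cp + beta ** p)

# At C = beta the response reaches half its asymptotic level alpha.
half = naka_rushton(0.054, alpha=1.0, beta=0.054, p=20.0)
```

Note how a high exponent (here p = 20, in the range of the fitted estimates reported below) makes the accelerating part of the function very steep around the semi-saturation contrast.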

*k*) and the parameters of the Naka–Rushton equation (*α*, *β*, and *p*). Equation 3 expresses this transduction, *t*, as a function of the signal contrast (*C*) and the effective total noise spectral density (*N*_{total}), given a certain sampling value *k*. Because the early, internal noise is additive, *N*_{total} is the sum of the external noise level *σ*_{ext}^{2} and the early noise *σ*_{add}^{2}.

∣*R*_{k}∣, used in the expansive (i.e., the numerator) and the compressive (i.e., the denominator) parts of the Naka–Rushton function were the same. Although some evidence points to the existence of a broadly tuned contrast gain-control pool (e.g., Foley, 1994; Holmes & Meese, 2004), we opted to use only within-channel suppression in this model to avoid increasing the number of free parameters.

*σ*_{late}^{2}) and a signal-dependent source (free parameters *γ* and *ξ*, with *ξ* a proportionality constant and *γ* an exponent).

*k*, *p*, and *β*, and fitted descriptive functions to the simulated variances. These descriptive functions allowed us to formalize the full model behavior.

*p*(*C*, *N*_{total}), as a function of the signal contrast (*C*) and the effective total noise spectral density (*N*_{total}) in a 2-AFC task.

*z* is a dummy variable, and *f*(*C*, *N*_{total}) and *g*(*C*, *N*_{total}) are given by

*σ*_{add}^{2}, *k*, *α*, *β*, *p*, *σ*_{late}^{2}, *γ*, and *ξ*) and external noise level *N* and signal contrast *C*.
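The specific forms of *f* and *g* are not reproduced in this excerpt. As a generic illustration of how percent correct in a 2-AFC task follows from two Gaussian internal-response distributions by integrating over a dummy variable *z*, consider the following Python sketch (the function name and parameterization are ours, a stand-in for the paper's equations):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def p_correct_2afc(mu_s, var_s, mu_n, var_n):
    """Probability correct in 2-AFC: the chance that the Gaussian internal
    response to the signal interval exceeds the response to the noise
    interval, obtained by integrating over a dummy variable z."""
    integrand = lambda z: (norm.pdf(z, mu_s, np.sqrt(var_s))
                           * norm.cdf(z, mu_n, np.sqrt(var_n)))
    p, _ = quad(integrand, -np.inf, np.inf)
    return p

# Identical response distributions in both intervals give chance performance.
chance = p_correct_2afc(0.0, 1.0, 0.0, 1.0)
```

Allowing the two variances to differ is what makes this integral form necessary: with unequal variances the familiar closed-form shortcut for equal-variance 2-AFC no longer applies directly.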

*σ*_{late}^{2} was taken to be 1, resulting in a model with seven free parameters. An additional free parameter *λ* ("lapse rate") was introduced in the fitting of the model to avoid biased parameter estimates (Wichmann, 1999; Wichmann & Hill, 2001a). Priors were introduced for each parameter to constrain estimates to realistic values. To find the surface *p*(*C*, *N*_{total}) that maximizes the likelihood of the data, the log-likelihood of the surface given the parameters (*σ*_{add}^{2}, *k*, *α*, *β*, *p*, *γ*, *ξ*, and *λ*) was maximized using purpose-written software in MATLAB (*fminsearch*, which makes use of the Nelder–Mead simplex search method). The log-likelihood of the surface *p*(*C*, *N*_{total}) given parameter vector *θ*, containing {*σ*_{add}^{2}, *k*, *α*, *β*, *p*, *γ*, *ξ*, *σ*_{late}^{2}, *λ*} with *σ*_{late}^{2} = 1, is given by Equation 7:

*n*_{ji} the number of trials (block size) measured at noise level *N*_{j} and signal contrast *C*_{ji}, and *y*_{ji} the proportion of correct responses in that condition. Because the problem is nonconvex due to *λ*, a multistart procedure with pseudorandomly selected initial values was used to find the probable global minimum for each participant.
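The fitting procedure can be sketched with a simplified stand-in model: here a two-parameter Weibull psychometric function replaces the full transduction model, and the multistart Nelder–Mead loop mirrors the approach described above, in Python rather than MATLAB. All data and parameter values below are synthetic, invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

def p_model(C, theta):
    """Toy psychometric function standing in for the full model:
    a Weibull with threshold a and slope b, 2-AFC guessing rate 0.5."""
    a, b = theta
    return 0.5 + 0.5 * (1.0 - np.exp(-(C / a) ** b))

def neg_log_lik(theta, C, n, y):
    """Binomial negative log-likelihood of the blocked data given theta."""
    a, b = theta
    if a <= 0 or b <= 0:
        return 1e10                      # keep the simplex in a valid region
    p = np.clip(p_model(C, theta), 1e-6, 1 - 1e-6)
    k = np.round(y * n)                  # correct responses per block
    return -np.sum(k * np.log(p) + (n - k) * np.log(1 - p))

# Synthetic data: contrasts, block sizes, proportions correct.
C = np.array([0.02, 0.04, 0.06, 0.08, 0.12])
n = np.full(C.shape, 50.0)
y = np.array([0.56, 0.74, 0.86, 0.96, 1.00])

# Multistart Nelder-Mead, analogous to MATLAB's fminsearch.
best = None
for start in rng.uniform([0.01, 0.5], [0.2, 5.0], size=(10, 2)):
    res = minimize(neg_log_lik, start, args=(C, n, y), method="Nelder-Mead")
    if best is None or res.fun < best.fun:
        best = res
a_hat, b_hat = best.x
```

The multistart loop addresses the nonconvexity point made above: each restart explores a different basin, and the best of the converged solutions is retained as the probable global minimum.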

*χ*^{2}-distributed, with degrees of freedom equal to the number of data blocks minus the number of free parameters, if the model is correct and the observer's behavior were perfectly stationary during the whole experiment (such an observer would thus generate truly binomially distributed data). Often, for a variety of reasons, this is not the case. Responses of nonstationary observers are more variable than binomially distributed data and thus lead to higher deviances (overdispersion). Wichmann (1999) has shown that, due to the typically small number of measurements, the asymptotically derived distributions often fail to approximate the real distribution of *D* for psychophysical data sets. The real distribution of *D* can be estimated easily by means of Monte Carlo simulations. As suggested by Wichmann (1999), we estimated the distribution of *D* for each model fit by means of 10,000 simulated data sets for an observer whose correct responses in our experiment are binomially distributed as specified by the model fit. From these simulations, we derived critical values for each reported fit.
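The Monte Carlo procedure for obtaining critical deviance values can be sketched as follows; the block sizes and model probabilities below are invented for illustration, and the deviance formula is the standard binomial one.

```python
import numpy as np

rng = np.random.default_rng(2)

def deviance(k, n, p):
    """Summed binomial deviance between observed correct counts k and
    model probabilities p (blocks with k = 0 or k = n contribute 0
    from the corresponding term)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(k > 0, k * np.log(k / (n * p)), 0.0)
        t2 = np.where(n - k > 0,
                      (n - k) * np.log((n - k) / (n * (1 - p))), 0.0)
    return 2.0 * np.sum(t1 + t2)

# Model-predicted probabilities and block sizes (illustrative values).
p_fit = np.array([0.55, 0.70, 0.85, 0.95])
n = np.full(p_fit.shape, 50)

# Monte Carlo distribution of D for a perfectly stationary binomial observer.
sims = np.array([deviance(rng.binomial(n, p_fit), n, p_fit)
                 for _ in range(10_000)])
crit = np.quantile(sims, 0.95)           # critical value for this fit
```

A fitted deviance exceeding `crit` would then flag the fit as significantly worse than expected from binomial variability alone.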

*d*_{ji} is defined as the square root of the deviance value calculated for data point *ji* in isolation, signed according to the direction of the arithmetic residual *y*_{ji} − *p*(*C*_{ji}, *N*_{j}). For binomial data, this is expressed by Equation 9.

*D* = Σ*d*_{ji}^{2}, as for RMSE. Systematic trends in deviance residuals indicate a systematic misfit of the model.
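For binomial data, the signed deviance residuals and their relation to *D* can be illustrated as follows; Equation 9 is not reproduced here, so this sketch uses the standard binomial form, with invented counts and model predictions.

```python
import numpy as np

def deviance_residuals(k, n, p):
    """Signed deviance residuals for binomial data: the square root of
    each block's deviance, signed by the raw residual y - p
    (standard binomial form, used as a stand-in for Equation 9)."""
    y = k / n
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(k > 0, k * np.log(k / (n * p)), 0.0)
        t2 = np.where(n - k > 0,
                      (n - k) * np.log((n - k) / (n * (1 - p))), 0.0)
    return np.sign(y - p) * np.sqrt(2.0 * (t1 + t2))

# Invented correct counts, block sizes, and model predictions.
k = np.array([30.0, 38.0, 44.0, 48.0])
n = np.full(4, 50.0)
p = np.array([0.55, 0.70, 0.85, 0.95])
d = deviance_residuals(k, n, p)
D = np.sum(d ** 2)                        # summed deviance, as in the text
```

Plotting `d` against signal contrast or noise level is the diagnostic mentioned in the text: a random scatter around zero is expected, whereas systematic trends reveal model misfit.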

| | R.V. | B.B. | R.G. | I.P. | E.G. | H.H. |
|---|---|---|---|---|---|---|
| *σ*_{add}^{2} | 0.020 | 0.034 | 0.026 | 0.025 | 0.023 | 0.012 |
| *k* | 0.007 | 0.013 | 0.013 | 0.012 | 0.010 | 0.008 |
| *α* | 3.6e39 | 4.2e39 | 8.1e39 | 3.6e39 | 1.2e39 | 4.8e33 |
| *β* | 0.054 | 0.054 | 0.054 | 0.056 | 0.056 | 0.14 |
| *p* | 19.9 | 22.6 | 22.1 | 21.46 | 20.57 | 13.72 |
| *σ*_{late}^{2} | 1 | 1 | 1 | 1 | 1 | 1 |
| *γ* | 0.49 | 0.70 | 0.45 | 0.46 | 1.01 | 0.07 |
| *ξ* | 6.6e19 | 4.5e19 | 7.3e19 | 5.4e19 | 2.1e19 | 1.2e17 |
| *λ* | 0.025 | 0.022 | 0.035 | 0.024 | 0.053 | 0.013 |
| *D* | 0.74 | 2.16* | 1.95* | 1.50 | 2.40* | 0.74 |

*any* other model) than to a systematic mismatch.

*σ*_{add}^{2}) is estimated to be approximately equal to the weakest external noise level that leads to a threshold rise. This is not inconsistent with the notion of "equivalent input noise" used in some linear detection-in-noise models (Lu & Dosher, 2008; Nagaraja, 1964). Sampling (*k*) is estimated to be approximately 1%, which is in line with some other reported estimates (e.g., Legge et al., 1987). Figure 8 illustrates the effect of sampling on a signal image, a noise image, and a signal-plus-noise image. As sampling decreases from 100% to 10% (i.e., one log unit), the response of an optimal filter to a signal image decreases by a log unit, while the response to a noise image decreases by only half a log unit (i.e., by the square root of the sampling factor, approximately a factor of 3.16). The response to a signal-plus-noise image decreases by approximately 0.7 log units (i.e., a factor of about 5) for the noise and contrast levels used in this example. The sampling parameter thus mainly serves to scale the (ratio of) responses to signal, noise, and signal-plus-noise images. Inefficiencies in the visual system need not be conceptualized as sampling. Alternatively, this parameter could be thought of as reflecting the use of a suboptimal filter, for instance a spatial-frequency-tuned channel with an effective bandwidth broader than the narrowband Gabor signal to be detected. As can be seen in Equations 1 and 2,

*α* is simply a rescaling parameter that determines the response range. More interestingly, *β* reflects the semi-saturation contrast of the contrast response function (Heeger, 1992a, 1992b). For all observers but one (H.H.), *β* is estimated to be in the vicinity of the 75%-correct (noiseless) detection threshold. The estimates of the response exponent *p* may seem fairly high compared to fits of the gain-control model to contrast discrimination data (e.g., Foley, 1994; Wichmann, 1999). This is a consequence of the use of external (and early internal) noise: Because early noise linearizes nonlinear systems (see Discussion), the exponent of the accelerating part of the nonlinearity needs to be high enough to fit the experimentally observed noise benefit.
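The sampling effect described above, with the filter response to a signal scaling in proportion to *k* while the response to noise scales with √*k*, can be checked numerically. The following Python sketch uses an arbitrary random unit-norm template and invented contrast values; it is an illustration of the scaling argument, not a reconstruction of Figure 8.

```python
import numpy as np

rng = np.random.default_rng(3)
npix = 64 * 64
T = rng.standard_normal(npix)
T /= np.linalg.norm(T)                        # unit-norm stand-in template

def mean_abs_response(make_image, k, n_trials=2000):
    """Average absolute template response when only a fraction k of the
    pixels is sampled at random on each trial."""
    out = np.empty(n_trials)
    for i in range(n_trials):
        mask = rng.random(npix) < k
        out[i] = abs(np.dot(make_image() * mask, T))
    return out.mean()

signal = lambda: 0.1 * np.sqrt(npix) * T         # deterministic "signal" image
noise = lambda: 0.1 * rng.standard_normal(npix)  # white "noise" image

r_sig = mean_abs_response(signal, 1.0) / mean_abs_response(signal, 0.1)
r_noi = mean_abs_response(noise, 1.0) / mean_abs_response(noise, 0.1)
# r_sig comes out close to 10 (one log unit); r_noi close to sqrt(10) = 3.16.
```

The intuition: the signal projects coherently onto the template, so its response sums linearly over sampled pixels, whereas the noise projection is a random walk whose magnitude grows only with the square root of the number of sampled pixels.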

*γ* reveals that for three of six observers, the exponent of the level-dependent noise source is estimated to be approximately 0.5, which corresponds to a "neural" noise scheme with noise variance proportional to the mean firing rate (e.g., Geisler & Albrecht, 1997). For one observer, *γ* is close to one, which corresponds to the standard deviation being proportional to the mean response. With high *α* estimates, the proportionality constant of the level-dependent noise source, *ξ*, also needs to take high values to have any influence on the slope of the psychometric functions. Lapse rate (*λ*) estimates are around 2%, which is not unusual (Wichmann & Hill, 2001a).