**To investigate the effect of background noise on visual summation, we measured the contrast detection thresholds for targets with or without a white noise mask in luminance contrast. The targets were Gabor patterns placed at 3° eccentricity to either the left or right of the fixation and elongated along an arc of the same radius to ensure equidistance from fixation for every point along the long axis. The task was a spatial two-alternative forced-choice (2AFC) paradigm in which the observer had to indicate whether the target was on the left or the right of the fixation. The threshold was measured at 75% accuracy with a staircase procedure. The detection threshold decreased with target length with slope −1/2 on log-log coordinates for target lengths between 30′ and 300′ half-height full-width (HHFW), defining a range of ideal matched-filter summation extending up to about 200′ (or about 16× the center width of the Gabor targets). The summation curves for different noise contrasts were shifted copies of each other. For the threshold versus mask contrast (TvN) functions, the target threshold was constant for noise levels up to about −22 dB, then increased with noise contrast to a linear asymptote on log-log coordinates. Since the “elbow” of the target threshold versus noise function is an index of the level of the equivalent noise experienced by the visual system during target detection, our results suggest that the signal-to-noise ratio was invariant with target length. We further show that a linear-nonlinear-linear gain-control model can fully account for these results with far fewer parameters than a matched-filter model.**

^{1}combining numbers of local filters in proportion to target size (Tyler & Chen, 2000). Some formulations thus simply use the square root of the number of local channels as the denominator of the *d*′ calculation (see Kingdom, Baldwin, & Schmidtmann, 2015). Such matched-filter behavior may also be called “ideal summation” behavior (Tanner & Jones, 1960; Tyler & Chen, 2000).
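The square-root rule can be illustrated numerically. The following is a minimal sketch, assuming a linear transducer, equal sensitivity in every local channel, and independent noise of equal variance in each channel; the function names and default values are hypothetical and serve only to show why such pooling predicts a −1/2 slope on log-log coordinates.

```python
import math

def ideal_summation_threshold(n_filters, sensitivity=1.0, noise_sd=1.0, d_prime=1.0):
    """Contrast at which d' reaches the criterion when n_filters channels are pooled."""
    # Pooled signal grows as n * sensitivity * contrast, while the pooled noise SD
    # grows as sqrt(n) * noise_sd, so d' = sqrt(n) * sensitivity * contrast / noise_sd.
    return d_prime * noise_sd / (sensitivity * math.sqrt(n_filters))

def loglog_slope(n1, n2):
    """Slope of log threshold against log number of pooled channels."""
    t1 = ideal_summation_threshold(n1)
    t2 = ideal_summation_threshold(n2)
    return (math.log10(t2) - math.log10(t1)) / (math.log10(n2) - math.log10(n1))

# Any pair of pooling extents gives a slope of about -0.5 under these assumptions.
print(loglog_slope(4, 64))
```

Under these assumptions the slope does not depend on the sensitivity or the noise level, which only shift the curve vertically.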

^{2}. The luminance of the monitor at each output setting was measured with a PhotoResearch PR655 radiometer, and the measurements were used to construct a linearized lookup table.

*r* and *θ* were the radius and the angle of a pixel in polar coordinates (as noted in Figure 1A); *L* was the mean luminance of the display; *c*_{t} was the contrast of the target; *f* was the spatial frequency of 2.5 c/°; *σ*_{r} and *σ*_{θ} were the scale parameters (standard deviations) of the Gaussian envelope along the radius and the circumference, respectively; and the center of the Gabor arc, *u*, was at +3° and −3° visual eccentricity for patterns presented on the left or right of the screen, respectively. The parameters *σ*_{r} and *σ*_{θ} controlled the size of the stimuli. The value of *σ*_{r} was fixed at 0.14° of visual angle, while *σ*_{θ} varied from 0.5° to 40° along the circumference. Thus, the targets were Gabor arcs with a range of lengths from 1° to 80° of circumferential angle. This arrangement allowed the stimuli to remain at the same cortical magnification factor regardless of their length (see Robson & Graham, 1981). For better comparison with the results of previous studies, we subsequently specify the length of the targets by the half-height full-width (denoted HHFW in Figure 1A) of the Gaussian envelope, which is 2(−2 ln 0.5)^{0.5} × *u* × *σ*_{θ} = 7.06 × *σ*_{θ}, where *σ*_{θ} is in radians and HHFW is in degrees of visual angle. Figure 1B illustrates what an observer might see in a high-noise trial.

*c*_{m}. That is, the noise mask

*c*_{m} denotes the mask contrast and *B* denotes sampling from a binomial distribution. The noise was updated for each trial.
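The stimulus quantities above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' stimulus code. The HHFW conversion uses the standard Gaussian factor 2·sqrt(2 ln 2), which reproduces the coefficient 7.06 for *u* = 3°. The binary form of the noise mask (each pixel at mean luminance scaled by 1 ± *c*_{m} with equal probability) and all function names are assumptions.

```python
import math
import random

ECCENTRICITY_DEG = 3.0  # radius u of the arc, taken from the text

def hhfw_deg(sigma_theta_rad, u_deg=ECCENTRICITY_DEG):
    """Half-height full-width of the Gaussian envelope along the arc, in degrees."""
    # 2 * sqrt(2 ln 2) converts a Gaussian SD into its full width at half height;
    # u * sigma_theta converts an angular SD in radians into arc length in degrees.
    return 2.0 * math.sqrt(-2.0 * math.log(0.5)) * u_deg * sigma_theta_rad

def binary_noise_mask(width, height, mask_contrast, mean_lum=1.0, rng=None):
    """ASSUMED mask: each pixel is mean_lum * (1 +/- mask_contrast), equiprobable."""
    rng = rng or random.Random()
    return [[mean_lum * (1.0 + mask_contrast * (1 if rng.random() < 0.5 else -1))
             for _ in range(width)]
            for _ in range(height)]

# Target lengths for the smallest and largest sigma_theta quoted in the text.
for sigma_deg in (0.5, 40.0):
    length = hhfw_deg(math.radians(sigma_deg))
    print(f"sigma_theta = {sigma_deg} deg -> HHFW = {length * 60:.1f} arcmin")

# A fresh mask would be drawn on every trial.
mask = binary_noise_mask(8, 8, mask_contrast=0.1, rng=random.Random(1))
```

Passing a seeded `random.Random` instance makes a mask reproducible, while omitting it draws a new mask on each call.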

log_{10}(*c*_{m}) = −∞, −26, −22, −18, −14, −10 dB) and 10 target sizes, making a total of 60 test conditions. The target contrast in each trial was determined by the Ψ threshold-seeking algorithm (Kontsevich & Tyler, 1999), which was set to measure the threshold at the 75% correct response level. There were 40 trials following two practice trials in each run. Each reported datum point was an average of four repeated measures. The sequence of test conditions and repetitions was randomized.

a_{1}, a_{2}, and a_{3} are intercepts for each component. This model, with 18 free parameters for each observer, explains 97% of the variation in the averaged data with root mean squared error (RMSE) 0.97 dB overall. Removing the −1/2 slope component from the model degrades the best fit dramatically (RMSE 1.48, *F*(12, 82) = 9.03, *p* < 0.0001). Removing the −1 slope component but keeping the others reduces the goodness of fit (RMSE 1.19, *F*(12, 82) = 3.38, *p* = 0.0005). Removing the −1/4 component, on the other hand, has a barely significant effect on the fit (RMSE 1.13, *F*(12, 82) = 2.4, *p* = 0.01 = α). Thus, there was strong support for full summation (for small Gabor arc lengths) and weak support for attentional, or probability, summation (for large Gabor arc lengths).
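The model comparisons above have the form of a nested-model *F* test. The sketch below assumes that each RMSE is the square root of the residual sum of squares divided by the same number of data points for both fits, so that the count cancels in the ratio. The function name is hypothetical. Because the reported RMSE values are rounded, the result only approximates the reported *F* statistics.

```python
def nested_f(rmse_full, rmse_reduced, df_num, df_den):
    """F statistic for a reduced model nested within the full model."""
    # With RMSE = sqrt(RSS / N) and a common N, the relative increase in RSS
    # equals the relative increase in squared RMSE.
    rss_increase = (rmse_reduced ** 2 - rmse_full ** 2) / rmse_full ** 2
    return rss_increase * df_den / df_num

# RMSE values (dB) and degrees of freedom as quoted in the text.
for component, rmse in [("-1/2 slope", 1.48), ("-1 slope", 1.19), ("-1/4 slope", 1.13)]:
    print(component, round(nested_f(0.97, rmse, df_num=12, df_den=82), 2))
```

The ordering of the three statistics follows the ordering of the RMSE increases, which is why removing the −1/2 component is the most damaging.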

E = *k*(N_{eq} + N_{ext}), where E, or threshold energy, is the squared threshold; N_{ext}, or the external noise, is the squared masking noise contrast; and *k*, a scale constant, and N_{eq}, the equivalent noise estimate, are the two free parameters for each curve. We estimated the elbow for each TvN function by fitting the equivalent noise model to each TvN function. Figure 5A shows the equivalent noise fits for the same data as in Figure 4A. The *R*^{2} of the fits was 0.97. We thus obtained a reliable estimate of the equivalent noise according to this standard model.
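Because threshold energy is linear in external noise under this model, the two parameters can be recovered from a straight-line fit of E against N_{ext}. The sketch below assumes an unweighted least-squares fit in the energy domain and a 20 log_{10} convention for converting dB to contrast. Both are illustrative choices, and the fitting procedure actually used for Figure 5A may differ (for example, a fit in log units). The data are synthetic and the function names are hypothetical.

```python
def predicted_threshold(noise_contrast, k, n_eq):
    """Threshold contrast predicted by E = k * (N_eq + N_ext)."""
    return (k * (n_eq + noise_contrast ** 2)) ** 0.5

def fit_equivalent_noise(noise_contrasts, thresholds):
    """Unweighted least-squares line through (N_ext, E); returns (k, N_eq)."""
    n_ext = [c ** 2 for c in noise_contrasts]  # external noise = squared mask contrast
    energy = [t ** 2 for t in thresholds]      # threshold energy = squared threshold
    n = len(n_ext)
    mean_x = sum(n_ext) / n
    mean_y = sum(energy) / n
    sxx = sum((x - mean_x) ** 2 for x in n_ext)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(n_ext, energy))
    k = sxy / sxx                              # slope of E against N_ext
    n_eq = (mean_y - k * mean_x) / k           # intercept (k * N_eq) divided by k
    return k, n_eq

# Synthetic, noise-free TvN function at the mask levels used in the experiment.
mask_contrast = [0.0] + [10 ** (db / 20) for db in (-26, -22, -18, -14, -10)]
true_k, true_n_eq = 0.04, 0.006                # hypothetical values
thresholds = [predicted_threshold(c, true_k, true_n_eq) for c in mask_contrast]
k_hat, n_eq_hat = fit_equivalent_noise(mask_contrast, thresholds)
print(k_hat, n_eq_hat)
```

The elbow corresponds to N_{ext} = N_{eq}, where the threshold energy is twice its no-noise value. A vertical shift of the whole function changes *k* but leaves N_{eq} unchanged.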

*t*(1) = −0.18, *p* = 0.44 for YYH and *t*(1) = −1.26, *p* = 0.21 for DTJ). As a result, the TvN functions for different target sizes appeared as vertically shifted copies of each other. We thus conclude that there was no evidence of any dependence of the level of equivalent noise on target size. Nagaraja (1964), as re-analyzed by Pelli (1990), also showed that equivalent noise did not change with the area of a disk.

E = *k*(N_{eq} + N_{ext}), where E, or threshold energy, is the squared threshold; N_{ext}, or the external noise, is the squared masking noise contrast; *k* is a scale constant; and N_{eq} is the equivalent noise estimate. Here, the equivalent noise should contain all the noise experienced by the system during the detection task except that from the external noise mask. At first glance, since the noise in the system increases with Gabor arc length, one would expect the “elbow” position to increase with target size (shown as the black dashed line in Figure 5B). However, this prediction requires an independent sampling of noise from the target and the mask, in which the noise from the target increases with size while that from the mask depends only on its contrast. This is possible only if the dominant noise in the system during target detection is intrinsic to the matched filter *after* the sampling of the stimuli, that is, late noise. In this scenario, the increase of noise with size is solely due to the increase of the matched filter size. However, this late noise prediction is inconsistent with our result that, as shown in Figure 5, the equivalent noise estimate derived from the “elbow” position was the same for all target sizes we tested.

*d*′ being controlled by the template output divided by the square root of the sum of all stimulus-related noise sources. This model offers more flexibility in interpreting data than the conventional equivalent noise model. For instance, the vertical shift of our TvN functions could be explained by different sensitivities of the visual system to targets of different sizes without a change in the internal noise. Such sensitivity changes are also consistent with the decrease of threshold with target size. However, the Lu and Dosher model offers no explanation as to why the threshold reduction should have a slope of −1/2. To account for this, it would be necessary to implement a *deus ex machina* with prior knowledge of the stimulus size that controls the internal noise sources to match the stimulus extent, which is an arbitrary construct in the absence of further assumptions.

*j* has a spatial sensitivity profile *f*_{j}(*x*, *y*). The excitation of this linear filter to the *i*-th image component *C*_{i}*g*_{i}(*x*, *y*), where *g*_{i}(*x*, *y*) defines the contrast-independent spatial variation and *i* can be either the target or the mask in our experiment, is given as

*x* and *y*, Equation 1 can be simplified to

*Se*_{ji} is a constant defining the excitatory sensitivity of the mechanism to the stimulus, and *i* = *t* or *m*, for target and mask, respectively.

*E*_{ji}

*j*-th local filter is the excitation of the *j*-th filter, *E*_{j}, raised by a power *p*, in which *E*_{j} = Σ_{i} *E*_{i,j} is the sum of excitations produced by all image components, and is then divided by a divisive inhibition term *I*_{j} plus an additive constant *z*. That is,

*I*_{j} is the summation of a non-linear combination of the excitations of all relevant filters to filter *j*. This divisive inhibition term *I*_{j} can be represented as

*S*_{j,i} is the weight of the contributions from each image component to the inhibition term.
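The verbal description of the local response (excitation raised to a power *p*, divided by an inhibition term plus a constant *z*) can be sketched for the target-only case as follows. The excitation-over-inhibition structure follows the text. The specific form of the inhibition term, a weighted contrast raised to a power *q*, is an assumption. The values of `se_t` and `q` are those reported as fixed in the fit, whereas `si_t`, `p`, and `z` are hypothetical placeholders, not fitted values.

```python
def local_response(c_target, se_t=100.0, si_t=50.0, p=2.4, q=2.0, z=10.0):
    """Divisive-inhibition response of one local filter to a target of contrast c_target."""
    excitation = se_t * c_target               # linear excitation, sensitivity * contrast
    inhibition = (si_t * c_target) ** q        # ASSUMED form of the inhibition term
    return excitation ** p / (inhibition + z)  # E^p / (I + z), as described in the text

# Doubling contrast at low and at high contrast shows the change in response growth.
low_gain = local_response(0.002) / local_response(0.001)
high_gain = local_response(0.8) / local_response(0.4)
print(low_gain, high_gain)
```

With *p* > *q*, the response accelerates at low contrast, where *z* dominates the denominator, and becomes compressive at high contrast, where the inhibition term dominates.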

*k*-th second-order detector receives inputs from as many as *n*_{k} local filters. The overall response of the *k*-th second-order detector, *T*_{k}, is simply

[*σ*_{a}^{2}], and (2) the external noise provided by the noise patterns [*σ*_{e}^{2}]. The variance of the internal noise is assumed to be constant for all local filters in the model. Thus, the noise experienced by each second-order detector from this source is *n*_{k}*σ*_{a}^{2}, according to its pooling extent, *n*_{k}.

*σ*_{e}^{2} is proportional to the square of the contrast of the noise mask; that is, *σ*_{e}^{2} = *w*_{m}*C*_{m}^{2}, where *w*_{m} is a scalar constant that determines the contribution of the noise mask to the variance of the response. Pooling these two noise sources, the variance of the response distribution in the *k*-th second-order detector is

*T*_{k,t+m}, and that to the mask alone, *T*_{k,m}, in at least one second-order detector is greater than the limitation imposed by the noise. In practice, we assume that the noise mask produces little excitation in the local filters and, in turn, negligible second-order responses. This second-order mechanism should be the one whose receptive field covers the whole target extent and nothing else. If a mechanism does not cover the whole target, its response will be smaller than that of one that does. If a mechanism covers a larger area, it suffers the noise from the extra local filters but receives no extra response from the target. Thus, one implication of our model is that *the matched filter for the ideal summation is actually the second-order mechanism with the maximum signal-to-noise ratio*. We therefore need to consider only the second-order mechanism that has the greatest response to the target, and can drop the subscript *k* for this study and focus on the decision variable given by

*d*′ reaches unity.

*Se* to the target, *Se*_{t}, to 100 and the contribution of the external noise mask, *w*_{m}, to 1. We also found that we could set the inhibition exponent *q* to 2 and the excitatory and inhibitory sensitivities of the noise mask, *Se*_{m} and *Si*_{m}, respectively, to zero without affecting the goodness of fit. The latter implies that the external noise mask had little effect on the mean response of the summation mechanisms but only increased the variability. In total, there were thus only five free parameters in the model for the whole data set for one observer: the inhibitory sensitivity of the local detector to the target (*Si*_{t}, Equation 5), the excitatory exponent and additive constant of the nonlinear response function (*p* and *z*, respectively, in Equation 4), the level of internal noise (*σ*_{a} in Equation 7), and the scaling factor *r* (Equation 8).

*C*_{m} = 0), the model can be simplified as

*d*′ further than that imposed by the noise. Thus, as illustrated in Figure 7A for averaged data, our multiple-stage model (red solid curve), incorporating the contrast gain control that is ubiquitous in models of contrast masking, captures not just the ideal summation tendency (green dashed line) but also the deviation at the extremes. The prediction of our model is thus almost indistinguishable from that of the three-component model (blue curve) that incorporates mechanisms for complete, ideal, and probability summation. Indeed, our model, with only five free parameters for each observer, performs much better than the conventional ideal summation model with six free parameters for each observer. Quantitatively, the differential Bayesian information criterion (ΔBIC; Wagenmakers, 2007) of the ideal summation model, relative to our model, was −14.32 (probability of this model being more likely than ours, *p* = 0.0008) for YYH and −11.82 (*p* = 0.0027) for DTJ; and the three-component model, with 18 free parameters for each observer, had ΔBIC −22.5 (*p* < 0.0001) for YYH and −86.3 (*p* < 0.0001) for DTJ. Thus, taking into account the number of free parameters, the two generic models performed much worse than our model.
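The probabilities quoted with the ΔBIC values are consistent with the usual BIC approximation to the posterior model probability under equal prior odds (Wagenmakers, 2007). The sketch below assumes the sign convention of the text, in which a negative ΔBIC means that the comparison model is less likely than the reference model. The function name is hypothetical.

```python
import math

def prob_more_likely(delta_bic):
    """Approximate probability that the comparison model is the more likely one."""
    # BIC approximation to the Bayes factor: BF ~ exp(delta_bic / 2).
    # With equal prior odds, the posterior probability is BF / (1 + BF).
    return 1.0 / (1.0 + math.exp(-delta_bic / 2.0))

# Delta-BIC values of the ideal summation model relative to the proposed model.
for observer, delta in [("YYH", -14.32), ("DTJ", -11.82)]:
    print(observer, round(prob_more_likely(delta), 4))
```

A ΔBIC of zero gives a probability of 0.5, and each additional deficit of about 4.6 BIC units reduces the odds by roughly a factor of ten.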

*d*′ is based on the second-order matched-filter summation mechanism. This output cannot distinguish whether the variability in the signal is from the internal noise in the local detector or the external noise in the stimulus. That is, the integration of the noise sources occurs early in the local detectors. As a result, the denominator of *d*′ is independent of target size, as shown in Equation 9. Thus, the constant equivalent noise (Figure 7B) found in our TvN functions is evidence for the operation of such second-order summation.

*p* < 0.0001) for YYH and −40.6 (*p* < 0.0001) for DTJ, the equivalent noise model was dramatically outperformed by our model.

_{k} from Equation 6 for noise level to form essentially a complete summation model (blue curve in Figure 3; see Spatial Summation in Results). Thus, it is not surprising that these models are much less likely than our model (ΔBIC = −18.42, *p* = 0.0001 for YYH and ΔBIC = −10.22, *p* = 0.006 for DTJ).

*Journal of Vision*, 11 (14): 14, 1–16, https://doi.org/10.1167/11.14.14. [PubMed] [Article]

*Journal of Physiology*, 141 (2), 337–350.

*Nature*, 184, 1951–1952.

*Spatial Vision*, 12 (3), 267–286.

*Communications in Statistics-Simulation and Computation*, 28 (1), 177–188.

*PLoS One*, 5 (4): e9840.

*Frontiers in Psychology*, 5, 456.

*Journal of the Optical Society of America A, Optics, Image Science, and Vision*, 11, 1710–1719.

*Vision Research*, 39, 3855–3872.

*Signal detection theory and psychophysics*. New York, NY: Wiley.

*Vision Research*, 62, 57–65.

*Vision Research*, 24 (12), 1977–1990.

*Vision Research*, 27 (6), 1029–1040.

*Journal of Vision*, 15 (5): 1, 1–16, https://doi.org/10.1167/15.5.1. [PubMed] [Article]

*Vision Research*, 39, 2729–2737.

*Journal of the Optical Society of America*, 70 (12): 1458–1471.

*Journal of the Optical Society of America A*, 4, 391–404.

*Vision Research*, 38, 1183–1198.

*Psychological Review*, 115, 44–82.

*Proceedings. Biological Sciences/The Royal Society*, 274 (1627): 2891–2900.

*Journal of Vision*, 12 (11): 9, 1–28, https://doi.org/10.1167/12.11.9. [PubMed] [Article]

*Journal of the Optical Society of America*, 54, 950–955.

*Journal of the Optical Society of America A*, 2 (9), 1508–1532.

*Vision: Coding and efficiency* (pp. 3–24). Cambridge, UK: Cambridge University Press.

*Journal of the Optical Society of America A*, 16 (3), 647–653.

*IRE Transactions in Information Theory PGIT*, 4, 171–212.

*Vision Research*, 39 (5): 887–895.

*Numerical recipes in C*. Cambridge, UK: Cambridge University Press.

*Vision Research*, 21 (3), 409–418.

*Journal of the Acoustical Society of America*, 30 (10), 922–928.

*Visual search techniques* (pp. 59–68). Washington, DC: National Academy of Sciences, National Research Council, Armed Forces NDC Committee on Vision.

*Vision Research*, 40, 3121–3144.

*Journal of Vision*, 6 (10): 11, 1117–1125, https://doi.org/10.1167/6.10.11. [PubMed] [Article]

*Psychonomic Bulletin & Review*, 14 (5), 779–804.

*Nature*, 302 (5907), 419–422.

*Extrapolation, interpolation and smoothing of stationary time series, with engineering applications*. Cambridge, MA: MIT Press.