Classical studies of area summation measure contrast detection thresholds as a function of grating diameter. Unfortunately, (i) this approach is compromised by retinal inhomogeneity and (ii) it potentially confounds summation of signal with summation of internal noise. The *Swiss cheese* stimulus of T. S. Meese and R. J. Summers (2007) and the closely related *Battenberg* stimulus of T. S. Meese (2010) were designed to avoid these problems by keeping target diameter constant and modulating interdigitated checks of first-order carrier contrast within the stimulus region. This approach has revealed a contrast integration process with greater potency than the classical model of spatial probability summation. Here, we used Swiss cheese stimuli to investigate the spatial limits of contrast integration over a range of carrier frequencies (1–16 c/deg) and raised plaid modulator frequencies (0.25–32 cycles/check). Subthreshold summation for interdigitated carrier pairs remained strong (∼4 to 6 dB) up to 4 to 8 cycles/check. Our computational analysis of these results implied linear signal combination (following square-law transduction) over either (i) 12 carrier cycles or more or (ii) 1.27 deg or more. Our model has three stages of summation: short-range summation within linear receptive fields, medium-range integration to compute contrast energy for multiple patches of the image, and long-range pooling of the contrast integrators by probability summation. Our analysis legitimizes the inclusion of widespread integration of signal (and noise) within hierarchical image processing models. It also confirms the individual differences in the spatial extent of integration that emerge from our approach.

² and a refresh rate of 120 Hz.

*C*_{dB} = 20 log_{10}(*C*_{%}), where *C*_{%} is Michelson contrast in percent, defined as *C*_{%} = 100(*L*_{max} − *L*_{min})/(*L*_{max} + *L*_{min}), where *L* is luminance. Goodness of fit for models was assessed using the root-mean-square error (RMSe), defined as RMSe = √[Σ_{i}(model_{i} − data_{i})²/*n*], where model_{i} and data_{i} are the model predictions and empirical data points (in dB) for the *i*th condition, and *n* is the number of data points.
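These definitions are straightforward to compute. A short NumPy sketch (the function names are illustrative, not taken from the study's MATLAB code):

```python
import numpy as np

def michelson_contrast_percent(l_max, l_min):
    """C% = 100 (Lmax - Lmin) / (Lmax + Lmin), where L is luminance."""
    return 100.0 * (l_max - l_min) / (l_max + l_min)

def contrast_db(c_percent):
    """C_dB = 20 log10(C%); e.g., 100% contrast is 40 dB, 10% is 20 dB."""
    return 20.0 * np.log10(c_percent)

def rmse_db(model, data):
    """Root-mean-square error between model predictions and data (both in dB)."""
    model, data = np.asarray(model, float), np.asarray(data, float)
    return np.sqrt(np.mean((model - data) ** 2))
```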

In the *basic method*, we measured thresholds for the “black” and “white” cheeses (the two components) and for the full stimulus (the compound). Summation was estimated by subtracting the threshold for the full stimulus (in dB) from the lower (better) threshold of the two cheese stimuli (usually the “white” cheese threshold). This is equivalent to calculating the ratio of these two threshold contrasts when expressed in percent. In the

*normalization method*, the full stimulus was the sum of “black” and “white” cheeses as before, but to equate sensitivity to the two cheeses, the contrast of the cheese to which the observer was least sensitive was raised (based on the initial estimates of thresholds; e.g., see Baker, Meese, Mansouri, & Hess, 2007). Summation was calculated in the same way as above.
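Both methods reduce to a simple subtraction of dB thresholds. A minimal sketch (function name hypothetical):

```python
import numpy as np

def summation_db(component_thresholds_db, full_threshold_db):
    """Summation estimate (dB): best (lowest) component threshold minus the
    threshold for the full compound stimulus. Subtracting dB values is
    equivalent to taking the ratio of the thresholds in percent contrast."""
    return min(component_thresholds_db) - full_threshold_db

# e.g., cheese thresholds of 22 and 20 dB with a full-stimulus threshold of
# 14 dB give 6 dB of summation (a factor of 2 in contrast).
```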

*within* the model filter elements. However, even for the large 8-lobe filter element—which is probably unreasonably large (Foley et al., 2007)—the breadth of summation is nowhere near enough to account for the experimental results: Spatial pooling of some sort must be involved.

*μ*) of 4 and 8, each of which is arguably a good approximation to spatial probability summation (see 1). For the standard size filter element (solid curves), each of these formulations typically underestimated summation for all three observers—sometimes quite badly. For the larger filter element (dashed curves), the predictions for

*μ* = 4 were quite good for SAW and DHB but still underestimated summation for TSM (particularly in Figure 3f). Furthermore, as we have shown elsewhere (Meese & Summers, 2009), this formulation is also inconsistent with the steep slope of the psychometric function (not shown here). Thus, even when we constructed the model to favor probability summation as best we could (a fairly large filter element with a linear transducer and a fairly low Minkowski summation exponent of 4; see 1), it could not account for all of our results. Therefore, we abandoned the probability summation model and turned our attention (see later) to a model involving linear summation (contrast integration) over area, following spatial filtering and square-law contrast transduction.

- Retinal inhomogeneity (uneven sensitivity) across a two-dimensional array of image pixels;
- Spatial filtering (using log-Gabor filters) to produce a two-dimensional array of “response” pixels;
- Pixel-wise nonlinear transduction (squaring);
- Zero-mean Gaussian noise added to each pixel (conceptual);
- Integration across pixels (details depend on the model variant);
- Minkowski pooling (e.g., probability summation) across integrators (details depend on the model variant);
- Construction of a decision variable.
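The stages above can be sketched end to end. The following is a schematic NumPy rendition, not the study's implementation: the filter, aperture masks, and pooling exponent are stand-ins for the actual log-Gabor parameters, and noise is omitted because it enters the model only through the constant *k*.

```python
import numpy as np

rng = np.random.default_rng(0)

def model_response(image, attenuation, filt, aperture_masks, mu=4.0):
    """Schematic pipeline: inhomogeneity -> filtering -> squaring ->
    integration within apertures -> Minkowski pooling across apertures."""
    # 1. Retinal inhomogeneity: pixel-wise sensitivity attenuation.
    img = image * attenuation
    # 2. Spatial filtering (stand-in for log-Gabor filtering) via the FFT.
    resp = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(filt, img.shape)))
    # 3. Pixel-wise square-law transduction.
    energy = resp ** 2
    # 4. Integration of contrast energy within each aperture (boolean mask).
    integrals = np.array([np.sum(energy[m]) for m in aperture_masks])
    # 5. Minkowski pooling across the integrators (mu ~ 4 approximates
    #    probability summation).
    return np.sum(integrals ** mu) ** (1.0 / mu)
```

Note that because transduction is a square law and all later stages preserve that power, the output of this sketch grows as the square of stimulus contrast, which is what makes the analytic threshold solution below possible.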

*k*), related to the standard deviation of the pooled noise sources across the image. Conceptually, the stimulus was deemed to be detected with a probability of 75% when the output equaled

*k*. In practice, rearranging the model equations and solving for contrast analytically allowed us to derive the contrast detection thresholds.
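To illustrate the analytic solution: with square-law transduction and linear integration of contrast energy, the deterministic model output grows as the square of stimulus contrast, so setting the output equal to *k* yields the threshold in closed form. The sensitivity term *s* below is a hypothetical stand-in for the combined filter and integration gains, not a parameter from the paper.

```python
import numpy as np

def threshold_contrast(k, s):
    """Solve s * c**2 = k for contrast c: the model output to contrast c is
    r(c) = s * c**2, and the stimulus is 'detected' when r(c) = k."""
    return np.sqrt(k / s)

# Quadrupling k (or quartering s) doubles the threshold, i.e., +6 dB.
```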

*k* was the only free parameter in the model. Owing to the contrast sensitivity function (Blakemore & Campbell, 1969), it varied with carrier frequency, allowing each pair of model curves (Figures 6a–6c) to be slid freely up and down the ordinate to achieve the best possible fits. Thus, there were 5, 4, and 1 degrees of freedom associated with this parameter for DHB, SAW, and TSM, respectively. However, this sensitivity parameter was of no interest in the study here. For example, it was irrelevant for (i) capturing the splaying shape of the threshold functions (e.g., Figures 5a–5c) and (ii) modeling the summation functions (e.g., Figures 5d–5i), which are

*relative* measures of sensitivity.

*μ* = 4) across apertures, as described in 1, consistent with contemporary interpretations of probability summation (Tyler & Chen, 2000). MATLAB code for this model can be found in the Supplementary materials.

*k* and the diameter of the integration aperture), plus the fixed parameters that set the filter bandwidths and the Minkowski pooling (

*μ*), which also influenced summation behavior (e.g., see Figure 3). Since

*k* was set separately for each carrier frequency, the model had 6, 5, and 2 free parameters for DHB, SAW, and TSM, respectively. However, since the

*k* parameters were irrelevant for summation behavior (see above), only one of the free parameters (the diameter of the integration aperture) influenced this independently for each observer.

*underestimated* the level of summation. Therefore, even if conventional second-order mechanisms were involved, this does not undermine our point that contrast integration is spatially more extensive than is often supposed. A similar defense can be made regarding any attempts to describe detection of our check stimuli in terms of contrast variance detection (Morgan, Chubb, & Solomon, 2008).

Minkowski summation was computed as (Σ_{i} *A*_{i}^{*μ*})^{1/*μ*}, where *n* is the number of units (filter responses, apertures, or any other appropriate scalar quantities) to be summed, and *A*_{i} denotes the response for the *i*th unit. In the typical formulation for probability summation with linear signal transduction, *μ* ≈ 4. If square-law signal transduction is assumed, then *μ* ≈ 8 (for our purposes here).
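The Minkowski formulation can be written in a few lines. A sketch, with a helper giving the textbook summation prediction for *n* equal units when responses are linear in contrast (function names are ours):

```python
import numpy as np

def minkowski(responses, mu):
    """Minkowski sum: (sum_i A_i**mu)**(1/mu). mu = 1 gives linear summation;
    large mu approaches a max (winner-take-all) rule."""
    a = np.asarray(responses, float)
    return np.sum(a ** mu) ** (1.0 / mu)

def summation_prediction_db(n, mu):
    """Threshold improvement (dB) for n equal units with responses linear in
    contrast: the pooled response grows by n**(1/mu), so the improvement is
    20 * log10(n**(1/mu)) dB. For n = 2, mu = 4 this is ~1.5 dB."""
    return 20.0 * np.log10(n ** (1.0 / mu))
```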

*only* the carrier and not the sidebands, then it would produce 6 dB (a factor of 2) of summation in our experiments for the trivial reason that the relevant signal amplitude is twice as high in the full stimulus as it is for the cheese components. Toward the far left of Figure 2 (in the main body of the report), the spatial frequencies of the sidebands are half an octave either side of the carrier and differ in orientation from the carrier by at least 18°. This causes them to fall outside the passband of the narrowly tuned 8-lobe filter, and summation is ∼6 dB for the trivial reason described above (gray dashed curve).
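As a quick arithmetic check of that trivial summation level: a doubling of effective signal amplitude corresponds to 20 log10(2) ≈ 6.02 dB, independent of any spatial pooling.

```python
import numpy as np

# A filter that sees only the carrier receives twice the amplitude from the
# full stimulus as from either cheese component, so the predicted summation
# is 20 * log10(2) dB regardless of spatial pooling.
trivial_summation_db = 20.0 * np.log10(2.0)
```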