**Abstract**
Contrast sensitivity improves with the area of a sine-wave grating, but why? Here we assess this phenomenon against contemporary models involving spatial summation, probability summation, uncertainty, and stochastic noise. Using a two-interval forced-choice procedure we measured contrast sensitivity for circular patches of sine-wave gratings with various diameters that were blocked or interleaved across trials to produce low and high extrinsic uncertainty, respectively. Summation curves were steep initially, becoming shallower thereafter. For the smaller stimuli, sensitivity was slightly worse for the interleaved design than for the blocked design. Neither area nor blocking affected the slope of the psychometric function. We derived model predictions for noisy mechanisms and extrinsic uncertainty that was either low or high. The contrast transducer was either linear (*c*^{1.0}) or nonlinear (*c*^{2.0}), and pooling was either linear or a MAX operation. There was either no intrinsic uncertainty, or it was fixed or proportional to stimulus size. Of these 10 canonical models, only the nonlinear transducer with linear pooling (the noisy energy model) described the main forms of the data for both experimental designs. We also show how a cross-correlator can be modified to fit our results and provide a contemporary presentation of the relation between summation and the slope of the psychometric function.

$\beta$, where $\beta$ is the Weibull slope parameter of the psychometric function (Quick, 1974; Watson, 1979; Robson & Graham, 1981). In this model, $\beta$ depends on the distribution of internal additive noise placed before the threshold (Sachs, Nachmias, & Robson, 1971; Quick, 1974; Tyler & Chen, 2000; Mortensen, 2002), which is sometimes assumed to be Weibull (Quick, 1974; Graham, 1989). Predictions for probability summation can be derived by setting $\gamma = \beta$ in a generalization of Minkowski summation over $m$ detecting mechanisms (Equation 1), where, for the conventional implementation, $\gamma' = \gamma$. This gives the desired summation slope of $-1/\beta$ (Quick, 1974). Empirical estimates of the slope of the psychometric function ($\beta$) at detection threshold are $\beta \approx \gamma' \approx 4$ in area summation studies, consistent with the (high-threshold) probability summation model (Robson & Graham, 1981; Meese & Williams, 2000) and the fourth-root empirical description (Meese et al., 2005). Some studies have also found $\beta \approx \gamma'$ for summation of superimposed components that differ in orientation and/or spatial frequency (Sachs et al., 1971; Meese & Williams, 2000), though other studies have found marked differences between $\beta$ and $\gamma'$ for related stimulus arrangements (Meinhardt, 2000; Manahilov & Simpson, 2001). The reason for these differences is not clear.

*detecting* functions” and then combines the probabilities from multiple “detectors” using conventional statistical procedures (e.g., Graham, Robson, & Nachmias, 1978). By implication, or otherwise, this approach assumes that visual *detectors* can enter a *state* that indicates they have correctly detected the stimulus, and that is high-threshold theory, of course. Unfortunately though, high-threshold theory has been roundly rejected. For example, contrast detection thresholds depend on guess-rate even after correction for guessing, which is inconsistent with the theory's predictions (Green & Swets, 1966; Nachmias, 1981). Nevertheless, the demise of the theoretical underpinning for Minkowski summation as an implementation of probability summation (where $\gamma = \gamma' = \beta$) has not deterred investigators from using it as a method of combining mechanism outputs, and several defenses of this position have been made (Wilson, 1980; Nachmias, 1981; Meinhardt, 2000; Tyler & Chen, 2000; Mortensen, 2002). Indeed, models of early spatial vision tend to remain rooted in the idea that an array of independent filter-elements is followed by a nonlinear pooling strategy and a decision variable (Wilson & Bergen, 1979; Rohaly, Ahumada, & Watson, 1997; Tyler & Chen, 2000; Párraga, Troscianko, & Tolhurst, 2005). As already mentioned, this is usually interpreted as probability summation and implemented using Minkowski summation with exponents $\gamma' = \gamma \approx 3$ or 4. Nonetheless, there is no direct evidence to support the probability summation interpretation, merely the consistency between psychophysical summation data and model predictions (see also the discussion in Robson & Graham, 1981, and Mortensen, 1988). Therefore, we refer to the association between probability summation and area summation of contrast as the first dogma of spatial vision (Meese & Baker, 2011).$^{1}$

$C_i^{p}$) at each location ($i$), the addition of independent (Gaussian) additive noise ($G_i$) at each location, linear spatial summation across filter-elements, and finally a decision variable. Note that some authors (e.g., Graham, 1989) describe such arrangements as involving *nonlinear summation* because each signal line is subject to a nonlinear transformation of signal contrast before summing. Typically, however, we prefer the term *linear summation* with respect to this situation, by reference to the linearity of the pooling process. This distinguishes it from the nonlinear MAX operator, which is often treated as the “minimal combination rule” (Tyler & Chen, 2000) and is involved in contemporary treatments of probability summation (Pelli, 1985) (see the section on the MAX operator). In previous work we have referred to the preceding model arrangement (Meese & Summers, 2007) as the transducer and noise combination model or more simply, the combination model (Meese, 2010). Here, we introduce the term *noisy energy model* to describe the same arrangement (though not necessarily including retinal inhomogeneity and filtering).

$p = 2$) followed by summation and late additive noise predicts that detection thresholds decline with a (log-log) slope of −1/2 when plotted against stimulus area. This is essentially the contrast energy model (Rashbass, 1970; Manahilov & Simpson, 1999) with late noise. Similarly, when the noise is placed early (before summation) and the transducer is linear ($p = 1$), we have the ideal summation model (Campbell & Green, 1965; Tyler & Chen, 2000), where area summation of signal and noise also causes thresholds to decline with a (log-log) slope of −1/2. This is because the total signal strength is proportional to area and the total noise is proportional to the square-root of area (the noise variances add and the standard deviation is equal to the square-root of their sum). Thus, the signal to noise ratio increases in proportion to $\text{area}/\sqrt{\text{area}}$ and the reciprocal of this gives the relation for contrast sensitivity: a power law with an exponent of −1/2. In both cases then, the effect of area on contrast sensitivity at a fixed criterion level of performance can be described using Minkowski summation with $\gamma' = 2$ (quadratic summation). But when these two effects are cascaded, as in the noisy energy model (where noise comes after contrast transduction but before summation), the model predicts a fourth-root summation rule, effectively $\gamma' = 4$ (Wilson, 1980; Meese & Summers, 2007, 2009; Meese, 2010).
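The two early-noise cases above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes $s$ equally stimulated mechanisms, each contributing a signal $c^{p}$ plus independent unit-variance noise before linear pooling, and a criterion of $d' = 1$. The function names are hypothetical and are not taken from the study.

```python
import math

def threshold_contrast(s, p, d_prime=1.0):
    """Contrast at which linear pooling over s mechanisms reaches d_prime.

    Assumption: each mechanism contributes signal c**p plus independent
    unit-variance noise, so the pooled signal is s * c**p and the pooled
    noise standard deviation is sqrt(s).
    """
    return (d_prime * math.sqrt(s) / s) ** (1.0 / p)

def loglog_slope(p, s_small=1, s_large=1024):
    """Slope of log threshold against log area between two stimulus sizes."""
    t_small = threshold_contrast(s_small, p)
    t_large = threshold_contrast(s_large, p)
    return (math.log(t_large) - math.log(t_small)) / (
        math.log(s_large) - math.log(s_small)
    )

linear_slope = loglog_slope(p=1)  # linear transducer, early noise
energy_slope = loglog_slope(p=2)  # square-law transducer, then noise, then sum
```

Under these assumptions the linear transducer gives a slope of −1/2 and the square-law transducer gives −1/4, matching the quadratic and fourth-root rules described in the text.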

$\gamma' = p\gamma$ in Equation 1 (see also Manahilov & Simpson, 1999). With this model arrangement, $p$ controls the slope of the psychometric function and $p\gamma$ controls the level of summation. They found the best model predictions with $\gamma = 1$ and $2 \le p \le 3$. When $p = 2$, this is equivalent to the noisy energy model. Meese and Summers (2009) showed that the conventional use of Minkowski summation in their model ($\gamma \approx 3$ or 4 and $p = 1$) completely failed to predict the slope of the psychometric function. Thus, while Minkowski summation might offer a pragmatic solution to the problem of combining the outputs of multiple mechanisms, if the model is to make successful predictions for a range of performance levels (i.e., $\mathit{resp}_{\mathrm{overall}}$ is not a constant in Equation 1) (Bird, Henning, & Wichmann, 2002; Meese, Georgeson, & Baker, 2006; García-Pérez & Alcalá-Quintana, 2007) then the generalized version of Minkowski summation (Equation 1, where $\gamma' = p\gamma$) should be used, at the very least.

$\gamma' = \gamma \approx 4$; Tyler & Chen, 2000), consistent with area summation data.

*should* reveal the MAX operation in visual psychophysics.

$U$). Uncertainty is an inherent part of theoretical analyses involving summation and the MAX operator and can affect both the predicted levels of summation and the slope of the psychometric function (Pelli, 1985; Tyler & Chen, 2000; Neri, 2010).

$p$) when pooling is linear (Meese & Summers, 2007) or the level of uncertainty ($U$) when pooling is a MAX operation (Tyler & Chen, 2000). Thus, most previous studies have not been able to distinguish between these two operations because a single parameter can be adjusted to produce similar contrast summation behavior by each of them. We overcame this problem here by including two further factors to help constrain the models. First, we performed the experiment using both interleaved and blocked designs for the various conditions of stimulus area. These designs have different implications for the level of extrinsic uncertainty and consequently the behavior of the models (as we describe later). Second, along with measures of contrast sensitivity, we also analyzed the slope of the psychometric function, which depends on both uncertainty and the form of the contrast transducer (Pelli, 1985; Tyler & Chen, 2000). We then derived signatures for each of our 10 canonical model configurations (2 transducers × 2 pooling rules × 2.5 forms of intrinsic uncertainty) for each of our experimental designs (blocked and interleaved). (Note that the factor of 2.5 derives from the use of three forms of uncertainty for one pooling rule but only two for the other.) A comparison of our analyses and data revealed a single model configuration that produced the correct qualitative relationships between stimulus area and both (a) sensitivity and (b) the slope of the psychometric function for the stimuli used here. The successful model involves a nonlinear contrast transducer followed by additive noise and linear pooling (i.e., the noisy energy model). We then (a) found that successful quantitative predictions could be achieved when retinal attenuation and spatial filtering were included in the model and (b) showed how a conventional cross-correlator can be modified to achieve similar results. Our analyses do not support any of the probability summation (MAX) models.

$^{2}$ and a frame rate of 120 Hz. Look-up tables were used to perform gamma correction to ensure linearity over the full range of stimulus contrasts. Observers sat in a dark room at a viewing distance of 74 cm with their head in a chin and headrest and viewed the stimuli binocularly.

$c = 100\,(L_{\max} - L_{\min})/(L_{\max} + L_{\min})$] or in dB re 1% [$= 20 \log_{10}(c)$], where $L$ is luminance.
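As a small worked illustration of the two contrast units defined above, the following sketch converts luminance extremes to percent contrast and percent contrast to dB re 1%. The function names and the example luminance values are hypothetical.

```python
import math

def michelson_percent(l_max, l_min):
    """Michelson contrast in percent: 100 (Lmax - Lmin) / (Lmax + Lmin)."""
    return 100.0 * (l_max - l_min) / (l_max + l_min)

def contrast_db(c_percent):
    """Contrast in dB re 1%: 20 log10(c), with c in percent."""
    return 20.0 * math.log10(c_percent)

# Illustrative luminances (arbitrary units), not values from the study.
example_contrast = michelson_percent(60.0, 40.0)
example_db = contrast_db(example_contrast)
```

On this scale a contrast of 1% is 0 dB and each tenfold increase in contrast adds 20 dB.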

$\beta$) were estimated by fitting a Weibull function using the psignifit routine (Wichmann & Hill, 2001) where the lapse rate parameter ($\lambda$) was a free parameter but constrained such that $0 \le \lambda \le 0.05$.

$^{2}$ to provide a direct illustration of the effects with which we are concerned. Although both of these factors will be important for a quantitative account of our results (we consider these details when we develop the model in Part IV), there are several qualitative aspects of the models with which they do not interfere (e.g., the ordinal relation of the functions and whether there are design or area effects). These will be the focus of our attention in Part III, following the modeling here. Furthermore, at this stage we are not concerned with the luminance modulation of our stimulus over space and the modulation of responses that this would produce across linear mechanisms (see also Tyler & Chen, 2000). Although this will be picked up in Part IV and Appendix C, these details are largely irrelevant to the empirical study here where the number of sine-wave stimulus cycles was the independent variable. Thus, in the study here, the processing details for each stimulus cycle are of little importance; what matters are the rules that control pooling across multiple cycles.

$m$ = 1,024 mechanisms where the number of mechanisms stimulated ($s$) was equal to the square of the diameter of the stimulus in cycles. Our smallest and largest stimuli excited 1 and 1,024 mechanisms, respectively. A proportional scaling of these figures changed the quantitative details of some of the summation functions (e.g., see Tyler & Chen, 2000, p. 3133; Appendix C), but not their general form or our conclusions. We performed the simulations for $t = 6$ stimulus sizes sequenced in powers of 2 (i.e., 1, 2, 4, 8, 16, and 32).

$r$) was given by the stimulus contrast ($c$), otherwise it was zero. Every mechanism ($i$) was subject to contrast transduction followed by independent additive Gaussian noise. The contrast transducer was either linear or nonlinear, where an accelerating transducer exponent of $p = 2.0$ was used to conform to the energy model. (For convenience, we sometimes refer to the linear transducer as having $p = 1$.) The parameter $G$ was zero mean, unit variance, Gaussian noise.
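The first-stage arrangement described above lends itself to a small Monte Carlo sketch. The code below is a simplified illustration, not the simulation code used in the study: it assumes each mechanism's output is $r^{p}$ plus unit-variance Gaussian noise, pools the monitored mechanisms with either a linear sum or a MAX, and scores a 2IFC trial as correct when the target interval gives the larger decision variable. All function and parameter names are hypothetical.

```python
import random

def mechanism_response(r, p, rng):
    """Transduce the noise-free response r with exponent p, then add
    zero-mean, unit-variance Gaussian noise (assumed form)."""
    return r ** p + rng.gauss(0.0, 1.0)

def interval_response(c, s, monitored, p, pooling, rng):
    """Decision variable for one interval.

    The first s of the monitored mechanisms see contrast c; the rest see
    zero. pooling is 'sum' (linear pooling) or 'max' (MAX operator).
    """
    responses = [
        mechanism_response(c if i < s else 0.0, p, rng)
        for i in range(monitored)
    ]
    return sum(responses) if pooling == "sum" else max(responses)

def proportion_correct(c, s, monitored, p, pooling, trials=400, seed=1):
    """Simulated 2IFC proportion correct at contrast c."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        target = interval_response(c, s, monitored, p, pooling, rng)
        null = interval_response(0.0, s, monitored, p, pooling, rng)
        correct += target > null
    return correct / trials
```

For example, `proportion_correct(2.0, 16, 16, 2, "sum")` simulates a blocked condition with 16 stimulated mechanisms, a square-law transducer, and linear pooling; setting `monitored` larger than `s` adds irrelevant noisy mechanisms.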

$n$ = 1,024. For “proportional” intrinsic uncertainty, the observer monitored additional mechanisms that were a fixed multiple of what would otherwise be monitored. For example, this could happen in a blocked area summation experiment if the observer was certain about position and area in each condition, but always uncertain about spatial frequency and orientation. In this case, as the number of relevant mechanisms increases with area, the number of irrelevant mechanisms also increases in direct proportion. In the main simulations we used a factor of 100. Proportional intrinsic uncertainty was implemented (and relevant) only for MAX pooling of the first-stage mechanisms.

| Model parameter | Explanation |
|---|---|
| $r$ | Mechanism response to stimulus contrast (before noise or nonlinearity) |
| $G$ | Zero mean, unit variance, additive Gaussian noise (stochastic) |
| $p$ | Exponent of nonlinear transducer (typically, $p = 2.0$, or $p = 1.0$ for the linear transducer) |
| $m$ | Number of basic contrast detecting mechanisms that are relevant to the task on at least some of the trials ($m$ = 1,024) |
| $n$ | Number of irrelevant noisy mechanisms |
| $s$ | Number of basic contrast detecting mechanisms excited by the stimulus. This also indicates the relative areas of the stimuli. |
| $t$ | Number of different stimulus sizes ($t = 6$) |
| $i$ | Index into the array of $m + n$ basic contrast detecting mechanisms |
| $\lambda$ | Number of linear pooling mechanisms (typically, $\lambda = 6$) |
| $j$ | Index into the array of $\lambda$ linear pooling mechanisms |
| $\breve{s}_j$ | Number of basic contrast detecting mechanisms summed by the $j$th linear pooling mechanism |
| $\sigma_j$ | Standard deviation of the response of the $j$th linear pooling mechanism |
| $\mathit{resp}$ | Decision variable |
| $\mathit{resp}_j$ and $\mathit{resp}'$ | Intermediate stages in calculating the decision variable |

$U$) is the ratio of the number of mechanisms that are equally excited by the stimulus to the number of mechanisms that are monitored. Thus, when there is no uncertainty, $U = 1$. For completeness, we give explicit expressions for the levels of extrinsic ($U_{\mathrm{ext}}$), intrinsic ($U_{\mathrm{int}}$), and total ($U_{\mathrm{tot}}$) uncertainty below and summarize these in Table 2. These offer some insight into model behaviors since it is well known that for a linear transducer, the effects of uncertainty on the psychometric function are approximately proportional to $\log(U_{\mathrm{tot}})$ (Green & Swets, 1966; Pelli, 1985).$^{3}$ However, these expressions were not used in generating our model predictions, which relied on Monte Carlo simulations and did not require explicit formulations for uncertainty.

| Pooling method and experimental design | Extrinsic uncertainty | Intrinsic uncertainty | Total uncertainty |
|---|---|---|---|
| MAX, interleaved | $m/s$ | $n + 1$ | $(m + n)/s$ |
| MAX, blocked | 1 | $(n + s)/s$ | $(n + s)/s$ |
| Σ, interleaved | $K$ | $n + 1$ | $K + n$ |
| Σ, blocked | 1 | $n + 1$ | $n + 1$ |

$m$ is the number of mechanisms stimulated by the largest stimulus and $n$ is the number of additional irrelevant mechanisms that control intrinsic uncertainty. In the main simulations, $m$ = 1,024 (Figure 1a). Intrinsic uncertainty was set according to $n = 0$ (none), $n$ = 1,024 (fixed), or $n = 99m$ (proportional). For these models, $U_{\mathrm{ext}} = m/s$, $U_{\mathrm{int}} = n + 1$ and $U_{\mathrm{tot}} = (m + n)/s$, where $s$ is the number of mechanisms excited by each stimulus. Note that in this instance, the fixed ($n$ = 1,024) and proportional ($n = 99m$) intrinsic uncertainty models have the same form ($m$ was a constant); they differed only in the overall level of uncertainty (Figures 1c and e).

$s$ is the number of mechanisms excited by the stimulus in the block (Figure 1b). The parameter $n$ controlled intrinsic uncertainty as follows: $n = 0$ (none), $n$ = 1,024 (fixed), or $n = 99s$ (proportional). For these models, $U_{\mathrm{ext}} = 1$, $U_{\mathrm{int}} = (n + s)/s$ and $U_{\mathrm{tot}} = (n + s)/s$. Note that here, the level of total uncertainty ($U_{\mathrm{tot}}$) did not vary with the size of the stimulus ($s$) when intrinsic uncertainty was proportional to $s$. On the other hand, $U_{\mathrm{tot}}$ decreased with $s$ when intrinsic uncertainty was fixed. This means that these two models of intrinsic uncertainty (Figures 1d and f) have distinct forms for the blocked design.
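The uncertainty expressions for the MAX models can be tabulated directly. The helper below is a hypothetical convenience function that simply evaluates the expressions stated in the text for the interleaved and blocked designs; as the text notes, such expressions were not part of the formal analysis, which relied on Monte Carlo simulation.

```python
def max_model_uncertainty(design, s, m=1024, n=0):
    """Return (U_ext, U_int, U_tot) for the MAX models.

    design: 'interleaved' or 'blocked'
    s: number of mechanisms excited by the stimulus
    m: number of mechanisms stimulated by the largest stimulus
    n: number of additional irrelevant mechanisms
    """
    if design == "interleaved":
        return m / s, n + 1, (m + n) / s
    if design == "blocked":
        return 1, (n + s) / s, (n + s) / s
    raise ValueError("design must be 'interleaved' or 'blocked'")

sizes = [1, 4, 16, 64, 256, 1024]  # s for stimulus diameters of 1 to 32 cycles
blocked_fixed = [max_model_uncertainty("blocked", s, n=1024)[2] for s in sizes]
blocked_prop = [max_model_uncertainty("blocked", s, n=99 * s)[2] for s in sizes]
```

With fixed intrinsic uncertainty the total uncertainty in the blocked design falls as $s$ grows, whereas with proportional intrinsic uncertainty it stays constant, which is the distinction drawn in the text.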

$\lambda$ different pool sizes, evenly spaced in a logarithmic sequence. For the main simulations here, $\lambda = 6$, and the pooling mechanisms were matched to the $t = 6$ stimulus sizes. However, this was not critical: we found that setting $\lambda = 11$ by adding intermediate pooling mechanisms between the $t = 6$ stimulus sizes had a negligible effect on model behavior.

$s$) on each trial, and so a MAX operator was used to select the most responsive of the $\lambda$ linear pooling mechanisms. However, the expected level of response will increase with $j$ in the absence of stimulation because of the linear pooling of noise. To combat this bias, the response of each linear pooling mechanism$^{4}$ was normalized by the expected standard deviation ($\sigma_j$), where $\sigma_j = \sqrt{\breve{s}_j}$ (Tyler & Chen, 2000). When there was no intrinsic uncertainty (i.e., $n = 0$), $\mathit{resp} = \mathit{resp}'$ (Figure 1g). Otherwise, $n$ = 1,024 for fixed intrinsic uncertainty (Figure 1i).
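The normalization step can be sketched as follows. This is an illustration under the assumption that each pooling mechanism sums independent unit-variance noise, so that its expected standard deviation is the square root of its pool size; the function name is hypothetical.

```python
import math

def normalized_max(pool_responses, pool_sizes):
    """Divide each pooled response by sqrt(pool size), then take the MAX.

    pool_responses: summed responses of the linear pooling mechanisms
    pool_sizes: number of basic mechanisms summed by each pooling mechanism
    """
    return max(
        resp / math.sqrt(size)
        for resp, size in zip(pool_responses, pool_sizes)
    )

# A large pool with a bigger raw response can still lose after normalization.
example = normalized_max([4.0, 6.0], [1, 16])
```

Without the division, larger pools would tend to win the MAX simply because they accumulate more noise.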

$K$, where $K > 1$. Thus, for the models here we had $U_{\mathrm{ext}} \approx K$, $U_{\mathrm{int}} = n + 1$, and $U_{\mathrm{tot}} \approx K + n$. Recall that the contents of Table 2, including this approximation, were not part of our formal analysis, which used Monte Carlo simulations (see *Monte Carlo simulations* section).

$\alpha$; 81.6% correct) and slope of the psychometric function ($\beta$) were estimated by fitting a Weibull function to the simulated data. The guess rate for the psychometric function was set to 50%, appropriate for 2IFC.
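A Weibull psychometric function with a 50% guess rate can be written in a few lines. The sketch below assumes the standard form, with no lapse term, in which performance rises from the guess rate toward 100% correct; this is an assumption about the exact parameterization, chosen because it places the threshold $\alpha$ at 81.6% correct as stated in the text.

```python
import math

def weibull_2ifc(c, alpha, beta, guess=0.5):
    """Proportion correct at contrast c for threshold alpha and slope beta.

    Assumed form: guess + (1 - guess) * (1 - exp(-(c / alpha) ** beta)).
    """
    return guess + (1.0 - guess) * (1.0 - math.exp(-((c / alpha) ** beta)))

# At c = alpha the function equals 0.5 + 0.5 * (1 - 1/e), whatever beta is.
at_threshold = weibull_2ifc(2.0, 2.0, 3.5)
```

Larger values of `beta` make the function rise more steeply around `alpha` without moving the 81.6% point.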

$s$, equivalent to stimulus area) for the linear and nonlinear transducers (left and right columns). Within each panel, the pair of curves is for the interleaved (red) and blocked (black) designs. The intrinsic uncertainty is, from top to bottom: none, fixed, and proportional (proportional is for the MAX operator only). The dotted lines show summation slopes of −1/4 (fourth-root summation) and −1/2 (quadratic summation) for comparison.

$\beta \approx 1.3$ (equivalent to a $d'$ psychometric slope of unity) across the entire stimulus range. This is to be expected from a linear transducer when there is no stimulus uncertainty ($\beta \approx 1.3$ is the signature of a linear system). The effect of introducing intrinsic uncertainty (Figure 4c) is to increase the slope of the psychometric function (e.g., Pelli, 1985). This has the greatest effect for the small stimulus sizes where the addition of irrelevant noisy mechanisms most seriously compromises the overall signal to noise ratio. This has a large effect in the blocked design (where previously there was no uncertainty), but little effect in the interleaved design, where the log of total uncertainty (the crucial measure) is increased only marginally by the extra mechanisms. The consequence is that the psychometric slopes for the two designs are fairly similar. When the intrinsic uncertainty is proportional to the number of mechanisms otherwise monitored (Figure 4e) the design effect and its interaction with stimulus size remain intact and the psychometric slopes are steeper overall (typically, $\beta > 3$).

$\beta \approx 1.3$ because there is no uncertainty. It is slightly steeper in the interleaved design because of the low level of extrinsic uncertainty over the size of the pooling mechanism (Equation 6).

| Source | df (source) | df (error) | MS: *F* ratio | MS: *p* | RJS: *F* ratio | RJS: *p* | TSM: *F* ratio | TSM: *p* |
|---|---|---|---|---|---|---|---|---|
| Design | 2 | 10 | 6.109 | 0.018* | 59.101 | <0.001* | 13.304 | 0.002* |
| Area | 5 | 25 | 350.541 | <0.001* | 494.070 | <0.001* | 333.792 | <0.001* |
| Interaction | 25 | 50 | 3.222 | 0.077 | 0.561 | 0.838 | 2.465 | 0.017* |

$\beta = 3.67$, $\beta = 3.51$, and $\beta = 3.72$ for the blocked variable, blocked fixed, and interleaved designs, respectively. By eye, there was no systematic variation of psychometric slope with stimulus size or experimental design in the average plot (Figure 6b). However, two-way ANOVA revealed a significant effect of stimulus size for RJS and TSM (see Table 4). Inspection of the data suggested that this was due to upward trends over the first parts of the functions in the interleaved condition and the blocked fixed fixation point condition for RJS and TSM, respectively. However, one-way ANOVA on each of the data sets from each design condition (i.e., nine analyses on three functions for each of three observers) found no significant effects. More importantly, we found no evidence for the decrease in the slope of the psychometric function with stimulus area that was predicted by several of the MAX models (see Figure 4).

| Source | df (source) | df (error) | MS: *F* ratio | MS: *p* | RJS: *F* ratio | RJS: *p* | TSM: *F* ratio | TSM: *p* |
|---|---|---|---|---|---|---|---|---|
| Design | 2 | 10 | 1.522 | 0.265 | 1.303 | 0.314 | 0.959 | 0.416 |
| Area | 5 | 25 | 0.497 | 0.775 | 3.259 | 0.021* | 2.989 | 0.030* |
| Interaction | 25 | 50 | 1.029 | 0.433 | 0.812 | 0.619 | 0.492 | 0.888 |

$\beta$), which should be around 3 or 4. Further simulations (e.g., see Part IV) confirmed that spatial filtering and retinal inhomogeneity had little effect on the slopes of the psychometric functions but caused the summation functions to bow in a similar way to the experimental data. For simplicity, our toy models did not include these processes and so they are not expected to predict the bowing of the summation slopes. Therefore, we overlook mismatches in the third data column of Table 5 for now. For similar reasons, we do not consider whether the models produce an interaction between area and design on the summation slopes. However, all other gross qualitative mismatches between model and data lead to model rejection and are represented by an X in Table 5. Following this procedure all but one of the models were rejected by our data, though several entries are worthy of further consideration.

| Model | Intrinsic uncertainty (IU) | Contrast sensitivity: area effect | Contrast sensitivity: design effect | Contrast sensitivity: sum. slope | Psychometric slope: area effect | Psychometric slope: design effect | Interleaved ∼$\beta$ | Blocked ∼$\beta$ | Reject | Mismatches |
|---|---|---|---|---|---|---|---|---|---|---|
| Human result | | Yes | Yes | Mid | No | No | 3 → 4 | 3 → 4 | — | |
| MAX, LT | No IU | Yes | Yes | Mid | Yes | Yes | 3.8 → 1.3 | 1.3 | Yes | 4X |
| MAX, LT | Fixed IU | Yes | Barely | Mid | Yes | Barely | 4 → 1.5 | 3.8 → 1.5 | Yes | 3X |
| MAX, LT | Proportional IU | Yes | Yes | Low | Small | Small | >4 | ∼3 | Yes | 3X |
| Σ, LT | No IU | Yes | Yes | High | No | Small | 1.7 | 1.3 | Yes | 2X |
| Σ, LT | Fixed IU | Yes | No | High | No | No | 3.8 | 3.8 | Yes | 1X |
| MAX, NT | No IU | Yes | Yes | Low | Yes | Yes | 8 → 2.6 | 2.6 | Yes | 3X |
| MAX, NT | Fixed IU | Yes | No | Low | Yes | Barely | 8 → 3.0 | 8 → 3 | Yes | 4X |
| MAX, NT | Proportional IU | Barely | Barely | Very low | Yes | Yes | 10 → 7 | ∼6 or 7 | Yes | 4X |
| Σ, NT | No IU | Yes | Yes | Mid | No | Small | 3.4 | 2.6 | No | |
| Σ, NT | Fixed IU | Yes | No | Mid | No | No | 7 | 7 | Yes | 3X |

$m$). However, the utter failure of this model to predict the slope of the psychometric function (Figure 4a) cannot be remedied.

$n$). However, this further decreases the design effect on contrast sensitivity (Figure 2c), which was significant for all three observers in the experiment. The design effect can be reintroduced by allowing the number of irrelevant mechanisms to vary with stimulus size (i.e., involving proportional uncertainty, Figure 2e). This also has the benefit of increasing the slope of the psychometric function in the blocked design (Figure 4e) to something close to those in the human data (Figure 6b), but a design effect remains for the psychometric slope (Figure 4e) and is inconsistent with the results. Although rejected, this model is arguably the most successful of the MAX models (on a qualitative basis) and we revisit it in Part IV along with the more conventional fixed uncertainty MAX model.

$\beta < 3.4$ (Figure 5c). This is all broadly consistent with the experimental results, though the empirical psychometric slopes are arguably a little high (average $\beta = 3.6$). This might be due to low levels of intrinsic uncertainty in the experiment that were not a part of this model. Alternatively, or in addition, it might be due to the small overestimation of the slope of the psychometric functions that is an inherent consequence of undersampling in typical psychophysical methods such as those used here (Wichmann & Hill, 2001; Wallis et al., in press).

$c$). This front end of the model was used with several of the model variants described above.

*this* suboptimal way is not clear. However, to provide the benefit of the doubt we reran the model with the noisy mechanisms weighted by a template constructed from the expected stimulus following spatial filtering and the witch's hat attenuation surface from Figure B1 in Appendix B. This weighting was also applied to the multiple “layers” of irrelevant noisy mechanisms. This down-weighted the more peripheral mechanisms, relevant and irrelevant alike. For the interleaved design, the template was that for the largest stimulus. The results are shown in Figure 7i and j. This strategy remedied the “upturn” problem described above but did nothing to improve the overall fit of the model. As we commented in the section *Near(ish) misses*, this weighting strategy (in conjunction with the MAX operator) is in stark contradiction to the human results since it predicts little or no benefit from stimulus area beyond a diameter of four stimulus cycles.

$U$ alone is insufficient for the modeler; the number of mechanisms excited by the stimulus ($s$) must also be estimated. In Appendix C we develop the MAX model through several iterations of detail, and report on the effects of the absolute number of mechanisms. We are led to conclude that this consideration is unlikely to improve the fortune of the MAX model.

$p = 1$) this model also failed badly (black curve in Figure 8a). To try to remedy this problem we set the transducer to a square-law ($p = 2$) and also built the square-law transduction into the template. This caused the model to underestimate the levels of summation in the human data (blue curve, Figure 8b). However, when our spatial filters were returned to the model and their effects were built into the expected template,$^{5}$ the predicted levels of summation became very much like the noisy energy model (green and black curves in Figure 8b). The increase in summation slope arises because the spatial filtering blurs the stimulus around its boundary, thereby reducing its energy. Because this effect is most severe for the smallest stimulus (where the boundary to area ratio is highest) this increases the initial part of the summation slope. A related factor is that the footprint of the filter-element (its receptive field) is larger than the smallest stimulus. This means that the observer benefits from linear summation within that filter-element with the initial increase in stimulus area. This explanation applies to the template-matching model, the noisy energy model, and the MAX models in Figure 7 (see Appendix C and Meese [2010] for further comment).

*upwards*, as negligible signal is added at the cost of recruiting further noise. This suboptimal strategy does not happen with the template-matching model (green curve), which asymptotes with larger stimuli (not shown).

$\beta$ = 3 to 4. We developed 10 canonical models of the summation process involving linear and nonlinear transducers, various forms of uncertainty, and linear sum and MAX pooling operations. Of these, only the noisy energy model made the correct qualitative predictions. When this was extended to include spatial filtering and retinal inhomogeneity, it produced fairly good quantitative predictions of our results. We found no variant of the MAX model (a contemporary implementation of probability summation) that was able to account for all of the results. A cross-correlator (matched template) model also produced good predictions, but only when it included the same key features as the final version of the noisy energy model: retinal inhomogeneity, spatial filtering, square-law contrast transduction, and integration of signal and noise over stimulus area. In short, the template is not a substitute for spatial filtering but comes after it.

*within* filter-elements over short distances. So why has it done so well for the smaller stimuli here? When the filters are removed from our successful models, the initial slope is shallow, approximately a fourth-root slope owing to the cascade of square-law transduction and integration of noise (e.g., the blue curve in Figure 8b). However, when the filters are returned, the slope is steepened owing to the effects of linear summation within the filter-elements (compare the blue and green curves in Figure 8b) and the result happens to be close to a quadratic summation rule (slope = −1/2). Thus, our contention is that just as the combined effects of transduction and noise can masquerade as probability summation over larger distances, when these processes are combined with short-range linear summation *within* filters, they can masquerade as a stimulus energy metric over shorter distances. The filter-based models that we propose predict both of these deceptions.

$U_{\mathrm{int}} \approx 3$).

*signal combination*) extends over a couple of stimulus cycles, at best. The psychophysical work here, and elsewhere (Kersten, 1984; Mayer & Tyler, 1986; Manahilov et al., 2001; Foley et al., 2007; Meese & Hess, 2007; Meese & Summers, 2009; Meese, 2010; To, Baddeley, Troscianko, & Tolhurst, 2010; Meese & Baker, 2011) suggests that signal combination is spatially more extensive than this.

*Journal of Vision*, 11(14):14, 1–16, http://www.journalofvision.org/content/11/14/14, doi:10.1167/11.14.14.

*Journal of Vision*, in press.

*Journal of the Optical Society of America A*, 19(7), 1267–1273.

*Vision Research*, 39, 2597–2602.

*Journal of the Optical Society of America A*, 1(8), 906–910.

*Journal of Neurophysiology*, 98, 1733–1750.

*Nature*, 208, 191–192.

*Journal of Neuroscience*, 27, 9638–9647.

*Vision Research*, 47, 85–107.

*Spatial Vision*, 3, 129–142.

*Spatial Vision*, 20, 5–43.

*Vision Research*, 46, 4294–4303.

*Vision Research*, 48, 2321–2328.

*Vision Research*, 18, 815–825.

*Vision Research*, 38, 231–257.

*Visual pattern analysers*. New York: Oxford University Press.

*Signal detection theory and psychophysics*. New York: Robert E. Krieger Publishing Company.

*Journal of Neuroscience*, 17, 2914–2920.

*Vision Research*, 18, 369–374.

*Vision Research*, 24, 1977–1990.

*Behavioural and Brain Sciences*, 11, 275–339.

*Journal of the Optical Society of America*, 70, 1458–1471.

*Biological Cybernetics*, 81, 61–71.

*Vision Research*, 41, 1447–1560.

*Journal of the Optical Society of America. A, Optics, Image Science, and Vision*, 18, 273–282.

*Journal of the Optical Society of America A*, 3, 1166–1172.

*Journal of Vision*, 10(8):14, 1–21, http://www.journalofvision.org/content/10/8/14, doi:10.1167/10.8.14.

*Journal of Vision*, 11(1):23, 1–23, http://www.journalofvision.org/content/11/1/23, doi:10.1167/11.1.23.

*Journal of Vision*, 11(14):14, 1–16, http://www.journalofvision.org/content/11/14/14, doi:10.1167/11.14.14.

*Journal of Vision*, 6(11):7, 1224–1243, http://www.journalofvision.org/content/6/11/7, doi:10.1167/6.11.7.

*Vision Research*, 47, 1880–1892.

*Perception*, 30, 1411–1422.

*Journal of Vision*, 5(11):2, 928–947, http://www.journalofvision.org/content/5/11/2, doi:10.1167/5.11.2.

*Proceedings of the Royal Society B*, 274, 2891–2900.

*Journal of Vision*, 9(4):7, 1–16, http://www.journalofvision.org/content/9/4/7, doi:10.1167/9.4.7.

*Vision Research*

*,*40, 2101–2113. [CrossRef] [PubMed]

*Biological Cybernetics*

*,*82, 269–282. [CrossRef] [PubMed]

*Biological Cybernetics*

*,*59(2), 137–147. doi:10.1007/BF00317776. [CrossRef] [PubMed]

*Vision Research*

*,*42, 2371–2393. [CrossRef] [PubMed]

*Journal of Vision*, 11(5):2, 1–25, http://www.journalofvision.org/content/11/5/2, doi:10.1167/11.5.2. [PubMed] [Article] [CrossRef] [PubMed]

*Vision Research*

*,*21, 215–223. [CrossRef] [PubMed]

*Vision Research*

*,*42, 41–48. [CrossRef] [PubMed]

*Vision Research*

*,*14, 1039–1041. [CrossRef] [PubMed]

*Frontiers in Computational Neuroscience*

*,*4(151), 1–17. [PubMed]

*Vision Research*

*,*45, 3145–3168. [CrossRef] [PubMed]

*Journal of the Optical Society of America A*

*,*2, 1508–1532. [CrossRef]

*Kybernetik*

*,*16, 65–67. [CrossRef] [PubMed]

*Journal of Physiology*

*,*210, 165–186. [CrossRef] [PubMed]

*Nature Neuroscience*

*,*2, 1019–1025. [CrossRef] [PubMed]

*Current Opinion in Neurobiology*

*,*12, 162–168. [CrossRef] [PubMed]

*Vision Research*

*,*21, 409–418. [CrossRef] [PubMed]

*Vision Research*

*,*37, 3225–3235. [CrossRef] [PubMed]

*Vision Research*

*,*33, 2773–2788. [CrossRef] [PubMed]

*Vision Research*

*,*34, 1301–1314. [CrossRef] [PubMed]

*Journal of the Optical Society of America*

*,*61, 1176–1186. [CrossRef] [PubMed]

*IEEE Transactions on Pattern Analysis and Machine Intelligence*

*,*20, 1–17.

*Vision Research*

*,*45, 2009–2024. [CrossRef] [PubMed]

*Vision Research*

*,*49, 1894–1900. [CrossRef] [PubMed]

*Journal of Vision*, 6(4):8, 387–413, http://www.journalofvision.org/content/6/4/8, doi:10.1167/6.4.8. [PubMed] [Article] [CrossRef] [PubMed]

*Experimental Brain Research*

*,*41, 414–419. [PubMed]

*Vision Research*

*,*23, 907–910. [CrossRef] [PubMed]

*Proceedings of the Royal Society B*

*,*278, 1365–1372. [CrossRef] [PubMed]

*Vision Research*

*,*40, 3121–3144. [CrossRef] [PubMed]

*Vision Research*, in press.

*Vision Research*

*,*19, 515–522. [CrossRef] [PubMed]

*Journal of Vision*, 5(9):6, 717–740, http://www.journalofvision.org/content/5/9/6, doi:10.1167/5.9.6. [PubMed] [Article] [CrossRef]

*Nature*

*,*302, 419–422. [CrossRef] [PubMed]

*Perception and Psychophysics*

*,*63, 1293–1313. [CrossRef] [PubMed]

*Biological Cybernetics*

*,*38, 171–178. [CrossRef] [PubMed]

*Vision Research*, 19, 19–32. [CrossRef] [PubMed]

^{3}From Pelli (1985), Weibull *β* increases very nearly linearly over five orders of magnitude of log *U*. The threshold function is slightly more compressive over a similar range. However, the log of threshold is markedly compressive when plotted against the log of *U*. In other words, once uncertainty is very high, enormous amounts of extra uncertainty are required for it to influence log sensitivity appreciably.

^{4}The ideal strategy is to search for the maximum difference of each of the six normalized pooling mechanisms across the 2IFC interval. However, it seems unlikely that the observer would retain all of the necessary information from the first interval, and so we opted for applying the MAX operator across the six normalized responses within each interval. For the conditions considered here these two strategies produced negligible differences (not shown).

^{5}We built the effects of filtering and nonlinear transduction into the expected template because for an observer with this front-end, that is the ideal strategy. However, further simulations in which the template was that of the stimulus without filtering and transduction showed that this detail was not important for achieving the good model prediction shown here.

*β*) and the Minkowski exponent, *γ*, needed to produce the level of summation predicted by each of our model variants. The analysis here involves assessing the level of summation that is predicted when the number of equally sensitive mechanisms is doubled at various points along each of the model functions, where Minkowski summation is given by:
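For reference, Minkowski summation over *m* mechanisms is conventionally written in the form sketched below. The symbols *resp*_{i} (the response of the *i*th mechanism) and *resp*_{overall} are assumed names and may not match the notation of Equation A1 exactly.

```latex
\[
  \mathit{resp}_{\mathrm{overall}}
    = \left( \sum_{i=1}^{m} \mathit{resp}_{i}^{\,\gamma} \right)^{1/\gamma}
\]
% For m equally sensitive mechanisms, each responding in proportion to
% contrast c, this reduces to resp_overall = m^{1/\gamma} resp_1, so the
% contrast threshold falls as
\[
  c_{\mathrm{thresh}}(m) \propto m^{-1/\gamma}.
\]
```

With *γ* = 1 this is linear summation, *γ* = 2 gives a square-root rule, and *γ* = 4 gives a fourth-root rule.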

*thresh*(*x*) = *Ax*^{3} + *Bx*^{2} + *Cx* + *D* (in decibels), and each of the curves describing the slope of the psychometric function in Figure 4 with a quadratic equation, *psychSlope*(*x*) = *Ex*^{2} + *Fx* + *G*. In each case, *x* = log_{2}(*i*) where *i* is the number of equally excited mechanisms.

*γ* in Equation A1) was estimated from the summation curves for integer increments of *x* as follows:
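One way to carry out this kind of estimate is sketched below, as an illustration rather than the exact expression used in the appendix. It assumes that thresholds are expressed in decibels as 20 log_{10}(contrast) and that Minkowski summation over equally sensitive mechanisms holds locally, so that doubling the number of mechanisms (a unit step in *x*) lowers threshold by 20 log_{10}(2)/*γ* dB. The function names `cubic_thresh_db` and `minkowski_exponent` are hypothetical.

```python
import math

# A factor of 2 in contrast expressed in dB (about 6.02 dB).
DB_PER_DOUBLING = 20.0 * math.log10(2.0)


def cubic_thresh_db(x, A, B, C, D):
    """Threshold (dB) from the cubic fit thresh(x) = Ax^3 + Bx^2 + Cx + D."""
    return A * x**3 + B * x**2 + C * x + D


def minkowski_exponent(thresh_db_lo, thresh_db_hi):
    """Exponent implied by the threshold drop for a unit step in x.

    Assumes threshold is proportional to n**(-1/gamma) for n equally
    sensitive mechanisms, so one doubling costs DB_PER_DOUBLING / gamma dB.
    """
    drop_db = thresh_db_lo - thresh_db_hi
    if drop_db <= 0:
        raise ValueError("threshold did not fall, so no summation to estimate")
    return DB_PER_DOUBLING / drop_db


# Usage: a straight fourth-root summation line (illustrative coefficients only).
coeffs = (0.0, 0.0, -DB_PER_DOUBLING / 4.0, 0.0)
gamma_at_x2 = minkowski_exponent(
    cubic_thresh_db(2, *coeffs), cubic_thresh_db(3, *coeffs)
)
```

Under these assumptions a 6 dB drop per doubling corresponds to linear summation (*γ* = 1) and a 1.5 dB drop to fourth-root summation (*γ* = 4).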

*β*). In the classical analysis of probability summation, the slope of the psychometric function is treated as an estimate of the Minkowski exponent (Quick, 1974; Robson & Graham, 1981). In this view, *γ* = *β*, as indicated by the diagonal dotted lines in Figure A1. Our analysis shows that this equivalence is never actually met. If intended as an approximation to probability summation, then the Minkowski exponent (*γ*) should always be set higher than the slope of the psychometric function (*β*). In some cases the difference is marginal, but in others it is substantial. However, choosing an appropriate value is likely to be difficult. In some cases the functions are almost vertical (e.g., see the black curves in panels a, b, e, and f), meaning that very small changes in the estimate of the slope of the psychometric function lead to large changes in the Minkowski exponent. Furthermore, the functions are different for the blocked and interleaved designs because of the different effects of uncertainty. More troublesome still, the relationship between *γ* and *β* depends on the nature of the intrinsic uncertainty, a parameter over which the experimenter has little or no control.

*p*. For the blocked condition, the slope of the psychometric function is given by ∼1.3*p* in the absence of uncertainty. For the interleaved condition, it is a little higher. In all cases, uncertainty increases the slope of the psychometric function but leaves the Minkowski exponent untouched.

*r*_{i} in the relevant equation from the section on the toy models in the main body of the report. The signal region was defined by the half-height of its envelope (i.e., it has a diameter of (12 × *cycles* + 2) pixels, where *cycles* is the number of carrier cycles in the plateau region of the full stimulus). For the stimuli used here, this was a convenient if arbitrary solution to the problem of defining the summation region. However, Figure B2 shows that the model predictions are not critically dependent on this parameter. For example, other reasonable choices such as summing over only the central plateau (green dashed curve) or the entire stimulus (green dotted curve) produce negligible changes to the predictions.

*How to get template-matching to work* in the main body of the report) one might expect that it would be picked up by other filters in the human brain, not included in the model. However, including the responses of such filters in the template will not fully compensate for the loss because (a) extra noise will also be recruited from each extra filter-element and (b) when the transducer is nonlinear (*p* > 1), the impact of stimulus energy is diminished when it is spread across multiple filter-elements. For simplicity, we chose not to include additional filters here.

*p* to be a free parameter. However, the effects of varying this parameter are shown in Figure B3. Clearly, *p* = 2 (black curve) is close to optimal, though the upturn to the right of the functions, readily seen where *p* > 2, could be remedied if the signal and noise were weighted by the retinal attenuation surface (Figure B1) before summation (see the section *How to get template-matching to work* in the main body of the report).

*Adding a front end to the models* in the main body of the current report), this had the mere effect of slightly reducing the predicted level of summation from that seen in the earlier toy model (i.e., the black curve has a slightly shallower slope in Figure C1a than it does in Figure 2a). In other simulations (not shown) we set the number of mechanisms to four for the smallest stimulus, consistent with a sampling regimen of two samples per cycle as requested by a reviewer. This produced a summation slope intermediate to the other two but had little other influence.

*within* the filter-elements; see also Meese [2010]) for both designs (Figure C1k) but had little or no effect on the slopes of the psychometric functions (Figure C1l).

*i* and length (number of first-stage mechanisms) *s*. We refer to the filtered stimulus as *stim*, the witch's hat as *witch*, and stimulus contrast as *c*. The signal to noise ratio (*SNR*) in the target interval is given by:

*c* term is squared because of the square-law nonlinear transduction. The *witch*_{i} and *stim*_{i} terms on the numerator are squared once by the nonlinear transduction, then again because the expected signal is multiplied by an exact template of itself (i.e., the template is also subject to filtering, transduction, and the witch's hat). These terms are then summed linearly, as for a cross-correlator. Note that for the stimuli used here, the *witch* term is responsible for the concave bowing of the summation function (e.g., Figure 8b), whereas the *stim* term has no effect on the form of the summation function other than that it carries the effects of spatial filtering, discussed in the section *How to get template-matching to work* in the main body of the report.

*witch*_{i} and *stim*_{i} terms are squared once on the denominator because the template is subject to square-law transduction and is used to weight the noise terms at each location *i*. These squared terms are standard deviations and must be squared again to give the local variances, which are summed, and the square root delivers the standard deviation of the overall noise term. Thus, the *SNR* is the ratio of the weighted linear sum of the signals squared (owing to nonlinear transduction) and the weighted linear sum of the noise sources.

*c*^{2} term and the square root influence of the accumulated noise. The fourth-power terms in Equation D1 are irrelevant to its fourth-root summation behavior!
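The fourth-root behavior can be checked numerically. The sketch below is one reading of the verbal description of Equation D1 given above, not a transcription of the equation itself. It assumes a single weight per location equal to the product of the *witch* and *stim* terms, independent noise with the same standard deviation `sigma` at every location, and detection at a fixed `criterion` value of *SNR*. The function names `snr` and `threshold` are hypothetical.

```python
import math


def snr(c, weights, sigma=1.0):
    """Signal-to-noise ratio for a square-law template matcher.

    weights[i] stands for witch_i * stim_i at location i (an assumed
    shorthand). Signal: the transduced stimulus (c*w)^2 multiplied by the
    transduced template w^2, summed linearly. Noise: local standard
    deviations w^2 * sigma, squared to variances, summed, then square-rooted.
    """
    signal = c**2 * sum(w**4 for w in weights)
    noise_sd = math.sqrt(sum((w**2 * sigma) ** 2 for w in weights))
    return signal / noise_sd


def threshold(weights, criterion=1.0, sigma=1.0):
    """Contrast at which SNR reaches the criterion (SNR scales with c^2)."""
    return math.sqrt(criterion / snr(1.0, weights, sigma))


def loglog_slope(weight, n_small=1, n_large=16):
    """Summation slope between two stimulus sizes with uniform weights."""
    t_small = threshold([weight] * n_small)
    t_large = threshold([weight] * n_large)
    return math.log(t_large / t_small) / math.log(n_large / n_small)


slope_unit_weights = loglog_slope(1.0)
slope_scaled_weights = loglog_slope(2.0)
```

With uniform weights the *SNR* grows as *c*^{2} times the square root of the number of locations, so threshold falls with a log-log slope of −1/4. Rescaling the weights shifts the curve vertically but leaves that slope unchanged, in line with the remark that the fourth-power terms do not determine the fourth-root behavior.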

*template*is the stimulus following attenuation by the witch's hat and spatial filtering.

*x*, *y* (1 ≤ *x*, *y* ≤ 512) are indices into a two-dimensional (512 × 512) pixel array, *L*_{0} is mean luminance, *c* is the Michelson contrast of the carrier (and stimulus), *cycpix* is the number of pixels per cycle (=12), and *win*_{xy} is an envelope function, defined as follows:

where *platpix* is the width of the central plateau of the envelope and was equal to *cycpix* times the number of nominal stimulus cycles (i.e., times the *x*-axis value in the data figures) and *skirtpix* is the width of the blur skirt around the plateau and was equal to 2.
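As an illustration of how these parameters fit together, the sketch below builds a circular grating patch from them. Several details are assumptions rather than facts from the text: the skirt is taken to be a raised-cosine ramp, the carrier is taken to be a sine function of *x* centered on the array, and the luminance is taken to be *L*_{0}(1 + *c* · *win*_{xy} · carrier). The function names `envelope` and `stimulus` are hypothetical.

```python
import math


def envelope(r, platpix, skirtpix=2.0):
    """Assumed raised-cosine window at radius r (pixels) from the centre.

    Equals 1 on the plateau (diameter platpix) and falls to 0 across a
    skirt that is skirtpix pixels wide.
    """
    edge = platpix / 2.0
    if r <= edge:
        return 1.0
    if r >= edge + skirtpix:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (r - edge) / skirtpix))


def stimulus(cycles, c, L0=1.0, size=512, cycpix=12):
    """Luminance image (list of rows) for a windowed sine-wave grating."""
    platpix = cycpix * cycles
    centre = (size + 1) / 2.0  # centre of the 1-based indices 1..size
    image = []
    for y in range(1, size + 1):
        row = []
        for x in range(1, size + 1):
            r = math.hypot(x - centre, y - centre)
            carrier = math.sin(2.0 * math.pi * (x - centre) / cycpix)
            row.append(L0 * (1.0 + c * envelope(r, platpix) * carrier))
        image.append(row)
    return image


# Usage: a small one-cycle patch on a 32 x 32 array, to keep the example quick.
patch = stimulus(cycles=1, c=0.5, size=32)
```

Under the raised-cosine assumption the window reaches half height one pixel beyond the plateau edge, giving a half-height diameter of *platpix* + 2 pixels, which agrees with the (12 × *cycles* + 2) pixel signal region described in Appendix B.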