We assessed summation of contrast across eyes and area at detection threshold (*C* _{t}). Stimuli were sine-wave gratings (2.5 c/deg) spatially modulated by cosine- and anticosine-phase raised plaids (0.5 c/deg components oriented at ±45°). When presented dichoptically the signal regions were interdigitated across eyes but produced a smooth continuous grating following their linear binocular sum. The average summation ratio (*C* _{t1}/*C* _{t1+2}) for this stimulus pair was 1.64 (4.3 dB). This was only slightly less than the binocular summation found for the same patch type presented to both eyes, and the area summation found for the two different patch types presented to the same eye. We considered 192 model architectures containing each of the following four elements in all possible orders: (i) linear summation or a MAX operator across eyes, (ii) linear summation or a MAX operator across area, (iii) linear or accelerating contrast transduction, and (iv) additive Gaussian, stochastic noise. Formal equivalences reduced this to 62 different models. The most successful four-element model was: linear summation across eyes followed by nonlinear contrast transduction, linear summation across area, and late noise. Model performance was enhanced when additional nonlinearities were placed before binocular summation and after area summation. The implications for models of probability summation and uncertainty are discussed.

*modulator* at the center of the display (unity and zero). Figure 1c is the physical sum of the stimuli in Figures 1a and 1b and is referred to as the ‘full’ stimulus. It is also the original sine-wave grating without modulation by the raised plaid. Note then, the sum of contrast over area (we refer to this as ‘contrast area’) is the same in Figures 1a and 1b, which is half that in Figure 1c.

^{2}. The shutter goggles allowed different stimuli to be presented to the two eyes by interleaving across frames with a refresh rate of 60 Hz. Target contrast was controlled by look-up tables and gamma correction was performed to ensure contrast linearity. Observers sat at a viewing distance of 96 cm with their head in a chin and headrest fixating a dark square point (4.8 arcmin) placed in the center of the display throughout the experiment. The experiments were controlled by a PC.

*f* is spatial frequency (= 0.5 c/deg), *θ* is orientation (= 45°), and *ϕ* is phase. There were two different phases of modulation: cosine phase (*ϕ* = 0°; Figure 1a) and negative cosine phase (*ϕ* = 180°; Figure 1b). These stimuli were given the nominal titles of ‘white’ and ‘black’ checks, respectively, as a reference to the magnitude of the modulator at the center of the display (unity and zero). Note that there are 7.07 cycles of carrier grating for every two checks (i.e., one cycle of a vertical cross-section through the envelope).

*C* = 100[(*L* _{max} − *L* _{min})/(*L* _{max} + *L* _{min})]) or in dB re 1% (= 20 log _{10}(*C*)).

*N*) of independent estimates of summation ratios contributing to the mean is indicated in the results figures as appropriate.

Block 1, left eye | Block 1, right eye | Block 2, left eye | Block 2, right eye |
---|---|---|---|
‘Eyes’ experiment | | | |
Black checks | – | White checks | – |
– | White checks | – | Black checks |
Black checks | Black checks | White checks | White checks |
White checks | White checks | Black checks | Black checks |
‘Area’ experiment | | | |
Black checks | – | White checks | – |
– | White checks | – | Black checks |
Full stimulus | – | Full stimulus | |
– | Full stimulus | Full stimulus | |
‘Eyes and area’ experiment | | | |
Black checks | – | White checks | – |
– | White checks | – | Black checks |
Black checks | White checks | White checks | Black checks |
Black checks | White checks | White checks | Black checks |

*C* is the Michelson contrast of the carrier grating (in %). This equation has three free parameters. These are the ‘threshold’, *α* (the target contrast at 81.6% correct when *λ* = 0), psychometric slope (*β*), and the lapse-error rate, *λ*. The lapse-error rate is the proportion of trials in which the response is incorrect owing to finger errors and other observer miscues. One problem with the *λ* parameter is that if the observer's behavior is equivalent to *λ* = 0 (i.e., observers do not lapse), then this extra degree of freedom can lead to an oversteep estimation of *β*. To lessen the impact of this possibility, *λ* was capped at 0.01 in the fitting, as is appropriate for well-practiced observers (Wichmann & Hill, 2001b).
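The fitted function itself is not reproduced in the text above. As a minimal sketch, the code below assumes the standard Weibull form for a two-interval forced-choice task with a guess rate of 0.5, which is consistent with *α* being the contrast at 81.6% correct when *λ* = 0. The function and argument names are illustrative and are not taken from the original analysis.

```python
import math

def weibull_2ifc(c, alpha, beta, lam=0.0):
    """Assumed 2IFC Weibull: proportion correct at contrast c (in %).

    alpha: threshold contrast (81.6% correct when lam = 0)
    beta:  slope of the psychometric function
    lam:   lapse-error rate (capped at 0.01 in the fitting described above)
    """
    return 0.5 + (0.5 - lam) * (1.0 - math.exp(-((c / alpha) ** beta)))

# At c = alpha the exponent is -1, so performance is 0.5 + 0.5 * (1 - 1/e).
print(round(weibull_2ifc(2.0, 2.0, 3.4), 3))
```

With a nonzero lapse rate the upper asymptote falls to 1 − *λ*, which is why an unconstrained *λ* can trade off against *β* in the fit.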

*β* = 10 (see Wichmann & Hill, 2001b). Of the 252 psychometric functions that we measured, this occurred twice.

*α* _{single}/*α* _{dual}, where *α* _{single} and *α* _{dual} are the thresholds for single and dual targets, respectively. Alternatively, they are expressed in decibels thus: SR = 20 log _{10}(*α* _{single}/*α* _{dual}).
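The two ways of expressing the summation ratio can be checked against each other with a short sketch. The function names are illustrative; the thresholds passed in are placeholders rather than measured values.

```python
import math

def summation_ratio_db(alpha_single, alpha_dual):
    """Summation ratio in dB: 20 * log10(alpha_single / alpha_dual)."""
    return 20.0 * math.log10(alpha_single / alpha_dual)

def db_to_ratio(sr_db):
    """Convert a summation ratio in dB back to a threshold ratio."""
    return 10.0 ** (sr_db / 20.0)

# A threshold ratio of 1.64 corresponds to roughly 4.3 dB.
print(round(summation_ratio_db(1.64, 1.0), 1))
```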

*β*). The error bars (in this and all other plots) indicate the independent estimates of 95% confidence intervals for each of these parameters (compensated for the use of small samples). For all three observers, the level of summation is quite high (mean of 4.31 dB; a factor of 1.64) and the slope of the psychometric function is quite steep (geometric mean of *β* = 3.36; look ahead to Table 3 for details of individual results). The smaller points in Figure 3a (both black and red) are the predictions made by 62 different models, as we now describe (see the Appendix for implementation details).

*POOL* _{eye}), 2) spatial (area) pooling (*POOL* _{j}), 3) contrast transduction *f*(), and 4) additive, Gaussian noise (+*G*). We refer to these as the ‘four-element models’ and consider models containing a cascade of nonlinear transducers later. In principle, the four elements could be arranged in any order giving 4! = 24 feed-forward architectures. We first consider two types of pooling: linear summation (Σ) and a MAX rule (akin to probability summation when this follows noisy inputs; Pelli, 1985; Tyler & Chen, 2000). There are four possible combinations of the two types of pooling across eyes and area and we consider two forms of transducer (linear and accelerating). This gives 24 × 4 × 2 = 192 different model configurations. From these we removed the redundancies (e.g., if summation is MAX across space and MAX across eyes, adjacent pairings of these operations can follow in either order), and the equivalences due to Birdsall's theorem (see Lasley & Cohn, 1981), leaving 62 formally different models to be tested (though several produced very similar predictions). For 13 of these, the contrast transducer was linear (*f*(*x*) = *x*), whereas for the remaining 49 it was nonlinear (*f*(*x*) = *x* ^{2.4}; Legge & Foley, 1980).
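The count of 192 configurations can be reproduced by enumerating every ordering of the four elements together with every choice of pooling rule and transducer. The sketch below does only that; the labels are illustrative, and it does not attempt the reduction to 62 models, which depends on the formal equivalences described above.

```python
from itertools import permutations, product

ELEMENTS = ("POOL_eye", "POOL_j", "f", "G")   # the four model elements
POOLING = ("SUM", "MAX")                      # linear summation or MAX rule
TRANSDUCERS = ("linear", "accelerating")      # f(x) = x or f(x) = x**2.4

def enumerate_models():
    """Return every ordering of the elements with every rule combination."""
    models = []
    for order in permutations(ELEMENTS):
        for eye_rule, area_rule, transducer in product(POOLING, POOLING, TRANSDUCERS):
            models.append({
                "order": order,
                "eye_rule": eye_rule,
                "area_rule": area_rule,
                "transducer": transducer,
            })
    return models

print(len(enumerate_models()))  # 24 orders x 4 pooling pairs x 2 transducers
```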

*β*′) and the summation ratio (SR′).

*β*′ predicted for the single and dual targets (the differences were always <1 for the models in Figure 3a). To avoid clutter in Figure 3a we plotted the average of these two estimates, though we return to this detail later (e.g., Figure 3b).

*β* = 1.5) and those where the slopes of the psychometric functions are quite steep (the shallowest slope is *β* = 2.3, though *β* ≈ 3 is more typical). Within this second group, summation is too low in most cases. In fact, there are only four models for which summation is greater than the lower confidence limit of the observer with the weakest level of summation (SAW). These are shown by the slightly larger (red) points in Figure 3a, labeled A, B, C, and D (points C and D superimpose). In each of these models, the slopes of the psychometric functions are quite steep (e.g., *β* > 3). Only this group of four models ^{1} produced predictions consistent with the data according to the rejection criteria outlined above. The results in Figure 3a reject the other 58 models. In fact, from Figure 3a, model B is arguably marginal. However, it is the most successful model that involves the MAX operation over area, which is akin to spatial probability summation when it follows noise, as it does here (Tyler & Chen, 2000). And as models of spatial probability summation have a long history (e.g., Robson & Graham, 1981), we retain this model arrangement in our shortlist and further analyses below.

Model code (Figure 3a) | A | B | C | D |
---|---|---|---|---|
Model architecture (order of model elements) | Σ _{eye} | Σ _{eye} | Σ _{eye} | MAX _{j} |
 | x ^{2.4} | x ^{2.4} | Σ _{j} | Σ _{eye} |
 | G | G | x ^{2.4} | x ^{2.4} |
 | Σ _{j} | MAX _{j} | G | G |
SR′ (dB) (model) | 4.8 | 3.2 | 6.0 | 6.0 |
SR (dB) (data) | 4.3 | 4.3 | 4.3 | 4.3 |
β′ single (model) | 3.1 | 3.9 | 3.1 | 3.1 |
β′ dual (model) | 3.2 | 3.0 | 3.2 | 3.2 |
β′ average (data) | 3.4 | 3.4 | 3.4 | 3.4 |

*vs.* area) summation slope of −1 predicted by this linear model (model C). When retinal inhomogeneity is taken into account (Foley et al., 2007; Meese & Summers, 2007; Pointer & Hess, 1989) this flexes the summation curve to a concave shape that resembles the form of the data in those studies. But the summation slope remains too steep without the introduction of an accelerating (nonlinear) contrast transducer (Foley et al., 2007; Meese & Summers, 2007) or stimulus uncertainty (Pelli, 1985). Meese and Summers (2007) proposed that noise propagates from multiple sources before summing over area (cf. Campbell & Green, 1965 in the binocular domain), which also reduces the summation slope. In sum, the linear model (model C) of Table 2 is rejected on the grounds that it is inconsistent with other published results of conventional area summation experiments; it predicts too much summation, even when retinal inhomogeneity is included.

^{2} This model (Table 2) also predicts that the slope of the psychometric function should be less for the dual target than for the single target (see also Meese & Summers, 2007 and Tyler & Chen, 2000). We found no hint of this in the experiment (not shown) but the small change in model slope (from *β*′ = 3.9 to *β*′ = 3.0) is possibly too small to pick up reliably against the variability in psychophysical data (see confidence limits of *β* in Figure 3).

*f*(*x*) = *x* ^{p} for a range of exponents where *p* = 1 to 3. The loci of predictions are shown by the thick solid (model A) and dashed (model B) red curves in Figure 3b (*p* = 1 to 3 from the bottom to the top of the curves). The data are the same as those in Figure 3a. The thin dashed gray curves show the different psychometric slopes (*β*′) predicted by model B for the dual (short dashes) and the single (long dashes) targets.

*p*, as in Figure 3b. In fact, the curves are identical to those in Figure 3b because the first stage in both models is linear summation of contrast across eyes, which makes the two experiments equivalent for these models. Model A is clearly the superior model for all three observers. This extends the findings of Meese and Summers (2007) regarding area summation of binocular contrast to the monocular situation here.

*γ*) has a value of around 3 or 4 (Robson & Graham, 1981). However, more careful derivations for this form of summation have been developed from signal detection theory using the MAX operator (Pelli, 1985; Tyler & Chen, 2000), as in some of the models here. Nevertheless, Minkowski summation is convenient and continues to be widely used, and so for completeness we consider its application to the results here.

*resp* _{obs}) to target contrast *C* is given by

*resp* _{j} is the linear response to stimulus contrast *C* of the *j*th of *n* sensors in a spatial array (see the Appendix for details), *p* is the exponent of the contrast transducer, and *γ* is the Minkowski exponent. Equation 3 was solved for *C* assuming a criterion response of unity for single and dual targets to calculate model summation ratios. The slope of the psychometric function can be derived by fitting a Weibull function to the solutions to Equation 3 for a range of criterion response levels. However, here we used the direct approximation of Pelli (1987), where *β* = 1.247*p*. Note that *γ* has no effect on the slope of the psychometric function, as can easily be recognized by considering the case where *n* = 1 in Equation 3 and *γ* is cancelled. Equation 3 is usually (tacitly) used with *p* = 1. This produces the set of predictions at the lower ends of the three curves in Figure 4b, which are for three example values of *γ* (4, 2, and 1). Clearly, all are inadequate in terms of the slope of the psychometric function. Increasing the value of *p* brings the predictions much closer to the data, the best situation shown being for *γ* = 1 (solid red curve). In fact, that version of Equation 3 is the deterministic equivalent of model A, the very slight differences between the (solid red) curves (in Figures 4a and 4b) owing to the different approximations used in the two methods of analysis.
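Equation 3 is not reproduced in the text above. A form that agrees with the stated definitions (and in which *γ* cancels when *n* = 1) is *resp* _{obs} = [Σ _{j} ((*resp* _{j}) ^{p}) ^{γ}] ^{1/γ}; this is an assumption made for illustration. The sketch below solves that form for threshold with a criterion response of unity. It also assumes a raised-plaid modulator of the form 0.5 + 0.25[cos(*a* + *ϕ*) + cos(*b* + *ϕ*)], sampled over one full period of each component, so that the ‘white’ and ‘black’ checks sum to the full stimulus. The sensor array is therefore hypothetical and is not the array used in the original analysis.

```python
import math

def raised_plaid(a, b, phase):
    """Assumed modulator in [0, 1]; phase 0 gives 'white' checks, pi gives 'black'."""
    value = 0.5 + 0.25 * (math.cos(a + phase) + math.cos(b + phase))
    return max(0.0, value)  # guard against tiny negative rounding errors

def check_weights(phase, steps=64):
    """Linear sensor responses to unit contrast over one period of the modulator."""
    grid = [2.0 * math.pi * k / steps for k in range(steps)]
    return [raised_plaid(a, b, phase) for a in grid for b in grid]

def threshold(weights, p, gamma):
    """Contrast C at which [sum_j ((C * w_j)**p)**gamma]**(1/gamma) equals 1."""
    total = sum(w ** (p * gamma) for w in weights)
    return total ** (-1.0 / (p * gamma))

def summation_ratio_db(p, gamma, steps=64):
    """Summation ratio (dB) for one check type versus the full stimulus."""
    single = check_weights(0.0, steps)
    dual = [1.0] * len(single)  # 'white' + 'black' checks give a uniform grating
    return 20.0 * math.log10(threshold(single, p, gamma) / threshold(dual, p, gamma))

def predicted_slope(p):
    """Pelli's (1987) approximation: beta = 1.247 * p."""
    return 1.247 * p

for gamma in (4.0, 2.0, 1.0):
    print(gamma, round(summation_ratio_db(1.0, gamma), 2), predicted_slope(1.0))
```

Under these assumptions, *p* = 1 with *γ* = 1 gives the full 6 dB of the linear model, and raising either exponent lowers the summation ratio, while only *p* changes the predicted slope.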

*γ* = 3 or 4, as in the widely used approximation to probability summation, is clearly inadequate.

*p* and the slope of the psychometric function. This slightly overestimates the level of binocular summation found in this and other experiments (e.g., Baker et al., 2007; Meese et al., 2006).

*m*) precedes binocular pooling and is then followed by spatial pooling and additive noise. When binocular pooling is a MAX operator (thick dashed blue curve) the model fails to reach the requisite levels of summation, regardless of the slope of the psychometric function, illustrating the resounding failure of that model (Meese et al., 2006). When the pooling is linear (solid blue curve), the requisite levels of summation can be achieved, but the slopes of the psychometric functions are far too shallow. In fact, the ‘eyes’ result of Figure 5 implies at least two stages of nonlinear relation between stimulus contrast and response (Baker et al., 2007). We describe and extend this idea in the next section.

*m*). This is shown in the schematic outline in Figure 6. With this arrangement it is possible to calculate the values of *m* and *p* that are needed to exactly fit the summation ratios found in the ‘area’ and the ‘eyes’ experiments (Table 3, middle; see the Appendix for details). However, this arrangement does not produce good predictions for the slopes of the psychometric functions. For example, for TSM and SAW the cascade of fairly low exponents leads to an underestimation of *β*. (Details are not shown, but the overall exponent is equal to the product of those in the cascade, and from Pelli (1987), this is multiplied by 1.247 to predict *β*.) This problem was addressed by introducing a third and final exponent (*u*) placed after the final stage of summation but before the limiting noise (Table 3 and Figure 6). Thus, to summarize, *m* is the exponent needed to fit binocular summation, *mp* is the exponent needed to fit area summation, and *mpu* is the overall exponent needed to fit the (average) slope of the psychometric function. Finally, the limiting noise must be late in order that *u* can affect performance, owing to Birdsall's theorem.
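The relations among the three exponents can be sketched in a few lines. The sketch assumes that, with exponent *m* applied before linear binocular summation, the monocular-to-binocular threshold ratio is 2 ^{1/m}, so that *m* = 20 log _{10}(2)/SR for the ‘eyes’ experiment; this assumption reproduces the tabulated values of *m*. The product *mp* is taken as given here because it was fitted numerically. The function names are illustrative.

```python
import math

PELLI_FACTOR = 1.247  # beta = 1.247 * overall exponent (Pelli, 1987)

def m_from_eyes_sr(sr_eyes_db):
    """Exponent m that gives the observed 'eyes' summation ratio (in dB)."""
    return 20.0 * math.log10(2.0) / sr_eyes_db

def u_from_slope(beta_mean, mp):
    """Final exponent u such that 1.247 * m * p * u equals the mean slope."""
    return (beta_mean / PELLI_FACTOR) / mp

def predicted_beta(m, p, u):
    """Slope predicted by the cascade of exponents m, p, and u."""
    return PELLI_FACTOR * m * p * u

# Observer TSM: 'eyes' SR of 4.95 dB and slopes of 4.1, 3.8, and 4.0.
m = m_from_eyes_sr(4.95)
u = u_from_slope((4.1 + 3.8 + 4.0) / 3.0, 1.95)
print(round(m, 2), round(u, 2))
```

For TSM this gives *m* of about 1.22, in line with Table 3, and a value of *u* close to the tabulated one; small differences arise from rounding of the tabulated inputs.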

Observer | TSM | RJS | SAW | Average |
---|---|---|---|---|
Experimental results | | | | |
Eyes SR (dB) | 4.95 | 5.87 | 4.26 | 5.03 |
Area SR (dB) | 5.10 | 4.83 | 5.00 | 4.98 |
Eyes and area SR (dB) | 4.40 | 4.88 | 3.64 | 4.31 |
Eyes β | 4.1 | 2.5 | 3.6 | 3.3 |
Area β | 3.8 | 2.6 | 3.6 | 3.3 |
Eyes and area β | 4.0 | 2.4 | 4.0 | 3.6 |
β̄ | 3.9 | 2.5 | 3.7 | 3.3 |
P = β̄/1.247 | 3.2 | 2.0 | 3.0 | 2.7 |
Model: Summation across eyes then area | | | | |
m | 1.22 | 1.03 | 1.41 | 1.20 |
p | 1.60 | 2.26 | 1.47 | 1.76 |
mp | 1.95 | 2.32 | 2.07 | 2.11 |
u = P/(mp) | 1.62 | 0.86 | 1.45 | 1.26 |
Equivalent uncertainty (U) | 14 | N/A | 7 | 3 |
Predicted SR for ‘eyes and area’ (dB) | 4.27 | 4.69 | 3.70 | 4.20 |
Model: Summation across area then eyes | | | | |
m | 1.95 | 2.32 | 2.08 | 2.11 |
p | 0.62 | 0.44 | 0.68 | 0.57 |
mp | 1.22 | 1.03 | 1.41 | 1.20 |
u = P/(mp) | 2.60 | 1.94 | 2.13 | 2.22 |
Equivalent uncertainty (U) | 757 | 51 | 111 | 164 |
Predicted SR for ‘eyes and area’ (dB) | 4.95 | 5.87 | 4.26 | 5.02 |

*m, p,* and *u* (Table 3, bottom) after fitting to the ‘eyes’ and ‘area’ experiments as before. Most notably, *p* is compressive (<1) for this arrangement. More importantly, however, this model predicted too much summation for the ‘eyes and area’ experiment in all cases (open small [blue] symbols in Figure 7, and Table 3, bottom), causing it to be rejected in favor of the arrangement in Figure 6.

*m, p,* and *u*; Figure 6). Even so, the overall analysis clearly favored a scheme in which binocular summation is placed before area summation, rather than the other way around (Figure 7 and Table 3).

*m* and *p* (to fit the high levels of summation; Figures 4 and 5) and that additive noise is placed after the transducer *p* but before the spatial MAX operator (as in model B; Table 2). Now a high value of *u* and subsequent late performance limiting noise (as in Figure 6) could be included to increase the slope of the psychometric function without influencing the level of summation.

*additive* noise) in which the spatial MAX operator can survive in a model of spatial summation for the type of stimuli used here.

*d*′ is in the range measured in detection experiments, an accelerating contrast transducer followed by noise can be replaced with a MAX rule that operates on noisy input lines, where the level of uncertainty, *U*, is the number of input lines that are monitored by the observer (and the signal is carried by a single input line). The amount of uncertainty needed grows exponentially with the transducer exponent that it is intended to replace (Pelli, 1985). In fact, there are good reasons to replace the output stage in Figure 6 (exponent *u* and late noise) with this arrangement, as depicted in Figure 8. Because the area summation stage is linear, the limiting noise can be moved to the left-hand side of that stage, placing it earlier than in Figure 6. This arrangement was an important part of the Meese and Summers (2007) model of area summation for the situation where a central patch of grating is grown in size and there is no extrinsic uncertainty. In that model, the region of linear area summation was matched to the size of the target, following retinal inhomogeneity, a nonlinear transducer, and additive noise. This predicted the moderate levels of summation found in the experiment owing partly to the growth of noise with signal area. Thus, we envisage that the area summation region in Figure 8 can be matched to the target diameter, at least up to some range (Meese & Summers, 2007; Summers & Meese, 2007). Whether this involves a flexible mechanism of variable size, or the selection of an appropriately sized pooling mechanism within a discrete set, is unclear.

*U*) needed for the model in Figure 8 were estimated using Pelli's (1985) approximation (his equation 5.4) and are reported in the middle part of Table 3. Note that they are quite modest (because *u* was quite low), as might be expected for highly trained observers. (For completeness, *U* is also reported in Table 3 for the less successful model in which binocular summation follows area summation. Those values are much higher.)

*POOL* _{eye}), 2) area pooling (*POOL* _{j}), 3) contrast transduction *f*() and 4) additive, Gaussian noise (+*G*). We used two types of pooling: linear summation and a MAX rule, giving *POOL*(*x* _{1}..*x* _{n}) = Σ(*x* _{1}..*x* _{n}) or *POOL*(*x* _{1}..*x* _{n}) = MAX(*x* _{1}..*x* _{n}), respectively, where *x* _{i} is the contrast response at the appropriate stage in the model (see below). For Figure 3a there were two types of transducer. For the linear transducer, *f*(*x*) = *x*, and for the nonlinear transducer, *f*(*x*) = *x* ^{2.4} (Legge & Foley, 1980). (Owing to the simplification below, *x* is always ≥0.) For the model curves in Figures 3b, 4a and 5b, the exponent in this expression was varied over the range 1.0 to 3.0 in steps of 0.2.

*j*) over the region of two whole checks, one ‘black’ and one ‘white’ (*j* = 1 to *n*, where *n* = 31 × 63 = 1,953, though this figure is not critical). The left-eye and right-eye linear sensor responses to unit contrast in the ‘eyes and area’ experiment are shown in Figure A1, and are given by *L* _{j} and *R* _{j}. The linear responses to any stimulus contrast (in %) are given by the product of these terms with target contrast *C*.

^{3} Gaussian noise (*G*) drawn independently on each interval of each simulated trial for each transmission line at the stage appropriate for the injection of noise for each model.

*POOL* _{eye}(), *POOL* _{j}(), *f*(), and +*G*) were combined in each of the orders appropriate for the various model architectures under test. For example, the sequence: *POOL* _{eye}, *f*(*x*), *G*, *POOL* _{j}, is given by

*C* is target contrast. Note that pooling took place across eyes and all spatial locations in both 2IFC intervals in each experiment.

*C* appeared in only one interval. The simulated observer selected the interval that produced the largest response using the appropriate model equation (e.g., Equation A1). If the interval contained the target, then the decision was correct, otherwise it was incorrect. The simulations were performed for a wide range of values of *C* in 0.5 dB steps with 2000 trials at each level (a simulated method of constant stimuli). This was done with single and dual increments of *C*. The simulated results were fitted with Weibull functions (Equation 2) to produce predictions for the slopes of the psychometric functions (*β*′) and thresholds at 81.6% correct (*α*′). Model summation ratios (SR′) were given by: 20 log _{10}(*α*′ _{single}/*α*′ _{dual}).
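The simulated 2IFC procedure can be sketched for one architecture, the sequence *POOL* _{eye}, *f*(*x*), *G*, *POOL* _{j} with linear summation at both pooling stages (model A). Equation A1 is not reproduced above, so the response function below is written directly from that stated sequence. The four-sensor array, the unit-variance noise, the contrast units, and all function names are assumptions made for illustration; the original simulations used a much larger sensor array.

```python
import random

def model_a_response(contrast, left, right, p, rng):
    """POOL_eye (sum) -> f(x) = x**p -> +G -> POOL_j (sum) for one interval."""
    total = 0.0
    for l_j, r_j in zip(left, right):
        binocular = contrast * (l_j + r_j)              # linear sum across eyes
        total += binocular ** p + rng.gauss(0.0, 1.0)   # transduce, then add noise
    return total                                        # linear sum over area

def percent_correct(contrast, left, right, p=2.4, trials=2000, seed=0):
    """Proportion of simulated 2IFC trials on which the target interval wins."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        target = model_a_response(contrast, left, right, p, rng)
        null = model_a_response(0.0, left, right, p, rng)
        if target > null:
            correct += 1
    return correct / trials

# Hypothetical sensor responses to unit contrast (single target, left eye only).
LEFT = [1.0, 0.75, 0.25, 0.0]
RIGHT = [0.0, 0.0, 0.0, 0.0]

for c in (0.0, 0.5, 1.0, 2.0):
    print(c, percent_correct(c, LEFT, RIGHT))
```

Repeating this over a range of contrasts for single and dual targets, and fitting a Weibull function to each set of results, would give the model thresholds and slopes from which SR′ and *β*′ are derived.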

*β*).

*C*, assuming unit response at detection threshold, to calculate summation ratios for the single and dual targets. The slope of the psychometric function (*β*) was estimated using Pelli's (1987) approximation where *β* = 1.247*p*, and *p* is the (overall) exponent of the contrast transducer.

*m, p,* and *u* in that order. Both also involved linear summation at the two pooling stages (see Table 3 and Figure 6) and were therefore implemented deterministically, as for the Minkowski summation. For the version in which binocular summation preceded area summation (Table 3, middle), the exponent *m* was determined by analytic solution of the model for the empirical summation ratios (SR) in the ‘eyes’ experiment. The exponent product *mp* was then solved numerically for the empirical SR in the ‘area’ experiment. Finally, the exponent product *mpu* was solved analytically using Pelli's (1987) approximation: *mpu* = β̄/1.247.

*C* for the single and dual targets in that experiment. The slopes of the model psychometric functions were determined by the fitting of *m, p,* and *u* (described above) and were equal to the average empirical estimate, β̄.

*m* was determined by numerical solution for the SR from the ‘area’ experiment and *mp* was then solved analytically from the SR in the ‘eyes’ experiment.

*p* = 2.4, additive noise. As for the other models, summation could be increased by reducing the value of *p*, but this also reduced the slope of the psychometric function. We could find no adequate transducer for this arrangement.
