Open Access
Article  |   July 2023
Suprathreshold length summation
Author Affiliations
  • Chien-Chung Chen
    Department of Psychology, National Taiwan University, Taipei, Taiwan
    Neurobiology and Cognitive Science Center, National Taiwan University, Taipei, Taiwan
    c3chen@ntu.edu.tw
  • Chia-Hua Chien
    Department of Psychology, National Taiwan University, Taipei, Taiwan
    angel18215@gmail.com
  • Christopher W. Tyler
    Smith-Kettlewell Eye Research Institute, San Francisco, CA, USA
    Division of Optometry, School of Health Sciences, City University of London, London, UK
    cwt@ski.org
Journal of Vision July 2023, Vol.23, 17. doi:https://doi.org/10.1167/jov.23.7.17

      Chien-Chung Chen, Chia-Hua Chien, Christopher W. Tyler; Suprathreshold length summation. Journal of Vision 2023;23(7):17. https://doi.org/10.1167/jov.23.7.17.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

To investigate the mechanisms underlying elongated spatial summation with a pattern-masking paradigm, we measured the contrast detection thresholds for elongated Gabor targets situated at 3° eccentricity to either the left or right of the fixation and elongated along an arc of the same radius to access homogeneous retinal sensitivity. The mask was a ring with a Gabor envelope of the same 3° center radius containing either a concentric (iso-orientation mask) or a radial (orthogonal mask) modulation. The task of the observer was to indicate whether the target in each trial was on the left or the right of the fixation. With orthogonal or low contrast iso-orientation masks, target thresholds first decreased with size with slope −1 on log-log coordinates until the target length reached 45′ (specified as the half-height full-width of the Gabor envelope) and then further decreased according to a slope of −1/2, the latter being the signature of an ideal summation process. When the contrast of the iso-orientation mask was sufficiently high, however, the target thresholds, while still showing a −1 slope up to ∼10′, asymptoted up to about 50′ length, suggesting that the presence of the mask eliminated the ideal summation regime. Beyond about 50′, the data approximated another −1 slope decrease in threshold, suggesting the existence of an extra-long channel that is not revealed by the conventional spatial summation paradigm. The full results could be explained by a divisive inhibition model, in which second-order filters sum responses across local oriented channels, combined with a single extra-long filter at least 300′ in extent. In this model, the local filter response is given by the linear excitation of the local channels raised to a power, and scaled by divisive inhibition from all channels in the neighborhood. 
With the high-contrast iso-orientation masks, such divisive inhibition swamps the response to eliminate the ideal summation regime until the stimulus is long enough to activate the extra-long filter.

Introduction
In everyday life, we often need to detect the presence of some discoloration against the background of patterned objects. Examples are detection of stains on patterned clothes, cracks in decorated china, a distressed swimmer in the ocean, a hiker in the undergrowth, and so on. These are tasks of object detection against a patterned background, and many such objects are elongated, as are their boundaries. Moreover, deriving the elongated contours of object boundaries is of particular relevance in understanding the overlapping structure of objects in cluttered scenes (Marr, 1982). 
In this paper, we consider two factors that can affect the detectability of an elongated visual target: the elongation of the target itself and the context in which the target is presented. For the former, in general, the detection threshold of a small visual target decreases with its size (Barlow, 1958; Baumgardt, 1959; Tanner & Jones, 1960; Watson, Barlow, & Robson, 1983; Polat & Tyler, 1999; Tyler & Chen, 2000; Tyler & Chen, 2006; Kao & Chen, 2012; Meese & Summers, 2012; Kingdom, Baldwin, & Schmidtmann, 2015; Chen, Yeh, & Tyler, 2019). This effect, termed “spatial summation” in the literature, manifests in several different forms. When the target is very small, the detection threshold varies in inverse proportion to target size. This type of summation is designated as Ricco's law (Barlow, 1958; Baumgardt, 1959) or complete summation. The common interpretation of such complete summation in terms of neural mechanisms is that, if a target is small enough to fit into the receptive field of the target detector, an increment in target size results in a greater overlap between the target and the receptive field. This overlap in turn produces a greater response in the target detector. Assuming that the noise level within the target detector is independent of target size, and thus constant throughout the experiment, and that the transducer function of the detector is linear, the signal-to-noise ratio experienced by the visual system should increase linearly with target size. Thus, when plotted on log–log coordinates, the threshold should decrease with target size with a slope of −1, as is commonly found (Barlow, 1958; Baumgardt, 1959; Watson et al., 1983; Polat & Tyler, 1999; Tyler & Chen, 2006; Kao & Chen, 2012). 
Another type of spatial summation, designated as Piper's law, has the target detection threshold decreasing with the square root of target size. This behavior is interpretable in terms of an ideal observer who uses a single matched filter to detect each target (Wiener, 1949) and is thus also termed ideal summation (Tanner & Jones, 1960; Tyler & Chen, 2000).1 This optimal strategy can be implemented as a second-order filter that receives inputs from a set of discrete local sensors with independent and identically distributed noise sources (Chen & Tyler, 2001; Chen et al., 2019). The number of sensors that feed into the matched filter is proportional to the size of its receptive field (Green & Swets, 1966), and the response of the matched filter to the target is proportional to the sum of the responses to the individual samples. Thus, the mean and the variance of the matched filter response increase in proportion to target size. Because the detectability (or d′ in the context of signal detection theory) of a target depends on the ratio of the mean to the standard deviation of the response (Green & Swets, 1966), the threshold should decrease with the square root of target size, corresponding to a −1/2 log–log slope summation function. Some formulations thus simply use the square root of the number of local sensors as the denominator of a d′ calculation (e.g., Kingdom et al., 2015). 
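The two scaling rules described so far can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' model: it assumes a linear detector with unit gain, unit-variance independent noise per sensor, and a criterion d′ of 1, and the function names are hypothetical.

```python
import math

def threshold_complete(size, field_size, noise_sd=1.0, d_crit=1.0):
    """Complete summation: the target lies inside one receptive field.

    The signal grows with the overlap (size / field_size) while the noise of
    the single detector stays constant, so threshold is proportional to 1/size.
    Only meaningful for size <= field_size.
    """
    if size > field_size:
        raise ValueError("target exceeds the receptive field")
    overlap = size / field_size
    return d_crit * noise_sd / overlap

def threshold_ideal(n_sensors, noise_sd=1.0, d_crit=1.0):
    """Ideal summation: a matched filter sums n independent local sensors.

    Mean response = n * c and variance = n * noise_sd**2, so
    d' = c * sqrt(n) / noise_sd and threshold is proportional to 1/sqrt(n).
    """
    return d_crit * noise_sd / math.sqrt(n_sensors)

def loglog_slope(f, x1, x2):
    """Slope of log f(x) against log x between two sizes."""
    return (math.log(f(x2)) - math.log(f(x1))) / (math.log(x2) - math.log(x1))

slope_complete = loglog_slope(lambda s: threshold_complete(s, field_size=16), 1, 8)
slope_ideal = loglog_slope(threshold_ideal, 4, 64)
print(round(slope_complete, 3), round(slope_ideal, 3))
```

Under these assumptions the first slope is −1 and the second is −1/2, matching the two regimes described above; any real detector would add nonlinearities that this sketch deliberately omits.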
There is also a type of summation that can be characterized as the detection threshold being determined by the channel with the highest signal-to-noise ratio (SNR) (Tyler & Chen, 2000; Meese & Summers, 2012). Rather than summing their outputs linearly, this “probability summation” behavior (Pelli, 1985; Meese & Summers, 2012; Kingdom et al., 2015) is governed by a selection, or attention, mechanism that identifies the filter, or channel, with the maximum SNR across the array of stimulated local channels while ignoring all unstimulated channels. We therefore refer to this behavior as ideal attentional summation (Tyler & Chen, 2000), because the attentional mechanism monitors only the channels relevant for a given stimulus size. Because the detectability is determined by the channel with the highest SNR, the probability of target detection is then governed by the maximum order statistics of the responses across relevant channels, for which the mean increases and standard deviation decreases with the number of monitored channels (Pelli, 1985; Chen & Tyler, 1999). The SNR in this scenario can be shown by computational simulation to increase initially with the quarter power of target size, with a progressively decreasing power for larger numbers of channels (Tyler & Chen, 2000, Figure 8). 
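The behavior of the maximum order statistic mentioned above can be illustrated with a small Monte Carlo simulation. This is only a sketch under assumed conditions (independent standard normal responses in every channel, no signal present); it does not reproduce the simulations of Tyler and Chen (2000), and the function name is hypothetical.

```python
import random
import statistics

def max_response_stats(n_channels, n_trials=20000, seed=0):
    """Mean and SD of the largest of n_channels independent N(0, 1) responses."""
    rng = random.Random(seed)
    maxima = [
        max(rng.gauss(0.0, 1.0) for _ in range(n_channels))
        for _ in range(n_trials)
    ]
    return statistics.fmean(maxima), statistics.stdev(maxima)

# Collect the statistics for an increasing number of monitored channels.
stats = {n: max_response_stats(n) for n in (1, 4, 16, 64)}
for n, (mean_max, sd_max) in stats.items():
    print(f"{n:3d} channels: mean of max = {mean_max:.2f}, SD of max = {sd_max:.2f}")
```

As the number of monitored channels grows, the mean of the maximum rises and its standard deviation shrinks, which is the property the probability summation account relies on.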
A similar form of summation also occurs when the observer detects a target with a fixed attention window—that is, when the observer monitors a fixed number of channels, and among them only a proportion of these channels responds to the target. The increase of target size would also increase the number of the responsive channels and in turn reduce the ratio between the number of the monitored and the responsive channels, or uncertainty. Such uncertainty reduction approximates a −1/4 slope summation curve until the number of stimulated channels approaches the total number available within the attention window (Tyler & Chen, 2000, Figure 10). 
Notice that the above theories of spatial summation all assume a linear relationship between target contrast and internal detector response. This assumption may be a good first-order approximation when all of the measurements are performed around the detection threshold. However, for a wider stimulus range, it is known from the early era of psychophysics (Fechner, 1860) that human visual performance is mediated by nonlinear response functions. One purpose of the present study was thus to investigate how spatial summation could be affected by such nonlinearities. We utilized the well-established contrast-masking paradigm, in which the task of the observer is to detect a visual target superimposed on a mask (Stiles, 1959; Legge & Foley, 1980; Foley, 1994). A typical result of the masking paradigm is that the target detection threshold first decreases (facilitation) and then increases (masking) with mask contrast, resulting in a dipper-shape target threshold versus the mask contrast (TvC) function (Legge & Foley, 1980; Chen & Tyler, 2001). It is suggested that such a dipper shape function reflects the nonlinearity of the internal response function—that is, relative to the response without a mask, if the mask intensity or contrast produces a response in an accelerating phase of the response function, the system requires less extra contrast to increase the response to a level that overcomes the limitation of noise. Thus, the target threshold at this mask contrast level will be lower than the threshold with no mask. On the other hand, if the mask contrast is in a decelerating phase of the response function, the visual system then requires a greater contrast to overcome the noise compared with the response without a mask, and the target threshold at this mask contrast level will be greater than the threshold with no mask. 
Thus, under the assumption that the limiting noise in the system is independent of the stimuli, by measuring the target threshold with a systematically varying mask contrast, it is possible to estimate the nonlinearity of the internal response function. 
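The link between a nonlinear response function and a dipper-shaped TvC function can be made concrete with a toy transducer. The sketch below assumes a response of the form gain × c^p / (z + c^q), which accelerates at low contrast and is compressive at high contrast, together with a fixed response increment at threshold. The functional form is in the spirit of Legge and Foley (1980), but the parameter values, units, and function names are illustrative assumptions, not fits to the present data.

```python
def response(c, p=2.4, q=2.0, z=1.0, gain=10.0):
    """Hypothetical transducer: accelerating at low contrast, compressive at high."""
    return gain * c ** p / (z + c ** q)

def increment_threshold(pedestal, criterion=1.0, hi=1000.0, iters=200):
    """Increment dc with response(pedestal + dc) - response(pedestal) = criterion.

    Found by bisection, which is valid because response() increases
    monotonically with contrast for these parameter values.
    """
    base = response(pedestal)
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if response(pedestal + mid) - base < criterion:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Mask (pedestal) contrasts in arbitrary units, from none to high.
pedestals = [0.0, 0.25, 0.5, 1.0, 3.0, 10.0, 30.0]
tvc = {c: increment_threshold(c) for c in pedestals}
for c, t in tvc.items():
    print(f"pedestal {c:5.2f} -> threshold {t:.3f}")
```

With these assumed parameters the threshold at a moderate pedestal falls below the no-pedestal threshold (facilitation) and rises above it at high pedestals (masking), which is the dipper shape described above.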
In the present study, we measured the contrast threshold for pattern targets varying in length and superimposed on pattern masks of various contrast levels. Threshold measurements for targets of different lengths provided the spatial summation curves. Use of different levels of mask contrast provided information regarding the shape of the internal response function. We used radial Gabor masks with carrier orientations that were either parallel or orthogonal to the target carrier orientation, for it is known that these two types of mask produce very different effects on the internal response function (Foley, 1994), with the orthogonal masks having the weaker effect. 
Previously, Meese and colleagues also measured the TvC functions of visual targets in different sizes and mask contrasts (Meese, 2004; Meese & Summers, 2007). However, their studies involved only two target sizes and thus did not provide a full description of the spatial summation curves or their changes as a function of contrast. We thus overcame the limitations of previous studies by using a sufficient number of levels of both target length and mask contrast for a quantitative analysis of the summation behavior for suprathreshold contrast processing. 
Methods
The experimental methods were similar to those in Chen et al. (2019) except for the spatial properties of the masks and the participants. 
Apparatus
All stimuli were presented on two ViewSonic pf75+ 15-inch monitors (ViewSonic Corporation, Brea, CA) controlled by an ATI Radeon 7200 video card (ATI, Inc., Dallas, TX) on a MacPro computer (Apple, Inc., Cupertino, CA). The video card provided a 10-bit digital-to-analog converter depth that allowed accurate representation of visual stimuli at low contrasts. The mask was presented on one monitor and the target on the other. Light from the two monitors was combined by a beam splitter so that the observer experienced the stimuli from the two monitors superposed. This arrangement provided the advantage of independent control of the contrast of the target and the mask. The viewing field was 13.9° (horizontal) by 10.4° (vertical) with a resolution of 1024 horizontal by 768 vertical pixels, giving approximately 74 pixels per degree at the 135-cm viewing distance. The refresh rate of the monitors was 85 Hz, with a mean luminance of 48.5 cd/m2. The luminance of the monitor at each output setting was measured with a Photo Research PR655 spectroradiometer (Photo Research, Inc., Chatsworth, CA), and the values were used to construct a linearized lookup table. 
Stimuli
As in Chen et al. (2019), the target (Figure 1) was an elongated Gabor patch arced along the circumference of an invisible circle of 3° radius centered at the fixation (o in Figure 1A). That is, the target was defined by  
\begin{equation}
G(r,\theta;\,\sigma_\theta, c) = L + L \times c \times \cos(2\pi f r) \times \exp\left( -\frac{(r - u)^2}{2\sigma_r^2} \right) \times \exp\left( -\frac{\theta^2}{2\sigma_\theta^2} \right) \tag{1}
\end{equation}
where r and θ are the radius and the planar angle of a pixel in polar coordinates (as noted in Figure 1B); L is the mean luminance of the display; c is the contrast; f is the spatial frequency of 2.5 c/°; σr and σθ are the scale parameters (standard deviations) of the Gaussian envelope along the radius and circumference, respectively; and the center of the Gabor ring, u, was at −3° or +3° eccentricity for patterns presented on the left or right of screen, respectively. The parameters σr and σθ controlled the size of the stimuli. The value of σr was fixed at 0.14° subtense, and σθ varied from 0.5° to 40° along the circumference. This arrangement allowed the stimuli to remain at the same cortical magnification factor regardless of their length. For better comparison with results of the previous studies, from here on we specify the length of the targets by their half-height full-width (denoted HHFW in Figure 1A) of the Gaussian envelope, which is 2 × [−ln(0.5) × 2]^0.5 × u × σθ = 7.06 × σθ, where σθ is its width parameter in radians and HHFW is in degrees of visual angle. 
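The conversion from the angular scale parameter to target length can be written out as a short calculation. The sketch below simply evaluates the HHFW expression given above for the two ends of the stated σθ range; the function name and the choice to report the result in arcmin are assumptions made for illustration.

```python
import math

def hhfw_arcmin(sigma_theta_deg, radius_deg=3.0):
    """Half-height full-width of the Gaussian envelope along the arc, in arcmin.

    sigma_theta_deg: angular scale parameter of the envelope, in degrees of
    polar angle. radius_deg: eccentricity of the arc, in degrees of visual angle.
    """
    sigma_rad = math.radians(sigma_theta_deg)
    fwhm_factor = 2.0 * math.sqrt(2.0 * math.log(2.0))  # about 2.355
    hhfw_deg = fwhm_factor * radius_deg * sigma_rad      # arc length, deg of visual angle
    return 60.0 * hhfw_deg

shortest = hhfw_arcmin(0.5)
longest = hhfw_arcmin(40.0)
print(f"{shortest:.1f} arcmin to {longest:.1f} arcmin")
```

Given the stated range of σθ, the target lengths run from a few arcmin to just under 300 arcmin, which is the range over which the summation curves are plotted.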
Figure 1.
(A) Diagram of the apparatus setup showing the relative position of the components. The distances are not to scale. (B) The targets were Gabor arcs, centered at the fixation point, f, of different lengths defined in polar coordinates with radius r and angle θ. For ease of comparison with the literature, we report the target size in half-height full width (HHFW). The target was embedded in either a parallel (C) or an orthogonal (E) circular mask, as shown in D and F, respectively.
The masks were full circular (ring) Gabor patterns with 3° central radius, achieved by setting the scale parameter σθ to be infinite. The orientation of the mask was either the same as or orthogonal to the orientation of the target. There were six possible mask contrasts: 20 × log10(c) = −∞, −26, −22, −18, −14, or −10 dB. Other than these properties, all other image parameters were the same as those of the target. 
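For readers who prefer contrast as a proportion, the decibel levels listed above can be converted with the inverse of the stated definition, 20 × log10(c). The helper names below are hypothetical; the conversion itself follows directly from that definition.

```python
import math

def db_to_contrast(db):
    """Convert a level in dB (20 * log10 of contrast) to contrast as a proportion."""
    if db == -math.inf:
        return 0.0  # the no-mask condition
    return 10.0 ** (db / 20.0)

def contrast_to_db(c):
    """Inverse conversion; zero contrast maps to minus infinity."""
    return -math.inf if c == 0 else 20.0 * math.log10(c)

mask_levels_db = [-math.inf, -26, -22, -18, -14, -10]
mask_contrasts = [db_to_contrast(db) for db in mask_levels_db]
print([round(c, 3) for c in mask_contrasts])
```

On this scale the mask contrasts span from zero (no mask) through about 5% at −26 dB to about 32% at −10 dB.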
Procedure
We used a spatial two-alternative forced-choice procedure to measure the target threshold. On each trial, the target was presented either to the left or to the right of the fixation. A 200-ms audio tone signaled the start of each trial, followed by a 295-ms stimulus presentation and then by a response interval that lasted until a valid response was recorded. The next trial started 800 ms after the response. The task of the observer was to indicate whether the target was on the left or the right side of the display. 
In each threshold run, the mask contrast and the target size were held constant. There were six possible mask contrasts and 10 target sizes, making a total of 60 test conditions. The target contrast in each trial was determined by the Ψ threshold-seeking algorithm (Kontsevich & Tyler, 1999), which was set to measure the threshold at the 75% correct response level. Each run consisted of two practice trials followed by 40 trials. Each reported datum point was an average of four to eight repeated measures, with error bars representing 1 standard error of the mean (SEM). The sequence of test conditions and their repetitions were all randomized. 
Four observers (three females and one male; ages ranging from late teens to early 20s) participated in the study. They were paid observers naïve to the purpose of the experiment. All participants had normal or corrected-to-normal visual acuity (20/20). 
Results
Spatial summation
Figure 2 shows the spatial summation results, or how target threshold changed with size increments, on parallel masks. The smooth colored curves represent the fits of the multi-stage model described in the Discussion section. When there was no mask (−∞ mask contrast), the threshold first decreased with target size with a log–log slope of approximately −1 (dashed line) until the target arc length reached about 10 arcmin. Beyond this length, the target threshold decreased with a shallower slope approximating −1/2 (dotted line). Such −1/2 slope summation has been commonly observed for spatial summation with periodic patterns (Polat & Tyler, 1999; Chen & Tyler, 2006; Kingdom et al., 2015), faces (Tyler & Chen, 2006), and texts (Kao & Chen, 2012). There were telltale signs that the no-mask functions became even shallower for the largest sizes. Such flattening may be consistent with the −1/4 slope summation observed with spatial summation with random dots (Tyler & Chen, 2006), with periodic patterns extended along the axis orthogonal to its orientation (Robson & Graham, 1981; Polat & Tyler, 1999) or with interleaved full-field stimuli (Meese & Summers, 2007; Baldwin & Meese, 2015). 
Figure 2.
Summation functions, or target threshold versus target length functions, for targets on parallel masks. Each row represents the data from one observer. To avoid clutter, the panels in the left column show the summation curves with no mask, those in the central column with low- to medium-contrast masks (−26, −22, and −18 dB), and those in the right column with high-contrast masks (−14 and −10 dB). Error bars represent 1 SEM. Smooth colored curves represent the fits of the multi-stage model described in the Discussion section. For ready comparison, dashed lines represent −1 and dotted lines represent −1/2 slope summation.
Although the spatial summation curves on the low-contrast masks had a similar form to the unmasked function, those on high-contrast parallel masks showed a very different form. The target threshold did decrease with a −1 slope for small target sizes, but the −1/2 slope portion of the summation curve was less evident. That is, when the target arc length was greater than about 10 to 20 arcmin, the target threshold leveled off for a while as target length increased. For two of the three observers, the summation curve then decreased again with a slope approaching −1 when the target length further increased beyond about 140 arcmin. Observer YCY did not show such an extra −1 slope component. Because of this individual difference, we tested the stability of this extra −1 slope component by measuring the thresholds of the three largest targets in the presence of the highest-contrast mask in six more observers (see Supplementary Material). On average, this larger dataset confirmed the existence of the −1 slope component. 
The spatial summation curves on the orthogonal mask (Figure 3) were generally similar in shape to those in the no-mask condition: a −1 slope for target sizes less than 10 to 20 arcmin followed by a −1/2 slope for larger target sizes. The summation curve for the −10 dB mask, the highest contrast we used, showed signs of flattening toward the right. 
Figure 3.
Summation for targets on orthogonal masks. Each row represents the data from one observer. Again, to avoid clutter, the panels in the left column show the summation function with no mask, those in the central column with low- to medium-contrast masks (−26, −22, and −18 dB), and those in the right column with high-contrast masks (−14 and −10 dB). Error bars represent 1 SEM. Smooth colored curves represent the fits of the multi-stage model described in the text. For ready comparison, the dashed and dotted lines represent summation slopes of −1 and −1/2, respectively.
To quantify the comparison of these trends, we fit the summation curves of the two extreme conditions, the −∞ and −10 dB parallel masks, with the generic three-component model of Tyler and Chen (2006) (also see Kao & Chen, 2012) that considers all three common types (slopes of −1, −1/2, and −1/4) of spatial summation (Figure 4). That is, the threshold, th, is given by  
\begin{equation}
th = \left( \left( a_1 - \log(\mathit{size}) \right)^4 + \left( a_2 - 0.5\log(\mathit{size}) \right)^4 + \left( a_3 - 0.25\log(\mathit{size}) \right)^4 \right)^{1/4} \tag{2}
\end{equation}
 
Figure 4.
Example fits of three generic models to thresholds as a function of target size for the no-pedestal (closed circles) and −10-dB contrast pedestal (open squares) conditions. (A) The three-component model (solid curve) of Tyler and Chen (2006). (B) Exploring the model with a size invariant (Equation 3, dotted curve) and an extra −1 slope component (Equation 4, dashed curve) for high pedestal contrast data. The three-component model is replotted here for reference. The data points plotted here were from the observer YSC shown in Figure 2.
This three-component model (thick continuous curve in Figure 4) fits the no-mask summation curve (closed circles) quite well. Across the three observers, the model explained 97% of the variance in the averaged data with root mean square error (RMSE) at 0.82 to 1.02 dB, comparable with the mean SEM of 0.80 to 1.11 dB for the three observers. However, this model cannot fit the high-contrast mask data (open symbols in Figure 4). It underestimates the target thresholds at small target sizes and overestimates them at large sizes (Figure 4A). The RMSE increased to 1.68 to 2.05 dB (compared with an SEM of 1.22 to 1.61 dB), a 63% to 178% increase over the no-mask condition. It is noteworthy that removing the −1/2 slope component from this generic model did not affect the fits, as the RMSE was the same to the second decimal place, suggesting that the spatial summation curves at high contrast do not have a significant contribution from the −1/2 slope component. 
Because neither the −1/2 nor the −1/4 slope components can capture the summation curves at high contrasts, we replaced them with an invariant component. That is,  
\begin{equation}
th = \left( \left( a_1 - \log(\mathit{size}) \right)^4 + a_4^{\,4} \right)^{1/4} \tag{3}
\end{equation}
 
This model (dotted curve in Figure 4B) captures thresholds at small to medium size better, but not at large target sizes. The RMSE (1.32–1.84) fared somewhat better than in Equation 2. We then added the extra linear summation component at the large target length range:  
\begin{equation}
th^{\prime} = \left( th^{-4} + \left( a_5 - \log(\mathit{size}) \right)^{-4} \right)^{-1/4} \tag{4}
\end{equation}
where th is defined in Equation 3. This model captured all features in the data and gave a better fit (RMSE = 1.18–1.62) than Equation 3. Thus, we can conclude that at high mask contrasts, the summation curves mainly exhibit −1 slope for small lengths, flattening in the middle, and conforming to another −1 slope at large target lengths. 
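The three descriptive models in Equations 2 to 4 are simple enough to write out directly. The sketch below is an illustration only: it assumes base-10 logarithms and a threshold expressed in log units (the text does not specify either), it is only sensible where every bracketed term is positive, and the parameter values are hypothetical rather than the fitted ones. No fitting routine is included.

```python
import math

def th_three_component(size, a1, a2, a3):
    """Equation 2: fourth-power combination of -1, -1/2 and -1/4 slope components."""
    log_size = math.log10(size)
    parts = (a1 - log_size, a2 - 0.5 * log_size, a3 - 0.25 * log_size)
    return sum(p ** 4 for p in parts) ** 0.25

def th_invariant(size, a1, a4):
    """Equation 3: a -1 slope component combined with a size-invariant level a4."""
    return ((a1 - math.log10(size)) ** 4 + a4 ** 4) ** 0.25

def th_extra_long(size, a1, a4, a5):
    """Equation 4: Equation 3 combined with an extra -1 slope component.

    The negative exponents make this a soft minimum of the two terms, so the
    extra component takes over once it drops below the Equation 3 value.
    """
    th = th_invariant(size, a1, a4)
    extra = a5 - math.log10(size)
    return (th ** -4 + extra ** -4) ** -0.25

# Hypothetical parameters, chosen only so that all terms stay positive.
sizes = [4, 10, 30, 100, 300]
curve3 = [th_invariant(s, a1=3.0, a4=1.0) for s in sizes]
curve4 = [th_extra_long(s, a1=3.0, a4=1.0, a5=3.5) for s in sizes]
for s, t3, t4 in zip(sizes, curve3, curve4):
    print(f"size {s:4d}: Eq. 3 = {t3:.3f}, Eq. 4 = {t4:.3f}")
```

With these assumed values, the Equation 3 curve flattens toward a4 at intermediate sizes, whereas the Equation 4 curve continues to fall at the largest sizes, which is the qualitative pattern the extra −1 slope component was introduced to capture.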
Target threshold versus mask contrast function
Figure 5 shows the data replotted as TvC functions for the three observers, with target size as the parameter. Different colors and symbols represent the TvC functions measured for the various target sizes. (The smooth curves are the fits of our extended model, described below.) With parallel masks, the TvC functions for the large targets (length > 36′) showed a shallow dipper shape: the threshold first decreased and then increased with increasing mask contrast. The TvC functions for different target sizes converged at high mask contrasts. Meese (2004) reported a similar convergence of TvC functions at high mask contrasts for his small and large targets on a large mask. When the target size was small, the TvC functions showed less threshold variation with mask contrast than for large targets. As a result, the TvC functions for small targets were rather flat, manifesting virtually no effect of the presence of the mask even up to the highest contrasts. The TvC functions for the orthogonal masks (Figure 6) showed trends similar to those for the parallel masks, although their facilitation was less pronounced, consistent with previous reports (e.g., Foley, 1994; Chen & Tyler, 2006). 
Figure 5.
 
The data plotted as target threshold versus mask contrast (TvC) functions for parallel masks in three observers and at various target sizes (see legend). Error bars represent 1 SEM. Smooth curves are the fit of the multi-stage model described in the text.
Figure 6.
 
TvC functions for orthogonal masks in two observers and at various target sizes (see legend). Error bars represent 1 SEM. Smooth curves are the fit of the multi-stage model described in the text.
Discussion
In this study, we measured the contrast detection threshold of elongated Gabor targets for a wide range of lengths extending over homogeneous retina in the presence of parallel or orthogonal masks of various contrasts. For orthogonal or low-contrast parallel masks, the target contrast detection threshold decreased with target arc length, with a slope approximating −1 on log–log coordinates up to about 10′ to 20′, then with a slope of −1/2 as the size further increased to about 100′, and then with a slope of −1/4 or less up to the largest measured target length. This result is consistent with the spatial summation behavior for various types of stimuli in the literature. 
On high-contrast parallel masks, however, the spatial summation functions were quite different. First, the −1/2 slope region was missing. Instead, detection thresholds approached invariance with length in the same region. Second, there was an extra −1 slope threshold reduction for large target lengths. As discussed in the Introduction, the −1 slope suggests summation within a fixed-size receptive field. Thus, the way the data asymptote to this late −1 slope region implies a target detector with an extra receptive field longer than our longest target. Specifically, any significant slope increase in the data at long lengths implies the presence of an extra mechanism that would generate a slope of −1 at its inflection point (unless there were even longer mechanisms coming into play), with a length about a factor of 3 longer than this inflection point (as estimated from the dotted curve in Figure 4B). With respect to our circular-arc paradigm, this inference essentially implies that it is integrating over the full circle. 
As developed in the Introduction, the usual interpretation of the −1/2 slope summation is derived from the matched filter assumption (Wiener, 1949) or the ideal observer analysis (Tanner & Jones, 1960; Green & Swets, 1966), both of which assume linear summation over the local units contributing to the matched filter. That is, if the observer has complete knowledge of the stimulus and uses a matched filter to detect its presence, both the signal and the noise level (expressed as variance) in the system increase in proportion to target size (Tyler & Chen, 2000). Thus, the detectability or d′, which is the ratio between the signal and the noise level in standard deviation units, increases with the square root of the target length. As a result, the detection threshold decreases with target length with a slope of −1/2. The elimination of the −1/2 slope region for parallel masks suggests that the observer was no longer able to adopt this matched-filter strategy. This may occur when the high-contrast mask, as a highly salient stimulus, determines the spatial extent of the filter used by the observers. Now, if the matched filter is a single receptive field over the whole extent of the mask configuration, an increase in target size would simply increase the overlap between the target and this receptive field, conforming to the case of complete summation as discussed in the Introduction. In this case, we would expect thresholds to decrease with target size with a −1 slope over all target lengths in place of the −1/2 slope region. Our data did not show this effect. 
On the other hand, if the performance was the result of attending to the array of local receptive fields specific to the range defined by the target (Tyler & Chen, 2000; Chen et al., 2019), an increase of target size would lead to more activated receptive fields among the total number of receptive fields that are activated by the pedestal. Rather than summing linearly, the ideal attention mechanism selects the local receptive field with the highest SNR on each trial. We thus have the scenario of summation within an ideal attention window, or attentional summation, as mentioned in the Introduction, and the threshold should decrease with the target size with a slope decreasing from the initial value of −1/4. We did observe a flattening of the size summation function for high-contrast parallel pedestals. The models are difficult to distinguish when the slope becomes less than −1/2, but it is clear that the system runs out of receptive fields to match the longer stimuli until it gets to the additional longest one. 
The diminished −1/2 slope region may be due, on the other hand, to the involvement of another process acting in opposition to the ideal summation and thus canceling out the effect. One might find clues to this process in the shape of the TvC functions (Figures 5 and 6), which varied with target size. For large targets, the target threshold first decreased and then increased as the mask contrast increased, forming a weak dipper shape function (Legge & Foley, 1980). The TvC functions for small targets, however, showed a diminished masking effect, resulting in a relatively flat function. The common interpretation of the dipper function is that it reflects an accelerating nonlinearity in the contrast response function of the target mechanism (Nachmias & Sansbury, 1974; Legge & Foley, 1980). Thus, for the present peripheral stimuli, the contrast response function was almost linear at low pedestal contrasts, saturating early for higher pedestal contrasts; for elongated stimuli, it became almost perfectly linear, with neither accelerating nor compressive nonlinearities. 
The threshold increment, or masking condition provided by the pedestal, is often considered to be the result of an inhibitory signal from the mask to the target detection mechanism (Foley, 1994). Thus, in our data, the masking effect increased with target length, suggesting that the mask, despite having an invariant length throughout the experiment, produced a greater inhibition to the target mechanism as the target length increased. Such target-size–dependent inhibition would be sufficient to cancel out the ideal summation. Next, we explain these effects with a quantitative model that provides the data fits for Figures 2 to 6.
Extended model
We extended the multiple-stage model for spatial summation proposed by Chen et al. (2019) to account for these new pattern-masking results (Figure 7). The original Chen et al. model contained four elements: (1) a band of small local linear filters operating on the input images; (2) contrast normalization that accounts for pattern detection; (3) ideal summation; and (4) decision making under noise, which generates the theoretical functions plotted in Figures 2 to 6. The combination of elements (1), (2), and (4) matches the model used by Chou, Yeh, and Chen (2014) to account for the TvC functions in the noise-masking paradigm, whereas element (3) accounts for the length summation behavior. To fit the current results, the noise level of element (4) should be held constant, as there was no manipulation of external noise sources in the current experiment. To account for the spatial summation results, we then added a size-dependent summation pool for divisive inhibition to account for the elimination of the ideal summation at high contrasts (5), and a large linear filter followed by contrast normalization to account for the extra −1 slope region of the summation curve at large target size (6). 
Figure 7.
 
Diagram of the gain control model. See text for details.
The front end of the model contains a set of local filters that operate on the input images. Each local filter j has a spatial sensitivity profile fj(x,y). The excitation of this linear filter by the ith image component Cigi(x,y), where gi(x,y) defines the contrast independent spatial variation and i indexes either the target or the mask in our experiment, is given as  
\begin{eqnarray}E_{ji}^\prime = {\Sigma _x}{\Sigma _y}{C_i}{g_i}\left( {x,y} \right)\,{f_j}\left( {x,y} \right) \qquad \end{eqnarray}
(5)
 
We assume that the local filters have Gabor sensitivity profiles as defined for Equation 1 in the Methods section, except that we replaced the space constant (“standard deviation”) of the stimulus, σθ, with the space constant of the filter, sθ. 
Summing over space, Equation 5, as derived in Chen & Tyler (1999), becomes  
\begin{eqnarray}E_{ji}^\prime {\rm{ }} = S{e_{ji}}{C_i}{\left( {{{s_\theta ^2\sigma _\theta ^2} \big/ {\left( {s_\theta ^2 + \sigma _\theta ^2} \right)}}} \right)^{1/2}} \quad \end{eqnarray}
(5′)
where Seji is a constant depending on the spatial properties, such as spatial frequency or orientation, of both the filter and the target, which thus defines the excitatory sensitivity of the mechanism to the target, and i = t or m for target or mask, respectively. 
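As a rough numerical check of the size term in Equation 5′, the sketch below sums the product of two Gaussian envelopes along the length axis and compares it with the closed form. It assumes envelopes of the form exp(−x²/2σ²) and ignores the carrier, whose contribution is taken to be absorbed in the constant Seji; the function names, grid range, and step are illustrative choices.

```python
import math

def overlap_numeric(filter_s, stim_sigma, half_range=20.0, step=0.01):
    """Riemann sum of the product of the filter and stimulus envelopes."""
    steps = int(round(2 * half_range / step))
    total = 0.0
    for k in range(steps + 1):
        x = -half_range + k * step
        stim = math.exp(-x * x / (2.0 * stim_sigma ** 2))
        filt = math.exp(-x * x / (2.0 * filter_s ** 2))
        total += stim * filt
    return total * step

def overlap_closed_form(filter_s, stim_sigma):
    """sqrt(2*pi) * (s^2 sigma^2 / (s^2 + sigma^2))^(1/2)."""
    s2, sig2 = filter_s ** 2, stim_sigma ** 2
    return math.sqrt(2.0 * math.pi) * math.sqrt(s2 * sig2 / (s2 + sig2))

# Filter space constant fixed at 1; stimulus space constant varied.
for stim_sigma in (0.1, 1.0, 10.0):
    print(stim_sigma,
          overlap_numeric(1.0, stim_sigma),
          overlap_closed_form(1.0, stim_sigma))
```

With these assumptions the excitation grows in proportion to stimulus size while the stimulus is much smaller than the filter and saturates once the stimulus exceeds the filter, which is the behavior that produces complete summation within a single receptive field.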
The output of the linear filters is halfwave-rectified (Foley, 1994; Teo & Heeger, 1994; Chen & Tyler, 1999; Foley & Chen, 1999) to produce the rectified excitation Eji:  
\begin{eqnarray}{E_{ji}} = \max \left( {E_{ji}^\prime ,\,0} \right) \quad \end{eqnarray}
(6)
where max denotes the operation of choosing the greater of the two terms. 
The response of the jth local filter, Rj, is given by the excitation of that filter, Ej, raised to a power p and divided by the sum of a divisive inhibition term, Ij, and an additive constant, z, where Ej = ΣiEji is the sum of the excitations produced by all image components:  
\begin{eqnarray} {R_j} = {{E_j^p} \big/ {\left( {{I_j} + z} \right)}} \quad \end{eqnarray}
(7)
where Ij is the summation of a nonlinear combination of the excitations of all relevant filters to filter j. Thus, this divisive inhibition term, Ij, can be represented as  
\begin{eqnarray}{I_j} = {\Sigma _i}{\left( {S{i_{ji}}{C_i}} \right)^q} \quad \end{eqnarray}
(8)
where Siji is the inhibitory sensitivity, that is, the weight with which the ith image component contributes to the inhibition term of filter j. Here, we assume that the range of the inhibition sources is stimulus-size dependent. 
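The local response stage (Equations 6 to 8) can be sketched in a few lines. The sensitivities and exponents below are hypothetical values chosen only for illustration, not the fitted values of Table 1, and the size term of Equation 5′ is set to 1 for simplicity.

```python
def rectify(linear_excitation):
    """Equation 6: half-wave rectification of a linear filter output."""
    return max(linear_excitation, 0.0)

def local_response(linear_excitations, inhibitory_drives, p, q, z):
    """Equation 7, with E_j summed over components and I_j = sum of (Si*C)^q."""
    e_j = sum(rectify(e) for e in linear_excitations)
    i_j = sum(drive ** q for drive in inhibitory_drives)
    return e_j ** p / (i_j + z)

# Hypothetical parameters (assumptions for illustration only).
SE_T, SE_M = 100.0, 50.0   # excitatory sensitivity to target and mask
SI_T, SI_M = 80.0, 60.0    # inhibitory sensitivity to target and mask
P, Q, Z = 2.4, 2.0, 10.0

def response(c_target, c_mask):
    """Response of one local filter to a target and mask of given contrasts."""
    return local_response(
        [SE_T * c_target, SE_M * c_mask],
        [SI_T * c_target, SI_M * c_mask],
        P, Q, Z,
    )

print(response(0.0, 0.1), response(0.01, 0.1))
```

In this sketch, adding a small target to a fixed mask raises the response, and the size of that increase relative to the noise determines detectability in the later stages of the model.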
A set of second-order detectors pools the responses of the local filters defined in Equation 7 (see also Morgan & Moulden, 1986) at a given location in the visual field. Each second-order detector has a different length summation field size and thus sums a different number of local filters, with the kth second-order detector receiving inputs from nk local filters. The overall response of the kth second-order detector, Tk, is then simply  
\begin{eqnarray}{T_k} = \Sigma _{j = 1}^{{n_k}}{R_j} \quad \end{eqnarray}
(9)
 
In our experiment, the observer compares the response to the target + mask (Tk,t+m) with that to the mask alone (Tk,m). The observer can detect the target if the difference between these responses in at least one second-order detector is greater than the limit imposed by the noise. That is, the decision is based on the second-order filter that has the maximum SNR. As shown in Chen et al. (2019), this second-order filter would generally have a receptive field covering the same area as the target. We can thus drop the subscript k and the contribution of the first-order filters, making the decision variable d based on the second-order filter:  
\begin{eqnarray}d = {{\left( {{T_{t + m}}-{T_m}} \right)} \big/ {{\sigma _a}}} \quad \end{eqnarray}
(10)
where σa denotes the magnitude of the noise limiting this second-order filter. 
Because the second-order filter sums linearly the outputs of many local first-order filters, the noise it experiences is the sum of the noises in all the local filters that feed into it. Assume that the noise in each local filter is independent and identically distributed (IID) with variance σa². The variance of the noise experienced by the second-order filter is then nσa², where n is the number of local filters it monitors, which in turn is proportional to the target size. Let n = rS, where r is the scaling factor and S is the target size. The decision variable based on the second-order filter is then:  
\begin{eqnarray} \begin{array}{@{}l@{\;}c@{\;}l@{}} d &=& n{\rm{ }}\left( {{R_{t + m}} - {R_m}} \right)/{(n\,\sigma _a^2)^{1/2}}\\ &=& {n^{1/2}}({R_{t + m}} - {R_m})/{\sigma _a} \end{array} \quad \end{eqnarray}
(11)
 
To avoid overfitting, we simply set σa to be 1, as any non-unity constant here can be absorbed into other parameters. We also set n = 1 if r × S is smaller than 1. 
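A minimal sketch of the decision stage in Equation 11, including the floor of n at 1, is given below. The function names are illustrative, and the response values passed in are arbitrary numbers rather than outputs of the fitted model.

```python
import math

def n_filters(size, r):
    """Number of local filters pooled: n = r * S, floored at 1."""
    return max(1.0, r * size)

def decision_variable(resp_target_plus_mask, resp_mask, size, r, sigma_a=1.0):
    """Equation 11: d = n^(1/2) * (R_t+m - R_m) / sigma_a."""
    n = n_filters(size, r)
    return math.sqrt(n) * (resp_target_plus_mask - resp_mask) / sigma_a

# Same local response difference, two target sizes (arbitrary units).
print(decision_variable(1.5, 1.0, size=2.0, r=0.25))   # n floored at 1
print(decision_variable(1.5, 1.0, size=16.0, r=0.25))  # n = 4
```

For a fixed local response difference, d stays constant while r × S is below 1 and grows with the square root of size beyond that point.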
For large targets, the empirical data suggest the involvement of a single detector with a large, elongated receptive field. This can be implemented with the same formulation as Equations 5 to 11, with a larger filter space constant sθ in Equation 5′ and n = 1 in Equation 11. Let dL, replacing d in Equation 11, be the contribution of this large detector to the decision. The final decision variable is then:  
\begin{eqnarray}d^{\prime} = {\left( {{d^4} + d_L^4} \right)^{1/4}} \quad \end{eqnarray}
(12)
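The fourth-power combination in Equation 12 behaves almost like a maximum rule, as the following sketch illustrates. The function name and the example values are assumptions for illustration.

```python
def combine(d_short, d_long, beta=4.0):
    """Equation 12: Minkowski combination of the two decision variables."""
    return (d_short ** beta + d_long ** beta) ** (1.0 / beta)

print(combine(1.0, 1.0))   # equal contributions: 2 ** 0.25
print(combine(2.0, 0.5))   # dominated by the larger contribution
```

Because of the high exponent, the weaker mechanism adds little unless the two contributions are similar in size, so the extra-long detector affects the prediction mainly where the pooled local filters are ineffective.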
 
In practice, to further avoid overdetermination in the fits, we fixed the excitatory sensitivity to the target, Set, at 100. In total, there were thus just eight free parameters in the model for the summation mechanism to account for the 60 data points for one observer: excitatory sensitivity to the mask (Sem), inhibitory sensitivity of the local detector to the target and the mask (Sit and Sim, respectively, in Equation 8), the exponents and additive constant of the nonlinear response function (p, q, and z, respectively, in Equations 7 and 8), the size of the local filter (sθ in Equation 5′), and the scale parameter (r in Equation 11). In addition, for parallel masks only, there are five free parameters for the extra-long receptive field (Sem, Sit, Sim, z, and sθ). 
We used the Powell method (Press, Flannery, Teukolsky, & Vetterling, 1988) to find the least-squares fits to the data. As shown by the smooth curves in Figures 5 and 6, this model fits the data well. The RMSE across the curves was between 1.40 and 1.72 dB, comparable to the mean SEM of 1.24 dB, and it explained 92% to 93% of all variance in the data across the five datasets. Table 1 summarizes the fitted parameter values and goodness of fit for each participant. 
Table 1.
 
Best-fit parameter values and goodness-of-fit for each participant. See text for the meaning of the parameters. * Fixed parameter.
Now, how does this model account for different aspects of the data? Let us first consider the no-mask condition. In this condition, the inhibition term, I in Equation 7, would be quite small and negligible compared to the additive constant z. For very small targets, we can ignore the largest filter, so that n is about 1. Thus, the target is at threshold if the decision variable d = 1 or E^p/z = 1. Plugging Equation 5′ into this relationship and rearranging the terms, the threshold Ct (setting i = t in Equation 5′) occurs when  
\begin{eqnarray} {C_t} = \left( {{{z^{1/p}}} \big/ {S{e_{ji}}}} \right){\left( {{{\left( {s_\theta ^2 + \sigma _\theta ^2} \right)} \big/ {\left( {s_\theta ^2 \times \sigma _\theta ^2} \right)}}} \right)^{1/2}} \quad \end{eqnarray}
(13)
 
If the target size σθ is small relative to the filter size sθ, Ct is approximated by (z^(1/p)/Seji)σθ^(−1). This expression gives the −1 slope region of the summation function at small target sizes. 
When the target size is sufficiently large, on the other hand, the right-hand side of Equation 13 approaches a constant, so the excitation of a single filter has no further effect on target threshold. However, now that n is greater than 1, the threshold is determined by Equation 11, such that the decision variable increases with the square root of target size. Correspondingly, the threshold decreases with the square root of target size, producing the −1/2 slope region of the summation curves. 
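The two limits of the single-filter threshold can be illustrated numerically. The sketch solves E^p/z = 1 with the excitation of Equation 5′ and estimates the local log-log slope by finite differences; Se = 100 follows the fixed value used in the fits, whereas z, p, the filter size, and the function names are hypothetical choices for illustration.

```python
import math

def threshold_no_mask(target_size, filter_size, se=100.0, z=1.0, p=2.0):
    """Threshold contrast of a single filter (n = 1) with negligible inhibition."""
    size_term = math.sqrt(
        (filter_size ** 2 * target_size ** 2)
        / (filter_size ** 2 + target_size ** 2)
    )
    return z ** (1.0 / p) / (se * size_term)

def local_slope(target_size, filter_size, factor=1.01):
    """Finite-difference slope of log threshold against log target size."""
    c1 = threshold_no_mask(target_size, filter_size)
    c2 = threshold_no_mask(target_size * factor, filter_size)
    return (math.log(c2) - math.log(c1)) / math.log(factor)

slope_small = local_slope(0.01, 1.0)   # target much smaller than the filter
slope_large = local_slope(100.0, 1.0)  # target much larger than the filter
print(round(slope_small, 2), round(slope_large, 2))
```

Under these assumptions the slope is close to −1 for targets much smaller than the filter and close to 0 once the target exceeds the filter, so any further threshold reduction at larger sizes has to come from pooling across filters.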
Notice that the above arguments for the −1 and −1/2 slope components apply not only for the no-mask condition but also for the orthogonal-mask conditions, because orthogonal masks produce only weak, if any, divisive inhibition on the target detector (Foley, 1994). 
A parallel mask with sufficiently high contrast, on the other hand, would produce a strong divisive inhibition signal to the target detector. Thus, now the additive constant z is negligible compared to the divisive inhibition I, and the response Rj of Equation 7 is approximately Ej^p/Ij, or with Equation 8 plugged in,  
\begin{eqnarray}{R_j} = {{E_j^p} \big/ {\left( {\sum\nolimits_i {{{\left( {S{i_{ji}}{C_i}} \right)}^q}} } \right)}} \quad \end{eqnarray}
(14)
 
The range of this inhibitory summation has not been clearly defined in past psychophysical studies. If the number of channels that contribute to the divisive inhibition increases with the square root of target size, then Equation 14 becomes  
\begin{eqnarray}{R_j} = {{E_j^p} \big/ {\left( {{n^{1/2}}{{\left( {S{i_i}{C_i}} \right)}^q}} \right)}} \quad \end{eqnarray}
(15)
 
Also, plugging Equation 15 back to Equation 11, the decision variable becomes  
\begin{eqnarray} d &\;=& n\,( {{( {{E_{t + m}}} )}^p}/( {{n^{1/2}}{{( {S{i_{t + m}}{C_i}} )}^q}} ) \nonumber \\ && -\, {{( {{E_m}} )}^p}/( {{n^{1/2}}{{( {S{i_m}{C_i}} )}^q}} ) )/({n^{1/2}}{\sigma _a}) \nonumber \\ &\;=& ( {{({E_{t + m}})}^p}/( {{{( {S{i_{t + m}}{C_i}} )}^q}} ) \nonumber \\ &&-\, {{( {{E_m}} )}^p}/( {{{( {S{i_m}{C_i}} )}^q}} ) )/{\sigma _a} \quad \end{eqnarray}
(16)
which is independent of target size and thus produces the flat region of the summation curve that replaces the −1/2 slope ideal summation. This size-dependent inhibition also explains why threshold increments at high mask contrasts increase with target size. Thus, what our data imply is that the second-order filter sums up the responses not only across a band of local filters but also across the divisive inhibition pools. Meese (2004) also suggested that the divisive inhibition can increase with the target size. However, because there were only two target sizes in his experiment and the model was relatively simple, the generality of that result may be limited. 
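The cancellation behind this size invariance can be verified with a small sketch that combines Equations 11 and 15. The excitation and inhibition values are arbitrary numbers chosen for illustration, and `pool_exponent` is a hypothetical parameter that switches between a size-dependent inhibition pool (0.5) and a fixed pool (0).

```python
import math

def decision_with_pool(e_tm, e_m, inh_tm, inh_m, n, p,
                       pool_exponent=0.5, sigma_a=1.0):
    """d = n^(1/2) * (R_t+m - R_m) / sigma_a with inhibition scaled by n^pool_exponent."""
    pool = n ** pool_exponent
    r_tm = e_tm ** p / (pool * inh_tm)
    r_m = e_m ** p / (pool * inh_m)
    return math.sqrt(n) * (r_tm - r_m) / sigma_a

for n in (1, 4, 100):
    size_dependent = decision_with_pool(12.0, 10.0, 30.0, 25.0, n, p=2.0)
    fixed_pool = decision_with_pool(12.0, 10.0, 30.0, 25.0, n, p=2.0,
                                    pool_exponent=0.0)
    print(n, size_dependent, fixed_pool)
```

With the inhibition pool growing as n^(1/2), the decision variable is the same for every n, whereas with a fixed pool it grows as n^(1/2) and the −1/2 slope would reappear.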
Another feature of our data is that the TvC functions for the smaller targets, compared with those for the larger targets, were relatively flat: the dipper was less pronounced and the masking effect was smaller. As discussed by many authors (Nachmias & Sansbury, 1974; Stromeyer & Klein, 1974; Legge & Foley, 1980; Foley, 1994; Chen & Tyler, 2001), the dipper is due to the accelerating nonlinearity in the contrast response function. In the context of the current model, when contrast is low, the divisive inhibition (Ij in Equation 7) would be negligible compared to the additive constant (z in Equation 7) in the denominator. Thus, the response function and, in turn, the decision variable would be dominated by the excitation of the linear filters. Taking size into consideration, the decision variable (Equation 11) is proportional to n^(1/2)C^p. The TvC function would show a dipper if p were greater than 1. Also, such excitation increases with size (proportional to n). We thus observed that the size of the dipper increased with target size. On the other hand, at high contrast, as shown in Equation 15, the response function is dominated by the ratio between the excitation and the divisive inhibition, and the decision variable is approximately invariant with target size. That is, the threshold at high pedestal contrast should be similar regardless of target size. Thus, the threshold at the high pedestal contrast for the larger targets increased considerably to match the threshold level of the smaller targets. As a result, with a smaller dipper and a smaller masking effect, the TvC functions for smaller targets were rather flat. 
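How an accelerating and then compressive response function yields a dipper can be illustrated with a single-channel sketch. It makes the simplifying assumption that target and pedestal drive the same channel and add in contrast, which differs from the arc-on-ring stimuli of the experiment, and all parameter values are hypothetical.

```python
def response(c, se=100.0, si=80.0, p=2.4, q=2.0, z=10.0):
    """Single-channel response (se*c)^p / ((si*c)^q + z); monotonic since p > q."""
    return (se * c) ** p / ((si * c) ** q + z)

def increment_threshold(pedestal, criterion=1.0, lo=0.0, hi=10.0, iters=60):
    """Bisection for the contrast increment that raises the response by the criterion."""
    base = response(pedestal)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if response(pedestal + mid) - base < criterion:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t_detect = increment_threshold(0.0)    # no pedestal
t_dip = increment_threshold(0.02)      # low pedestal: facilitation
t_mask = increment_threshold(0.3)      # high pedestal: masking
print(t_detect, t_dip, t_mask)
```

With these values the increment threshold at the low pedestal falls below the detection threshold, and at the high pedestal it rises above it, reproducing the dipper shape in qualitative form only.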
In addition to the inhibitory second-order filter, the size-dependent inhibition may also come from long-range interactions among neighboring neurons. As target length increases, it inevitably covers several different local receptive fields. With single-cell recording, it has been shown that a colinear stimulus outside the classical receptive field of a V1 neuron can facilitate the cell response at low contrast but suppress it at high contrasts (Hubel & Wiesel, 1968; Polat et al., 1998; Gilbert, Ito, Kapadia, & Westheimer, 2000; Chen et al., 2001). Similarly, the response of a neuron to a high-contrast target decreases with target length when it increases beyond the classical receptive field of the target neuron, but such end-stopping behavior is not found for low-contrast targets (Kapadia, Westheimer & Gilbert, 1999). In psychophysics, it has also been shown that the colinear lateral stimulus can hinder target detection at high contrasts but not at low contrasts (Chen & Tyler, 2001; Chen & Tyler, 2008). Such contrast-dependent long-range interactions, which also increase with contrast and target size, are also consistent with the size-dependent inhibition in our model and may well be the underlying mechanisms for the inhibitory second-order filter. 
Other multi-stage models have been proposed to account for spatial summation (Baker & Meese, 2011; Meese & Summers, 2012; Kingdom et al., 2015; Meese & Baker, 2023). The models differ in detail, but they all share the same general structure, with a band of local filters followed by a nonlinear transform and an array of summation mechanisms pooling the nonlinear responses of a number of local filters. In particular, Meese and Summers (2007) also incorporated local contrast gain control into their model to account for the masking effects with various target–mask size combinations. However, because their experiments were designed to hold the possible inhibitory signal from the target constant, to allow an estimation of spatial summation, their model was not designed to address the elimination of ideal summation that we find at high mask contrasts. Furthermore, the numerator and the denominator of their response function were drawn from the same range of local channels. From our discussion of Equation 13, this formulation cannot produce a size-invariant target threshold. 
Conclusions
In this study, we measured how elongated spatial summation behavior changes in the presence of a pattern context on homogeneous peripheral retina. Without such pattern context, the target threshold first decreases with target length with a slope of −1 until reaching a critical value of about 10 arcmin and then further decreases with slope −1/2 on log–log coordinates before turning into a slope of −1/4 or less for the largest target lengths. Similar summation behavior is also found in the orthogonal mask conditions, as has been reported in the literature (e.g., Chen, Yeh & Tyler, 2019). However, the presence of a high-contrast parallel mask eliminates the −1/2 slope behavior from the summation curve. We also find evidence for an extra −1 slope range at large target sizes of 5° or more, suggesting the existence of an extra-long channel that is not revealed by the conventional spatial summation paradigm. As a result, the summation curve has a −1 slope at small sizes, flattens in the middle, and approaches a second −1 slope at large target sizes. These conclusions refer specifically to the eccentricity of 3° and the Gabor carrier frequency of 2.5 c/°, but we expect them to be representative of the rest of the retina when scaled by the usual cortical magnification factor and to generalize to other spatial frequencies. 
Although a −1/2 slope region of a summation curve is normally considered to be a signature of ideal summation, the elimination of this component implied by the data cannot be attributed to a suboptimal strategy taken by the observer in suprathreshold environments. Instead, as suggested by the way that the shape of the TvC functions changes with target size, it is attributable to a size-dependent inhibition that is most effective at high contrasts. Thus, our results can best be explained by a divisive inhibition model in which second-order filters sum responses across local channels, which are modeled by the linear excitation of the local channels raised to a power and scaled by divisive inhibition from all image components. Such divisive inhibition from the high-contrast iso-orientation masks swamps the response and eliminates the target size effect for ideal summation. 
The single long-range channel revealed by parallel masks extends over much of one hemifield in each case (left or right), as the HHFW is more than 90°. This can be seen as psychophysical validation in humans of the extended spatial receptive fields described by DeAngelis, Freeman, and Ohzawa (1994) and by Gilbert and Wiesel (1989) in cats, which extend over as much as 20° of visual angle. These kinds of neural receptive fields would obviously be very useful in completing faint or broken contours in natural images, even when of opposite sign along the contour (Marr, 1982). 
Acknowledgments
Supported by a grant from the Taiwan Ministry of Science and Technology (108-2410-H-002-105-MY2 to C-CC). 
Commercial relationships: none. 
Corresponding author: Chien-Chung Chen. 
Email: c3chen@ntu.edu.tw. 
Address: Department of Psychology, National Taiwan University, Taipei, Taiwan. 
Footnotes
1  Note that the summation process per se may be considered ideal even if the detection threshold for each filter is inefficient due to internal noise or incomplete sampling of the stimulus, for example.
References
Baker, D. H., & Meese, T. S. (2011). Contrast integration over area is extensive: A three-stage model of spatial summation. Journal of Vision, 11(14):14, https://doi.org/10.1167/11.14.14.
Baldwin, A. S., & Meese, T. S. (2015). Fourth-root summation of contrast over area: No end in sight when spatially inhomogeneous sensitivity is compensated by a witch's hat. Journal of Vision, 15(15):4, 1–12, https://doi.org/10.1167/15.15.4. [CrossRef] [PubMed]
Barlow, H. B. (1958). Temporal and spatial summation in human vision at different background intensities. Journal of Physiology, 141(2), 337–350. [CrossRef]
Baumgardt, E. (1959). Visual spatial and temporal summation. Nature, 184, 1951–1952. [CrossRef]
Chen, C.-C., & Tyler, C. W. (1999). Accurate approximation to the extreme order statistics of Gaussian samples. Communication in Statistics: Simulation and Computation, 28, 177–188. [CrossRef]
Chen, C. C., & Tyler, C. W. (1999). Spatial pattern summation is phase-insensitive in the fovea but not in the periphery. Spatial Vision, 12, 267–285. [PubMed]
Chen, C. C., Kasamatsu, T., Polat, U., & Norcia, A. M. (2001). Contrast response characteristics of long-range lateral interactions in cat striate cortex. Neuroreport, 12(4), 655–661, https://doi.org/10.1097/00001756-200103260-00008. [CrossRef] [PubMed]
Chen, C. C., & Tyler, C. W. (2006). Evidence for elongated receptive field structure for mechanisms subserving stereopsis. Vision Research, 46(17), 2691–2702, https://doi.org/10.1016/j.visres.2006.02.009. [CrossRef] [PubMed]
Chen, C. C., & Tyler, C. W. (2008). Excitatory and inhibitory interaction fields of flankers revealed by contrast-masking functions. Journal of Vision, 8(4):10, 1–14. [CrossRef] [PubMed]
Chen, C. C., Yeh, Y. H., & Tyler, C. W. (2019). Length summation in noise. Journal of Vision, 19(9):11, 1–13, https://doi.org/10.1167/19.9.11. [CrossRef]
Chen, C. C., & Tyler, C. W. (2001). Lateral sensitivity modulation explains the flanker effect in contrast discrimination. The Proceedings of the Royal Society (London) Series B, 268, 509–516. [CrossRef]
Chou, Y. L., Yeh, S. L., & Chen, C. C. (2014). Distinct mechanisms subserve location- and object-based visual attention. Frontiers in Psychology, 5, 456. [CrossRef] [PubMed]
DeAngelis, G. C., Freeman, R. D., & Ohzawa, I. (1994). Length and width tuning of neurons in the cat's primary visual cortex. Journal of Neurophysiology, 71(1), 347–374. [CrossRef] [PubMed]
Fechner, G. T. (1860). Elemente der Psychophysik. Leipzig: Breitkopf und Härtel.
Foley, J. M. (1994). Human luminance pattern-vision mechanisms: Masking experiments require a new model. Journal of the Optical Society of America A, 11, 1710–1719. [CrossRef]
Foley, J. M., & Chen, C. C. (1999). Pattern detection in the presence of maskers that differ in spatial phase and temporal offset: Threshold measurements and a model. Vision Research, 39, 3855–3872. [CrossRef] [PubMed]
Gilbert, C., Ito, M., Kapadia, M., & Westheimer, G. (2000). Interactions between attention, context and learning in primary visual cortex. Vision Research, 40(10–12), 1217–1226. [PubMed]
Gilbert, C. D., & Wiesel, T. N. (1989). Columnar specificity of intrinsic horizontal and corticocortical connections in cat visual cortex. Journal of Neuroscience, 9(7), 2432–2442. [CrossRef]
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. The Journal of Physiology, 195(1), 215–243. [CrossRef] [PubMed]
Kao, C. H., & Chen, C. C. (2012). Seeing visual word forms: Special summation, eccentricity and spatial configuration. Vision Research, 62, 57–65. [CrossRef] [PubMed]
Kapadia, M. K., Westheimer, G., & Gilbert, C. D. (1999). Dynamics of spatial summation in primary visual cortex of alert monkeys. Proceedings of the National Academy of Sciences, USA, 96(21), 12073–12078. [CrossRef]
Kingdom, F. A. A., Baldwin, A. S., & Schmidtmann, G. (2015). Modeling probability and additive summation for detection across multiple mechanisms under the assumptions of signal detection theory. Journal of Vision, 15(5):1, 1–16, https://doi.org/10.1167/15.5.1. [CrossRef] [PubMed]
Kontsevich, L. L., & Tyler, C. W. (1999). Bayesian adaptive estimation of psychometric slope and threshold. Vision Research, 39, 2729–2737. [CrossRef] [PubMed]
Legge, G. E., & Foley, J. M. (1980). Contrast masking in human vision. Journal of the Optical Society of America, 70, 1458–1470. [CrossRef] [PubMed]
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco, CA: W.H. Freeman and Company.
Meese, T. S. (2004). Area summation and masking. Journal of Vision, 4, 930–943, https://doi.org/10.1167/4.10.8. [PubMed]
Meese, T. S., & Baker, D. H. (2023). Object image size is a fundamental coding dimension in human vision: New insights and model. Neuroscience, 514, 79–91. [CrossRef] [PubMed]
Meese, T. S., & Summers, R. J. (2007). Area summation in human vision at and above detection threshold. Proceedings of the Royal Society B: Biological Sciences, 274(1627), 2891–2900. [CrossRef]
Meese, T. S., & Summers, R. J. (2012). Theory and data for area summation of contrast with and without uncertainty: Evidence for a noisy energy model. Journal of Vision, 12(11):9, 1–28, https://doi.org/10.1167/12.11.9. [CrossRef]
Morgan, M. J., & Moulden, B. (1986). The Münsterberg figure and twisted cords. Vision Research, 26(11), 1793–1800. [CrossRef] [PubMed]
Nachmias, J., & Sansbury, R. V. (1974). Grating contrast: Discrimination may be better than detection. Vision Research, 14, 1039–1042. [CrossRef] [PubMed]
Pelli, D. G. (1985). Uncertainty explains many aspects of visual contrast detection and discrimination. Journal of the Optical Society of America A, 2(9), 1508–1532. [CrossRef]
Polat, U., & Tyler, C. W. (1999). What pattern the eye sees best. Vision Research, 39(5), 887–895. [CrossRef] [PubMed]
Polat, U., Mizobe, K., Pettet, M. W., Kasamatsu, T., & Norcia, A. M. (1998). Collinear stimuli regulate visual responses depending on cell's contrast threshold. Nature, 391(6667), 580–584, https://doi.org/10.1038/35372. [CrossRef] [PubMed]
Press, W. H., Flannery, B. P., Teukolsky, S. A., & Vetterling, W. T. (1988). Numerical recipes in C. Cambridge, UK: Cambridge University Press.
Robson, J. G., & Graham, N. (1981). Probability summation and regional variation in contrast sensitivity across the visual-field. Vision Research, 21, 409–418. [CrossRef] [PubMed]
Stiles, W. S. (1959). Color vision: The approach through increment-threshold sensitivity. Proceedings of the National Academy of Sciences, USA, 45(1), 100–114. [CrossRef]
Stromeyer, C. F., & Klein, S. (1974). Spatial frequency channels in human vision as asymmetric (edge) mechanisms. Vision Research, 14(12), 1409–1420. [CrossRef] [PubMed]
Tanner, W. P., Jr., & Jones, R. L. (1960). The ideal sensor system as approached through statistical decision theory and the theory of signal detectability. In: Morris, A. & Horne, E. P. (Eds.), Visual search techniques (NAS-NRC Publication No. 712, pp. 59–68). Washington, DC: National Academy of Science.
Teo, P. C., & Heeger, D. J. (1994). Perceptual image distortion. SPIE Proceedings, 2179, 127–141.
Tyler, C. W., & Chen, C. C. (2000). Signal detection theory in the 2AFC paradigm: Attention, channel uncertainty and probability summation. Vision Research, 40, 3121–3144. [CrossRef] [PubMed]
Tyler, C. W., & Chen, C. C. (2006). Spatial summation of face information. Journal of Vision, 6, 1117–1125, https://doi.org/10.1167/6.10.11. [CrossRef] [PubMed]
Watson, A. B., Barlow, H. B., & Robson, J. G. (1983). What does the eye see best? Nature, 302(5907), 419–422. [CrossRef] [PubMed]
Wiener, N. (1949). Extrapolation, interpolation and smoothing of stationary time series, with engineering applications. Cambridge, MA: MIT Press.
Figure 1.
 
(A) Diagram of the apparatus setup showing the relative position of the components. The distances are not to scale. (B) The targets were Gabor arcs, centered at the fixation point, f, of different lengths defined in polar coordinates with radius r and angle θ. For ease of comparison with the literature, we report the target size in half-height full width (HHFW). The target was embedded in either a parallel (C) or an orthogonal (E) circular mask, as shown in D and F, respectively.
Figure 2.
 
Summation functions, or target threshold versus target length functions, for targets on parallel masks. Each row represents the data from one observer. To avoid clutter, the panels in the left column show the summation curves with no mask, those in the central columns with low- to medium-contrast masks (−26, −22, and −18 dB), and those in the right column with high-contrast masks (−14 and −10 dB). Error bars represent 1 SEM. Smooth colored curves represent the fits of the multi-stage model described in the Discussion section. For ready comparison, dashed lines represent −1 and dotted lines represent −1/2 slope summation.
Figure 3.
 
Summation functions for targets on orthogonal masks. Each row represents the data from one of the two observers. Again, to avoid clutter, the panels in the left column show the summation function with no mask, those in the central column with low- to medium-contrast masks (−26, −22, and −18 dB), and those in the right column with high-contrast masks (−14 and −10 dB). Error bars represent 1 SEM. Smooth colored curves represent the fits of the multi-stage model described in the text. For ready comparison, the dashed and dotted lines represent summation slopes of −1 and −1/2, respectively.
Figure 4.
 
Example fits of three generic models to thresholds as a function of target size for the no-pedestal (closed circles) and −10-dB contrast pedestal (open squares) conditions. (A) The three-component model (solid curve) of Tyler and Chen (2006). (B) Extensions of the model with a size-invariant component (Equation 3, dotted curve) and an extra −1 slope component (Equation 4, dashed curve) for the high pedestal contrast data. The three-component model is replotted here for reference. The data points are from observer YSC, shown in Figure 2.
Figure 5.
 
The data plotted as target threshold versus mask contrast (TvC) functions for parallel masks in three observers and at various target sizes (see legend). Error bars represent 1 SEM. Smooth curves are the fit of the multi-stage model described in the text.
Figure 6.
 
TvC functions for orthogonal masks in two observers and at various target sizes (see legend). Error bars represent 1 SEM. Smooth curves are the fit of the multi-stage model described in the text.
Figure 7.
 
Diagram of the gain control model. See text for details.
Table 1.
 
Best-fit parameter values and goodness-of-fit for each participant. See text for the meaning of the parameters. * Fixed parameter.