October 2017
Volume 17, Issue 12
Open Access
Article  |   November 2017
Modeling grating contrast discrimination dippers: The role of surround suppression
Journal of Vision November 2017, Vol.17, 23. doi:https://doi.org/10.1167/17.12.23
      Michelle P. S. To, Mazviita Chirimuuta, David J. Tolhurst; Modeling grating contrast discrimination dippers: The role of surround suppression. Journal of Vision 2017;17(12):23. https://doi.org/10.1167/17.12.23.


      © ARVO (1962-2015); The Authors (2016-present)

Abstract

We consider the role of nonlinear inhibition in physiologically realistic multineuronal models of V1 to predict the dipper functions from contrast discrimination experiments with sinusoidal gratings of different geometries. The dip in dipper functions has been attributed to an expansive transducer function, which itself is attributed to two nonlinear inhibitory mechanisms: contrast normalization and surround suppression. We ran five contrast discrimination experiments, with targets and masks of different sizes and configurations: small Gabor target/small mask, small target/large mask, large target/large mask, small target/in-phase annular mask, and small target/out-of-phase annular mask. Our V1 modeling shows that the results for the small Gabor target/small mask, small target/large mask, and large target/large mask configurations are easily explained only if the model includes surround suppression. This is compatible with the finding that an in-phase annular mask generates only a little threshold elevation, whereas the out-of-phase mask was more effective. Surrounding mask gratings cannot be equated with surround suppression at the receptive-field level. We examine whether normalization and surround suppression occur simultaneously (parallel model) or sequentially (a better reflection of neurophysiology). The Akaike Criterion Difference showed that the sequential model was better than the parallel, but the difference was small. The large target/large mask dipper experiment was not well fit by our models, and we suggest that this may reflect selective attention for its uniquely larger test stimulus. The best-fit model replicates some behaviors of single V1 neurons, such as the decrease in receptive-field size with increasing contrast.

Introduction
One approach to building a computational model of how human observers can detect or discriminate between natural or laboratory stimuli requires a good understanding of the physiological mechanisms that underlie visual processing (e.g., Rohaly, Ahumada, & Watson, 1997; To, Lovell, Troscianko, & Tolhurst, 2010; Watson & Solomon, 1997). Contrast discrimination has been extensively studied since the early masking experiments by Campbell and Kulikowski (1966) and the detailed evaluation of dipper functions by Legge and Foley (1980), and it has been widely used to test computational models (Foley, 1994; Tolhurst et al., 2010; Watson & Solomon, 1997). While a straightforward model, such as Weber's law, may describe various types of sensory discrimination, it cannot explain the nonlinear dip that is consistently present in the results of contrast discrimination experiments. The source of the dip has been attributed to a sigmoidal transducer function caused by contrast normalization (a.k.a., contrast gain control) in the underlying physiological processing (Carandini, Heeger, & Movshon, 1997; Heeger, 1992), and its inclusion has been shown to improve models of psychophysical performance (Foley, 1994; Rohaly et al., 1997; Watson & Solomon, 1997). 
Now, although contrast normalization improves the fit of models of contrast discrimination, it only does so to a limited degree. The shape of the dipper function is not invariant and it changes depending on the geometrical configuration of the stimuli presented (Legge & Foley, 1980; Meese, 2004). This suggests that contrast normalization may not be the only nonlinear factor causing the dip. Meese (2004) ran a series of contrast discrimination experiments comparing sinusoidal patches of different sizes (small test vs. large mask [SL], small vs. small [SS], large vs. large [LL]) and concluded that the differently shaped dippers could be explained by the existence of surround suppression, a phenomenon long known in neurophysiology (Blakemore & Tobin, 1972). V1 neurons have a “classical” receptive field where simple light stimuli of appropriate orientation cause changes in activity, surrounded by a region or regions where stimuli suppress the activity generated by the stimuli falling in the classical field (see Figure 1A). In addition to these two areas, there is an ambiguous overlap zone (Cavanaugh, Bair, & Movshon, 2002; Sceniak, Ringach, Hawken, & Shapley, 1999). 
Figure 1
 
Panel (A) schematically presents the structure of a V1 receptive field. The central classical field (green) is a direct summation field that increases neuronal activity when appropriately stimulated, while the surrounding area (red) is a suppressive zone. Stimuli presented here do not affect neuronal firing when presented alone, but they do suppress the responses elicited by simultaneous activation of the classical center. The so-called “surround” probably overlaps with the classical center so that there is an ambiguous overlap zone, which may preclude exact measurement of the dimensions of the classical field. Panel (B) shows a central target patch and a surround mask, both of which are stimuli that are commonly used in psychophysical contrast discrimination experiments. Panel (C) demonstrates how the target and mask are projected onto various small neuronal receptive fields a, b, c, d, and e.
Although Meese's (2004) model described his results well, it is a simple arithmetic model with simple scalars describing center contrast and surround contrast, and it therefore lacks the physiological features (e.g., receptive fields, orientation specificity, etc.) present in more complete and realistic V1 models, such as Watson and Solomon's (1997). The latter computes how individual neurons with small receptive fields and different stimulus preferences respond at different points within a stimulus; this is particularly important when the contrast within a stimulus is not spatially uniform throughout (Meese's SS and SL conditions) so that a single number representing contrast may be misleading. The present article will investigate whether adding a surround suppression algorithm at the single neuron level to a V1 model is needed to explain Meese's (2004) psychophysical findings, which we replicate. 
We need to distinguish the psychophysical design where a surrounding grating might mask a central test patch (Meese, 2004; Petrov, Carandini, & McKee, 2005) from the neuron-by-neuron situation where a single neuron's response is suppressed by stimuli immediately around its receptive field (Blakemore & Tobin, 1972). In psychophysical contrast discrimination experiments, a central patch of test grating might be surrounded by an annular pattern of masking grating. It is tempting to equate the central and surround gratings to the classical center and surround of individual V1 neurons. However, the central grating patch is unlikely to be precisely the same size as one V1 receptive field (Figure 1B and C). The central target patch and surround mask (e.g., Figure 1B) project onto many individual receptive fields of various neurons (e.g., a–e in Figure 1C), each of which is affected to a different degree by the central-test and surround-mask gratings. Neuron a does have its classical center in the scope of the central test grating and part of its surround in the scope of the surround gratings. But, for instance, neuron b (in the center of the test patch) will be unaffected by the presence or absence of the surround mask. Other neurons (c–e) have their fields disposed in quite different ways from a matching of center to center and surround to surround. Field c is worth noting. Its classical center straddles the border between test and mask gratings so that the mask might have two opposing effects: While the mask might cause nonlinear suppression through the field's surround, it might also increase neural response by activating part of the neuron's classical center. 
There is a suggestion that surround suppression is only present outside the fovea (Petrov et al., 2005; Xing & Heeger, 2000); foveal test patches seem not to be masked by surrounding annular masks. This seems to be at variance with Meese's (2004) clear suggestion that surround suppression is needed to explain his foveal results. No neurophysiological study has reported that surround suppression is found only in peripheral visual field. Using our own dipper experiments, we will develop our model of a large population of V1 neurons (To et al., 2010; To, Gilchrist, & Tolhurst, 2015; To, Gilchrist, Troscianko, & Tolhurst, 2011) to ask whether the apparently conflicting views of Meese (2004) and of Petrov et al. (2005) could, in fact, be compatible. As Figure 1C shows, the many V1 neurons involved in the detection of a central test grating patch are likely to be affected in different, conflicting ways by the surrounding mask. Only a detailed V1 model of many neurons can determine the net result. 
When implementing surround suppression in a more realistic V1 model, one needs to consider when this inhibition takes place in relation to normalization. Although our previous models (To et al., 2010; Tolhurst et al., 2010) have assumed that normalization and surround inhibition occur simultaneously (at a single Naka-Rushton step, Equation 3 in Methods), there is strong physiological (DeAngelis, Freeman, & Ohzawa, 1994; DeAngelis, Robson, Ohzawa, & Freeman, 1992; Durand, Freeman, & Carandini, 2007; Henry et al., 2013; Li & Freeman, 2011; Li, Thompson, Duong, Peterson, & Freeman, 2006) and psychophysical evidence (Baker, Meese, & Summers, 2007; Petrov et al., 2005; Schallmo & Murray, 2013) to suggest that these two kinds of suppression are sequential processes. Some evidence suggests that normalization occurs at the level of the lateral geniculate nucleus (LGN), whereas surround suppression occurs in V1 (Li et al., 2006). The different effects of binocular interaction, the different susceptibility to adaptation, the difference in temporal tuning, and explicit measures of onset latency all point to normalization occurring prior to surround suppression. In this article, therefore, we model the two processes as two sequential steps (Equations 4 and 5 in Methods). We will report both the (old) parallel and (new) sequential models to see which provides better predictions for our psychophysical dipper data. 
In addition to investigating surround suppression, we will also study the shape of the Gabor receptive fields that model V1 simple cells. In our past models, we have assumed that receptive fields are elongated by a factor of 1.5 along the long axis of the sinusoid carrier stripes, based on one neurophysiological study in cat visual cortex (Tolhurst & Thompson, 1981). Foley, Varadharajan, Koh, and Farias (2007) found that slightly elongated fields proved a better fit to their psychophysical results than did the usual Gabor model with a circularly symmetric Gaussian envelope, reporting elongation factors more like 1.2 than our 1.5. For this and other reasons given in the Results, we have decided to set receptive field aspect ratio as an extra free parameter in the present models. 
Finally, we examine the behavior of single “neurons” within the full multineuronal model. Fitting any model to psychophysical data will result in the generation of values for several possibly obscure parameters whose individual effects are difficult to identify and discern. Examining single “neuron” behavior in our final model allows us to verify whether the final combination of those obscure parameter values does give rise to behavior similar to that of the real V1 neurons (e.g., Cavanaugh et al., 2002; Sceniak et al., 1999) on which the starting point of our modeling is based. 
Methods
Experimental methods
All stimuli were presented on a SONY 19-in. color monitor driven by a VSG 2/4 graphics card (Cambridge Research Systems, Rochester, UK) at a resolution of 800 pixels wide by 600 pixels high. Observers sat in a dimly lit room at a distance of 114 cm from the screen, which was 18.50° (37 cm) wide by 14° (28 cm) high. Viewing was binocular and observers were free to move their eyes. The screen had a mean luminance of 44 cd/m², sufficiently bright to be in the photopic range. 
Stimuli
Target and mask stimuli were combined in a series of five contrast discrimination experiments. Some examples of the stimulus elements are given in Figure 2. All targets and masks were based on vertical sinusoidal gratings of 2.67 c/°, with 16 pixels for each cycle of the sinewave. The targets and masks were of different sizes and configurations and could be combined in the following ways: 
  • Small target and small mask (SS). All the “S” targets had a circularly symmetric Gaussian weight with standard deviation of 16 pixels (0.77°) as in Figure 2A.
  • Large target and large mask (LL). See Figure 2B.
  • Small target, as in SS, in the center of a large mask (SL). See Figure 2C. The large grating subtended a square of 6° in the center of the display, but with its edges blended into the midgray of the surrounding display.
  • Small target with an in-phase annular mask grating. The mask was a large square-shaped grating but with a Gaussian-weighted circular “hole” in the middle (see Figure 2D); this Gaussian had a standard deviation of 32 pixels, twice that of the small Gabor target (Figure 2E). This mask had the carrier sinusoid spatially in phase with the carrier of the small target.
  • Small target with an out-of-phase annular mask. As for the in-phase annular mask, but with the mask and target sinusoids spatially 180° out of phase (Figure 2F).
Figure 2
 
Examples of stimulus templates used in the experiments. (A) the small S Gabor target or mask; (B) the large L grating target or mask; (C) an SL example with the L mask at 25% contrast and the added S test at 50%; (D) the in-phase annular mask; (E) the in-phase annular mask at 25% contrast with the added S test at 50%; (F) as in (E) but the annular mask is 180° out of phase.
The Michelson contrast of a grating is defined conventionally as: c = (Lmax − Lmin)/(Lmax + Lmin), where Lmax and Lmin are the luminances of the brightest and darkest pixels in a sinusoidal grating. In this paper, we express contrast as “dB attenuation from the maximum contrast.” 
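As a minimal sketch of these definitions, the Python below computes Michelson contrast and converts it to dB attenuation. The function names and the amplitude convention 20·log10(c_max/c) are assumptions made for illustration; the text does not state which dB convention was used.

```python
import math

def michelson_contrast(l_max, l_min):
    """Michelson contrast from the brightest and darkest luminances."""
    return (l_max - l_min) / (l_max + l_min)

def db_attenuation(contrast, max_contrast=1.0):
    """Attenuation in dB relative to max_contrast.

    Assumes the amplitude convention 20*log10(max/c); this is an
    assumption, not a convention stated in the text.
    """
    return 20.0 * math.log10(max_contrast / contrast)

# Hypothetical grating: 66 and 22 cd/m^2 around a 44 cd/m^2 mean.
c = michelson_contrast(66.0, 22.0)
print(c)                  # 0.5
print(db_attenuation(c))  # roughly 6 dB below maximum contrast
```

Under this convention, larger dB values mean lower contrast, so a 10 dB step between mask levels corresponds to a contrast ratio of about 3.16.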
The observers were asked to discriminate a mask stimulus from a composite stimulus of mask + target—that is, they had to detect the target in the presence of a mask. In order to generate the mask + target stimuli, the separate mask and target stimuli were presented on alternate frames of the display at a frame rate of 120 Hz, and since any frame rate over 60 Hz would generate perceptual fusion, observers perceived the mask and target simultaneously and superposed. Using alternate frames allowed us to precisely define the contrast of the low-contrast test stimulus even when it was present with a high-contrast mask; the contrasts of mask and test were set with separate lookup tables for each, giving each component of the stimulus the maximum gray-level resolution (256 levels). In the case where the mask was presented alone, it was presented alternately with a blank screen at mean luminance. As a result, the actual contrast of the mask and the contrast threshold were halved. 
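The halving can be seen from a short worked equation, given here as a sketch under the assumption that fused alternate frames are perceived as their time-averaged luminance. The symbols are introduced for illustration only: \(L_0\) is the mean luminance, \(c_m\) and \(c_t\) are the nominal mask and target contrasts, and \(g_m\) and \(g_t\) are their unit-amplitude spatial profiles.

\begin{equation*}\bar{L}(x,y) = \tfrac{1}{2} L_0 \left[ 1 + c_m\, g_m(x,y) \right] + \tfrac{1}{2} L_0 \left[ 1 + c_t\, g_t(x,y) \right] = L_0 \left[ 1 + \tfrac{c_m}{2}\, g_m(x,y) + \tfrac{c_t}{2}\, g_t(x,y) \right]\end{equation*}

When the mask alternates with a blank frame, the target term vanishes and the same averaging leaves an effective mask contrast of \(c_m/2\).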
Observers
The observers were JW and MC. JW, a male student in his early 20s with normal vision, was naive to the purpose of the experiment. MC is one of the authors. The research was carried out in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki), and we obtained informed consent from both observers. 
Procedure
The protocol for the discrimination experiments is similar to that described in Chirimuuta and Tolhurst (2005). The experiments used a modified two alternative forced-choice (2AFC) procedure. Unlike the traditional 2AFC, in our experiments, observers were presented with three temporal intervals (Tolhurst & Barfield, 1978). The mask was presented in all three intervals (each lasting 100 ms). The target (along with the mask) was presented either in the first or third interval. The second interval only ever contained the mask (and this was known by the observers) to remind the observer of the appearance of the mask alone. This was not strictly necessary in the experiments reported here since it was clear that the target was present in the interval with higher overall or local grating contrast. We retained this protocol, however, so that we might be able to relate the present results to other experiments (especially involving natural image stimuli) where the observer can clearly tell that the two stimulus intervals looked different but cannot reliably say which is “correct” (e.g., Lauritzen, Pelah, & Tolhurst, 1999; To, Chirimuuta, Turnham, & Tolhurst, 2012; Tolhurst et al., 2010). There was no fixation point, and observers were free to look at any part of the stimulus. Observers were asked to enter via keyboard press whether they thought that the target was shown in the first or third interval. 
Within any one experimental session, seven staircases were randomly interleaved, representing seven masking contrasts (at 10 dB intervals) for one of the five test/mask combinations listed above. In another session, the observer would set thresholds for seven other masking contrasts (5 dB shifted from the first session) for the same test/mask condition. The observer always knew that the test and mask would have fixed geometry during a session. The staircases adjusted the contrasts of the targets to bring them near to the discrimination thresholds (75% correct in 2AFC). After 100 trials of each of the seven staircases, the threshold for each stimulus was determined by maximum likelihood estimation (MLE): a cumulative normal was fitted to the overall psychometric function given by the number of correct responses versus the number of presentations of each stimulus contrast. Each observer did each experiment twice, and we averaged the four resulting thresholds for each condition for the model fitting. Typically, estimated standard errors of the threshold estimates were 0.5–1.5 dB. 
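The threshold estimation step can be illustrated with a small sketch. It assumes a 2AFC psychometric function with a 50% guessing floor, so that the mean of the cumulative normal is the 75%-correct point, and it uses a plain grid search in place of whatever optimizer the authors used. The function names, the grid, and the data are hypothetical.

```python
import math

def p_correct(x, mu, sigma):
    """2AFC psychometric function: 50% guessing floor plus a cumulative normal."""
    phi = 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return 0.5 + 0.5 * phi

def log_likelihood(data, mu, sigma):
    """Binomial log-likelihood of (level, n_correct, n_trials) triples."""
    total = 0.0
    for x, k, n in data:
        # Clamp to avoid log(0) at extreme levels.
        p = min(max(p_correct(x, mu, sigma), 1e-9), 1.0 - 1e-9)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def fit_threshold(data, mus, sigmas):
    """Grid-search MLE; returns the (mu, sigma) pair with the highest likelihood."""
    return max(((m, s) for m in mus for s in sigmas),
               key=lambda ms: log_likelihood(data, ms[0], ms[1]))

# Hypothetical counts: target level (arbitrary dB scale, larger = easier),
# number correct, number of trials.
data = [(4, 51, 100), (7, 58, 100), (10, 75, 100), (13, 92, 100), (16, 99, 100)]
mus = [8.0 + 0.1 * i for i in range(41)]
sigmas = [1.0 + 0.25 * i for i in range(21)]
mu_hat, sigma_hat = fit_threshold(data, mus, sigmas)
print(mu_hat, sigma_hat)  # mu_hat is the estimated 75%-correct threshold
```

Pooling all trials at each level before fitting, as here, matches the description of fitting the overall psychometric function rather than reading the threshold off the staircase itself.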
The visual difference predictor multineuronal model
Our model is similar to the one described in detail by To et al. (2010) and Tolhurst et al. (2010), where the psychophysical and neurophysiological justification is given for the various steps. It follows the idea of Watson and Solomon (1997). The visual difference predictor (VDP) calculates how each of many V1 neurons responds to two different stimuli; it then computes, for each neuron, the difference in response to the two stimuli, and pools those many differences by Minkowski summation to give a single number, which is postulated to be a measure of how different those stimuli would seem to an observer (see Discussion). 
Linear stage and contrast
The first stage is to convolve the luminance profile of each monochrome stimulus with 60 Gabor-function receptive fields (“simple cells”) with about one octave spatial-frequency bandwidth. The stimulus and each field are represented as 256 × 256 pixels. At each of six orientations (vertical and subsequent 30° steps), there are five spatial frequencies (0.67, 1.33, 2.67, 5.33, 10.66 c/°), each with even- and odd-symmetric fields. These orientations and frequencies are centered on the prime orientation and frequency of all the stimuli in this study. In reality, because all our stimuli are based on vertical sinusoids of 2.67 c/°, detection will be determined by the few channels close to this orientation and spatial frequency. In our previous model, the Gabor receptive fields were elongated by a factor of 1.5 along the long axis of the sinusoid carrier stripes, based on the findings of the electrophysiological study by Tolhurst and Thompson (1981); an example field is shown in Figure 3A. However, in the present study, we allowed the aspect ratio of the fields to be an extra free parameter in the fits, searching for the best ratio to describe our results. This changes the sharpness of the orientation tuning of the filters without much affecting their spatial frequency tuning (about one octave full width at half height). The linear response of each neuron is divided by the local luminance to give a contrast response. This is then weighted by a typical observer's sensitivity to sinewave gratings of that neuron's optimal orientation and frequency. In the first versions of the model, we assumed that the observers fixated on the center of the stimulus and so the responses were also weighted by retinal eccentricity; from the center of the stimulus, sensitivity falls off by a factor of 10 for every 40 cycles of the neuron's optimal spatial frequency (Robson & Graham, 1981). 
However, we removed this operation in the final version of the model when we recognized that, in the experiment with large targets, observers' attention (and gaze) may have spread over a larger area, so that eccentricity weightings might be inappropriate. This is discussed further in the Results and Discussion sections. 
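To make the receptive-field geometry concrete, the following sketch evaluates a single vertical-preferring Gabor field with an adjustable aspect ratio. It is an illustration rather than the authors' code: the function name, the envelope width of 0.56 carrier periods (chosen to give roughly one octave of bandwidth), and the unit peak amplitude are all assumptions.

```python
import math

# Model geometry from the text: a 256-pixel image spans 6 degrees.
PIXELS_PER_DEG = 256 / 6.0

def gabor_field(x_deg, y_deg, freq_cpd=2.67, aspect=1.5, odd=False,
                sigma_cycles=0.56):
    """Value of a vertical-preferring Gabor field at (x_deg, y_deg).

    The carrier varies along x, so its stripes run along y, and the
    Gaussian envelope is stretched by `aspect` along the stripes.
    sigma_cycles = 0.56 is an assumed envelope width (in carrier periods)
    that gives roughly one octave of spatial-frequency bandwidth.
    """
    sigma_x = sigma_cycles / freq_cpd
    sigma_y = aspect * sigma_x
    envelope = math.exp(-(x_deg ** 2) / (2 * sigma_x ** 2)
                        - (y_deg ** 2) / (2 * sigma_y ** 2))
    phase = 2 * math.pi * freq_cpd * x_deg
    carrier = math.sin(phase) if odd else math.cos(phase)
    return envelope * carrier

# 16 pixels per cycle at this pixel scale gives about 2.67 c/deg.
print(PIXELS_PER_DEG / 16)
# Half a degree along the stripes, an elongated field retains more weight
# than a circular one.
print(gabor_field(0.0, 0.5, aspect=1.5) > gabor_field(0.0, 0.5, aspect=1.0))
```

Changing `aspect` alters only the envelope along the stripes, which is why it sharpens or broadens orientation tuning while leaving the spatial-frequency tuning largely unchanged.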
Figure 3
 
(A) Gray-level representation of the form of a Gabor modeled cosine receptive field optimally responsive to vertical gratings of 2.67 c/°. The field is elongated by a factor of 1.5 along the direction of the sinusoidal stripes. The field is shown as a 256 × 256 pixel “image,” representing 6° square in the model. (B) At the same spatial scale, a gray-level representation of a surround suppressive annulus (Equation 2) for a neuron like (A) with optimal spatial frequency 2.67 c/°; the radius (radf) of the annulus in this example is 1 period (i.e., 1/2.67°) similar to the best-fitting versions of the models.
Nonspecific suppression, or contrast normalization
At each location (x, y) in the stimulus, we calculate a nonspecific suppressing signal (Heeger, 1992) by summing the quasilinear contrast responses of the 60 fields exactly centered at that point (across frequency f, orientation o, and symmetry s), each raised to a power q:  
\begin{equation}\tag{1}N_{x,y} = \sum_{f=1}^{5} \sum_{o=1}^{6} \sum_{s=1}^{2} \left| c_{x,y,f,o,s} \right|^{q}\end{equation}
 
This one nonspecific signal will suppress the responses of all 60 fields at the location equally. 
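A minimal sketch of Equation 1 at a single location follows. The nested-list layout, the function name, and the example values (including q = 2) are hypothetical and chosen only to mirror the triple sum.

```python
def normalization_signal(contrast_responses, q):
    """Equation 1 at one location: sum |c|^q over frequency, orientation, symmetry.

    contrast_responses[f][o][s] holds the quasilinear contrast response of
    one field; the nesting mirrors the triple sum.
    """
    return sum(abs(c) ** q
               for per_freq in contrast_responses
               for per_orient in per_freq
               for c in per_orient)

# Hypothetical responses: 5 frequencies x 6 orientations x 2 symmetries,
# alternating in sign to show that only magnitudes contribute.
responses = [[[0.5, -0.5] for _ in range(6)] for _ in range(5)]
print(normalization_signal(responses, q=2.0))  # 60 * 0.25 = 15.0
```

Because the result carries no frequency, orientation, or symmetry index, the same value divides the response of every field centered at that point.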
Orientation-specific surround suppression
We model surround suppression as if it is strictly specific for orientation (Blakemore & Tobin, 1972; Petrov et al., 2005) and spatial frequency although, in the present modeling of relatively narrow-band stimuli centered on just one sinusoid, the different tuning of contrast normalization and surround suppression is barely important. For a receptive field centered at a given (x, y) location, the surround strength is modeled as a radially symmetric annulus centered on that point (see an example in Figure 3B):  
\begin{equation}\tag{2}\mathit{surround\;strength}_{f} = d \cdot e^{-d^{2}/(2 \cdot rad_{f}^{2})}\end{equation}
where d is distance from the center of the field, and \(rad_f\) is the radius of the annulus, which is set to be directly proportional to the period of the center spatial frequency of the receptive field. The function was then normalized to have unit volume (i.e., the volume under the three-dimensional surface was one). For each of the 30 orientation-frequency bands, we first take the root mean square (RMS) of the odd- and even-symmetric field responses (see below for two versions of the model) at each point across the whole stimulus. Then, these RMS values are raised to a power r, before being convolved with the annulus appropriate to that spatial frequency. This gives 30 different maps of surround suppression, \(S_{x,y,f,o}\). It is important to note that taking the RMS of the odd- and even-symmetric responses means that we model surround suppression as if it has the same strength whatever the phase of a surround suppressing sinusoid. Watson and Solomon (1997) had a single suppressive mechanism that was broadly tuned for stimulus orientation but which was not just confined to a single point in the field.  
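The two ingredients of the surround signal can be sketched as follows: the annulus of Equation 2 sampled on a pixel grid and scaled to unit volume, and the phase-invariant RMS of the even and odd responses raised to the power r. The grid size, the function names, and the choice of a 16-pixel radius (one period at 16 pixels per cycle) are assumptions for illustration; the convolution of the resulting map with the kernel is omitted.

```python
import math

def surround_kernel(rad_px, half_width):
    """Equation 2 sampled on a square grid and scaled to unit volume."""
    size = 2 * half_width + 1
    kernel = [[0.0] * size for _ in range(size)]
    for iy in range(size):
        for ix in range(size):
            d = math.hypot(ix - half_width, iy - half_width)
            kernel[iy][ix] = d * math.exp(-d * d / (2.0 * rad_px ** 2))
    volume = sum(map(sum, kernel))
    return [[v / volume for v in row] for row in kernel]

def phase_invariant_energy(even, odd, r):
    """RMS of the even and odd responses at one point, raised to power r."""
    return math.sqrt((even ** 2 + odd ** 2) / 2.0) ** r

half_width = 64
kernel = surround_kernel(rad_px=16.0, half_width=half_width)
print(sum(map(sum, kernel)))           # close to 1 (unit volume)
print(kernel[half_width][half_width])  # 0.0: no weight at the field center

# The same amplitude at two different carrier phases gives the same energy.
a = phase_invariant_energy(math.cos(0.3), math.sin(0.3), r=1.0)
b = phase_invariant_energy(math.cos(1.2), math.sin(1.2), r=1.0)
print(abs(a - b) < 1e-12)
```

The kernel is zero at the center and peaks at a distance of `rad_px`, which is what makes it an annulus rather than a blob.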
Sigmoidal transducer function
In our original model (To et al., 2010), the responses of each of the 3.9 million or so “neurons” in the model were raised to power p, and were finally subjected to the two nonlinear suppressive effects by division at the same time (parallel model), using a modified version of the Naka-Rushton equation. In the case of the parallel model, the final response of the field at location (x, y), frequency f, orientation o, and symmetry s is:  
\begin{equation}\tag{3}\mathit{response}_{x,y,f,o,s} = \frac{\mathrm{sign}(c_{x,y,f,o,s}) \cdot \left| c_{x,y,f,o,s} \right|^{p}}{1 + W_N \cdot N_{x,y} + W_S \cdot S_{x,y,f,o}}\end{equation}
where sign extracts the sign (+ or −) of \(c_{x,y,f,o,s}\), \(W_N\) and \(W_S\) are weights, and the calculation of \(N_{x,y}\) and \(S_{x,y,f,o}\) involves raising response values to powers q and r respectively, as described above. The surround suppressive signal S is calculated from the same quasilinear contrast responses as the normalizing signal N. Equations of this form are intended to replicate the psychophysical sigmoidal transducer function (Legge & Foley, 1980), which probably represents some pooling of the behavior of a population of neurons each with only a limited dynamic contrast range (Chirimuuta & Tolhurst, 2005; Clatworthy, Chirimuuta, Lauritzen, & Tolhurst, 2003; May & Solomon, 2015).  
In the case of our new sequential model, the intermediate normalized response (response1) is calculated based on the normalizing signal only, and then the surround suppressive signal is calculated from these normalized responses. There are two successive Naka-Rushton equations:  
\begin{equation}\tag{4}\mathrm{response1}_{x,y,f,o,s} = \frac{\mathrm{sign}\left(c_{x,y,f,o,s}\right)\cdot\left|c_{x,y,f,o,s}\right|^{p_1}}{1 + W_N\cdot N_{x,y}}\end{equation}
where WN is the weight of normalization, and the calculation of Nx,y involves raising response values to the power q, as described above. Then, this normalized response is subjected to surround suppression using normalized surround responses, and the final response (response2x,y,f,o,s) for location (x, y), frequency f, orientation o, and symmetry s, is computed using:  
\begin{equation}\tag{5}\mathrm{response2}_{x,y,f,o,s} = \frac{\mathrm{sign}\left(\mathrm{response1}_{x,y,f,o,s}\right)\cdot\left|\mathrm{response1}_{x,y,f,o,s}\right|^{p_2}}{1 + W_S\cdot S_{x,y,f,o}}\end{equation}
where WS is the weight of surround suppression, and the calculation of Sx,y,f,o from response1 involves raising response1 values to power r, as described above.  
The weight and power parameters are the same for all orientations and spatial frequencies. Note that the original parallel model has only one parameter p, whereas the new sequential model has two separate p parameters (p1 and p2). 
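The two-stage structure of Equations 4 and 5 can be sketched for a single unit as follows. This is a minimal sketch under simplifying assumptions: the normalizing signal N and the surround signal S are passed in as precomputed numbers, whereas in the model S is itself derived from the response1 values of neighboring units. All names and values are hypothetical.

```python
import math

def signed_power(x, exponent):
    """Return sign(x) * |x|**exponent."""
    return math.copysign(abs(x) ** exponent, x)

def sequential_response(c, n_signal, s_signal, p1, p2, w_n, w_s):
    """Two successive Naka-Rushton-style stages for one unit.
    Stage 1 (Equation 4): normalization only.
    Stage 2 (Equation 5): surround suppression applied to the stage-1 output.
    s_signal is assumed to have been computed from stage-1 responses."""
    response1 = signed_power(c, p1) / (1.0 + w_n * n_signal)
    response2 = signed_power(response1, p2) / (1.0 + w_s * s_signal)
    return response1, response2

# Illustrative values only (not taken from the fitted model).
r1, r2 = sequential_response(c=0.5, n_signal=1.0, s_signal=0.5,
                             p1=2.0, p2=1.0, w_n=1.0, w_s=2.0)
```

Because the second stage receives an already normalized input, the separate exponents p1 and p2 act on different signals, which is why the sequential model carries one more power parameter than the parallel one.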
Final pooling of all the difference cues
We finally have a model of the responses or outputs of all the neurons to a given stimulus (e.g., the mask alone). The process is repeated for the comparison image (e.g., mask + test), and we subtract the model outputs for the two stimuli neuron-by-neuron through the five spatial frequencies, six orientations, and two symmetries. The many visibility cues across x, y, frequency, orientation, and symmetry are combined into a single value by Minkowski summation with power m (Watson & Solomon, 1997). The n individual visibility cues are raised to the power m, summed and the mth root taken.  
\begin{equation}\tag{6}\mathrm{overall\ difference} = \left[\sum\limits_{i}^{n}\left(\mathrm{difference\ cue}_i\right)^{m}\right]^{1/m}\end{equation}
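A minimal sketch of the pooling in Equation 6 is given below. It assumes that the magnitude of each neuron-by-neuron difference is taken before raising it to the power m (Equation 6 does not show this step explicitly); the function name and the numbers are illustrative only.

```python
def minkowski_pool(differences, m):
    """Combine many difference cues into one value (Equation 6).
    Assumption: absolute values are used so that negative differences
    contribute their magnitude."""
    return sum(abs(d) ** m for d in differences) ** (1.0 / m)

# With m = 2 the rule reduces to the Euclidean norm.
pooled_low_m = minkowski_pool([3.0, -4.0], 2)

# With a large exponent the largest cue dominates the pooled value.
pooled_high_m = minkowski_pool([1.0, 0.5], 50)
```

The second call shows why a very high Minkowski exponent makes the pooled value behave almost like the single largest cue.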
 
Finding the parameters of the best-fitting model
For any given version of the model and for any set or subset of the five dipper experiments, we sought the set of model parameters that gave the best fit between the model's prediction and the experimental measurement of the discrimination thresholds. The parallel model contained eight to nine parameters to fit (depending on version), whereas the sequential model encompasses nine to 10 parameters (Table 1). An iterative procedure (fminsearch in Matlab, The Mathworks, Natick, MA) sought to minimize the summed squared error between experimental threshold values in dB and the model predictions of each threshold. The final solutions were generally affected by the starting guesses and, ideally, we would have made very many iterative searches with many very different starting guesses. However, it was not feasible to do this systematically since, depending on the vintage of the computer, and the details of the particular fit, each iteration might take many minutes or some hours, and convergence after several hundred iterations might take weeks! The fits described here have been replicated with several different starting guesses, and for several slightly different versions of the model (which we do not list in detail). 
Table 1
 
The parameters of the least-squares best-fitting V1 models for the graphs shown in Figure 4. Notes: The parameters are found in the equations in the Methods. Contrast sensitivity function (CSF) weight is a parameter that adjusts the sensitivity of all the neurons so that the model dippers slide along the diagonal, in case the observer's CSF is not quite the same as the “typical” weights we have assigned. The number of threshold data points contributing to each fit, the residual sum-of-squares (SSQ) error and the Akaike Information Criterion (AIC; Equation 7) for each fit is also shown. These three fits were carried out with the omission of the eccentricity weighting step (see text). The fit for Figure 4C has one fewer parameter than Figure 4B because radf was fixed at a value near zero.
Each iteration began with a calculation of a “threshold criterion” calculated from the model with the present set of parameters. The model calculated the difference between a simulated midgray blank screen and a small Gabor vertical grating with spatial frequency 2.67 c/°. The grating contrast was just at the average contrast detection threshold (with a mask of 0% contrast) that we had measured during the SS, SL, and annular mask dipper experiments. This criterion value is postulated to represent the magnitude of any stimulus difference that should just be at threshold whatever the two stimuli to be compared. The csf-weight fitting parameter (Table 1) deals with the possibility that the experimental estimate of control threshold is too high or too low compared to the bulk of the other dipper measurements leading to a systematic shift of the model dippers compared to the experimental ones. This parameter should ideally be 1.0 and indeed for the most successful models, it was close to 1.0. 
Then, for each of the 14 masking contrasts, for each of the dippers to be fit, we sought what contrast of added test stimulus would make the VDP difference between mask and mask + test equal to the threshold criterion value. We guessed the threshold, and depending upon whether the VDP output was now less or greater than the criterion, we raised or lowered test contrast. The “staircase,” coupled with linear interpolation between the sampled values, reached a tolerable solution within four or five attempts. 
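The threshold search described above can be sketched as a guess, a step in the appropriate direction, and repeated linear interpolation between the two most recent samples. The exact update rule used in the study is not specified, so the secant-style update below is an assumption, and `toy_model` is a hypothetical monotonic stand-in for the full VDP computation.

```python
def find_threshold(model_difference, criterion, guess_db,
                   step_db=4.0, tol=1e-6, max_tries=20):
    """Find the test contrast (in dB) at which model_difference equals
    the criterion. Assumes model_difference rises monotonically with
    contrast. The interpolation rule is an illustrative choice."""
    x0, y0 = guess_db, model_difference(guess_db)
    # Staircase move: raise contrast if below criterion, else lower it.
    x1 = x0 + step_db if y0 < criterion else x0 - step_db
    y1 = model_difference(x1)
    for _ in range(max_tries):
        if abs(y1 - criterion) <= tol or y1 == y0:
            break
        # Linear interpolation (or extrapolation) through the last two samples.
        x2 = x1 + (criterion - y1) * (x1 - x0) / (y1 - y0)
        x0, y0 = x1, y1
        x1, y1 = x2, model_difference(x2)
    return x1

def toy_model(contrast_db):
    """Hypothetical stand-in: output proportional to linear contrast."""
    return 2.0 * 10 ** (contrast_db / 20.0)

# The criterion 0.2 is reached where 10**(dB/20) = 0.1.
threshold_db = find_threshold(toy_model, criterion=0.2, guess_db=-30.0)
```

In the real model each call to the difference function is expensive, which is why a search that settles within a handful of evaluations matters.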
Akaike Information Criterion
To compare the performances of different versions of the models, we calculated the Akaike coefficient from the residual sum of squares, whilst taking into account the number of parameters. Although the actual Akaike Information Criterion (AIC) number for any one fit is not very informative, the delta AIC (ΔAIC) between two models weighs up a difference in residual sums of squares against any difference in the number of parameters, and can give some indication of the relative success of different models.  
\begin{equation}\tag{7}\mathrm{AIC} = n\cdot\ln\left(\frac{ssq}{n}\right) + 2k + \frac{2k\left(k + 1\right)}{n - k - 1}\end{equation}
 
where n is the number of data points to fit, k is one more than the number of model parameters, and ssq is the residual sum of squares deviation between model and data. 
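Equation 7 is straightforward to compute; its third term has the form of the usual small-sample correction to the AIC. In the sketch below, the per-datum SSQ values (1.49 and 1.43 dB² for 42 data points) are those quoted in the Results for the parallel and sequential fits, but the values of k are hypothetical placeholders chosen only to show how one extra parameter is penalized.

```python
import math

def aic_corrected(ssq, n, k):
    """Equation 7: AIC from the residual sum of squares, including the
    small-sample correction term. n = number of data points,
    k = number of model parameters plus one."""
    return n * math.log(ssq / n) + 2 * k + (2 * k * (k + 1)) / (n - k - 1)

n_points = 42
# k values are assumptions for illustration, not taken from Table 1.
aic_parallel = aic_corrected(ssq=1.49 * n_points, n=n_points, k=10)
aic_sequential = aic_corrected(ssq=1.43 * n_points, n=n_points, k=11)

# Positive delta means the extra parameter cost more than the fit gained.
delta_aic = aic_sequential - aic_parallel
```

Only the difference between two AIC values is interpreted; the lower value indicates the preferred model.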
Simulating single neuron electrophysiology experiments
Once we had found the best-fit model of all, we simulated how a single “neuron” in that model would have responded to the kinds of visual stimuli used in key single-neuron electrophysiology experiments (Cavanaugh et al., 2002; Sceniak et al., 1999). These stimuli consisted of circular patches of grating at the best orientation and spatial frequency centered on the neuron's receptive field; the diameter of the patch and the contrast were varied systematically. The response of a single neuron was simulated by calculating how the whole VDP network of 3.9 million “neurons” would have responded to a given grating patch (vertical, 2.67 c/°, centered in the 256 × 256 pixel grid), and then we took the single response of the even-symmetric neuron tuned to vertical 2.67 c/° gratings, whose field was centered in the center of the grid. 
Results
Contrast masking with Gabor and grating stimuli
The circles in all three panels of Figure 4 show the results of three of our five grating contrast discrimination experiments, following the experimental design of Meese (2004); the curves are fits from different versions of our VDP model. The blue circles show the contrast discrimination dipper for Meese's LL condition—that is, a “large” test square of sinusoidal grating masked by an equally “large” square of the same kind of grating. The green circles show the dipper for the SS condition: a “small” Gabor patch of the grating (Figure 2A) masked by an equally “small” Gabor patch. Finally, the red circles are for the SL condition: a small, Gabor patch masked by the large square grating. The unmasked detection thresholds (circles nearest to the left hand y-axis) for the small Gabor patches are substantially higher than the unmasked threshold for the large grating, as would be expected from the hypothesis that a larger stimulus would excite more neurons and that probability summation would increase sensitivity (Robson & Graham, 1981). Characteristically (Meese, 2004), the SS data (green) show a more pronounced dip than the SL (red) and they cross to the right of the LL data (blue). 
Figure 4
 
Contrast discrimination dipper data for SL (red circles), SS (green circles), and LL (blue circles) grating configurations, like Meese (2004). The averages of the normalized data from the two observers are shown in the three panels. The contrast is expressed as dB attenuation, from the highest contrast available. Given that the mask and test were presented on alternate frames, this highest contrast is a Michelson contrast of 50%. The points shown at −64 dB masking contrast were actually measured without any masker present. (A) The curves are the best fit parallel version of the V1 model to the whole 42 data points, including a model of surround suppression (similar to the model of To et al., 2010). Details of all the model fits and the parameter values are given in Table 1. (B) The curves are the best-fit sequential version of the V1 model (our new model) to the whole 42 data points, including a model of surround suppression. (C) The curves are the best-fit version of a V1 model without including any surround suppression.
The curves in Figure 4 show best-fit V1-based models to the 42 data points. These models allowed receptive field aspect ratio to vary and they discarded the eccentricity-weighting step, for reasons that we shall discuss later. For both our old parallel (Figure 4A, see Equation 3) and our new sequential model (Figure 4B, see Equations 4 and 5), the different forms of the three dipper functions are generally in agreement with those reported by Meese (2004). In particular, the rising limbs of the dippers at high masking contrasts converge and, contrary to any simple Weber's law interpretation, they are not straight, but have inflections at masking contrasts of about 20 dB mask attenuation. The models both describe the characteristic differences in the “dips” in the data for SS (green circles) and SL (red circles). Table 1 shows that the sequential model predictions (Figure 4B) gave a slightly lower sum of the squared residuals (SSQ) than the parallel model (SSQ 1.43 dB² per datum compared to 1.49). However, the AIC suggests that this slightly better fit is outweighed by the extra parameter needed in the sequential model. Either way, the residual errors are compatible with the estimated standard errors of the experimental measurements. 
Meese (2004) interpreted such results as showing that the channels detecting a small patch of grating were subject to suppression from channels responsive to any surrounding masking grating. He successfully fitted the three dippers with a model of the nonlinear transducer function (Legge & Foley, 1980) that included just a few scalar variables: the contrast of the central part of the test stimulus, the contrast of the central part of the masking stimulus, and the contrast of the surrounding part of the mask stimulus. Our models, considerably more elaborate, also include surround suppression as one feature that tries to emulate V1 neurophysiological findings. Figure 4C shows the same experimental measurements, but the curves represent a different model fit, one where we have omitted the explicit surround suppression that each neuron receives from the area surrounding its receptive field. The only nonlinear suppression in this model is nonspecific (contrast normalization; Heeger, 1992). It can be seen that this model gives a very poor fit to the experimental measurements. The SSQ rises to 6.08 dB² per datum, and this model entirely misses the differences in form between the various dippers. It may also be noted that the Minkowski summation exponent in this model (Table 1) is very high (see Discussion): the model has effectively been reduced to considering the responses of only a single neuron (presumably the one in the very center of the stimuli). In the example shown, we effectively removed surround suppression by constraining the radius of the surround annulus to be close to zero. This reduced the number of parameters in the fit by only one. We got a very similar result by removing the surround suppression step altogether, leaving out Equation 5; this reduced the number of parameters by four. Overall, these results suggest that, contrary to Petrov et al.'s (2005) suggestions, surround suppression at the neuronal level is needed to model foveal dipper functions. 
Masking by annular surround gratings
While Figure 4 has shown our results for stimulus configurations like Meese (2004), we also examined the masking for Gabor (“S”) patches of the grating centered in annular masks (Figure 2D). The experimental results for the SS, SL, and LL experiments are shown again by the circles in Figure 5A, C, and E. The results for the Gabor patch in the annular masks are shown in Figure 5B, D, and F; the mask could be in spatial phase with the central test Gabor (orange circles) or it could be 180° out of phase (black circles). The curves in Figure 5A and B show the fits of a single model applied to all five masking experiments at once; Figure 5C through F show the fits of two further models to the 70 data points. The details of the fits are shown in Table 2.
Figure 5
 
This figure shows all five dipper experiments for grating stimuli, the thresholds and the overall best-fit graphs. (A, C, and E) Contrast dipper data for SL (red circles), SS (green circles), and LL (blue circles) grating configurations replotted from Figure 4. (B, D, and F) a Gabor S test with in-phase masking annulus (orange circles), and with 180° out of phase masking annulus (black circles). The curves show models which were best fit for all five dippers (70 data) at once; they were different versions of the sequential model and they incorporated surround suppression. (A–B) the model had fixed aspect ratio of 1.5 and used eccentricity weighting. (C–D) the aspect ratio became an extra model parameter, free to vary (it became 1.20; Table 2), and eccentricity weighting was still used. (E–F) eccentricity weighting was not applied, and aspect ratio was free to vary (it became 1.21).
Table 2
 
The parameters of the least-squares best-fitting V1 models for the graphs shown in Figure 5. Notes: The number of threshold data points contributing to each fit, the residual sum-of-squares (SSQ) error and the Akaike Information Criterion (AIC; Equation 7) for each fit is also shown. The fit for Figure 5A and B has one fewer parameter because aspect ratio was fixed at 1.5.
It is noteworthy that the in-phase annular mask (orange circles in Figure 5B, D, and F) evoked little threshold elevation except at the highest masking contrasts, superficially consistent with the suggestion that there is little or no surround suppression in foveal vision (Meese, Challinor, Summers, & Baker, 2009; Petrov et al., 2005). Thus, the “surround” masking grating gave little masking of the “central” patch, even though our modeling suggests that surround suppression is a necessary element. However, the out-of-phase mask (black circles) did produce masking. 
The curves in Figure 5A and B show a model that best fit all five masking experiments (70 data) at once. This was a version of the sequential model which fixed the receptive field aspect ratio at 1.5 (as in To et al., 2010) following our interpretation of some neurophysiological data (Tolhurst & Thompson, 1981). The model also included sensitivity weighting for eccentricity from our original study on natural images (see Methods). It is not a particularly good fit to the SS, SL, and LL data (Figure 5A) compared to Figure 4A and B largely because the model now has to accommodate the two annular masking experiments as well; the details of the fit are given in Table 2. The same model is also only a modest fit to the annular mask data (Figure 5B). Overall, this sequential model (AIC = 95.40) despite its extra parameter is now marginally better than an equivalent parallel model (not illustrated, AIC = 98.24). 
The model, although only a modest fit to the annular mask data, does at least mimic the phase dependence of the annular masking. This is particularly interesting: The annular masks are supposed to show the effects of surround suppression and yet our model explicitly makes surround suppression phase-independent (see Methods). Perhaps, the phase dependence of the masking arises because the annuli not only cause surround suppression but also contribute to linear spatial summation in those many neurons whose classical receptive fields span the border between Gabor test and masking annulus (see e.g., Figure 1C). We therefore modified the model to allow the length (the aspect ratio) of the neurons' classical field to vary in order to obtain a better fit. Changing the length should change the number of neurons whose fields span the test/mask border and, perhaps, change the balance between nonlinear phase-independent surround suppression and linear phase-dependent summation. Figure 5C and D shows the fits of a sequential VDP model where aspect ratio is an extra free parameter. The best fit was obtained with an aspect ratio of 1.2 (cf. Foley et al., 2007) compared to the fixed value of 1.5 with which we started. The new fit (AIC = 89.47) is a good improvement (fixed aspect AIC = 95.4) and, as hoped, the fit to the annular mask data looks much improved. For comparison, an equivalent parallel model with free aspect ratio (not illustrated) also improved and now gave a slightly better fit than the sequential (AIC = 85.88 compared to 89.47). 
Inspection of Figure 5A and C suggests that the weakest part of the fitting is for the LL data (blue circles). The fitted line systematically misses much of the “dip.” Of the five experiments, the LL is unique in that the test pattern is a full size grating rather than a smaller Gabor. In the Discussion, we consider whether this difference in geometry might somehow change the task or the observer's approach. Within our modeling, one feature that might affect the L test stimulus more than all the S Gabors is the eccentricity weighting that might reduce the contribution of neurons serving the fringes of the L stimulus and that play no role in detecting the S Gabors. As a final development of our sequential VDP model, we eliminated the eccentricity weighting of sensitivity. Figure 5E and F show that the fit to the LL (blue circles) is much improved; aspect ratio is also free to change and so the annular masks (Figure 5F) are also well fit. Overall, AIC has fallen considerably to 65.67. An equivalent parallel model (not illustrated) also improved but not to the same extent (AIC = 79.0). 
Simulating single neurons in the model
The best-fit model of all (Figure 5E and F) has 10 parameters that had been optimized. It is difficult to see how the modeling would be affected if the numerical values given in Table 2 (rightmost column) of many of those parameters were to change; it is difficult to interpret their individual contribution to the good fit. We have, therefore, looked to see whether those parameter values, which are good for the psychophysics, are also “reasonable” in the sense that they mimic single neuron behavior as has been seen in neurophysiological experiments. Using the parameters of the best-fit model (Figure 5E and F) with its sequential coding of normalization and surround suppression, we simulated how a single “neuron” in the center of the display would respond to simple grating stimuli, as in experiments by, for example, Cavanaugh et al. (2002) and Sceniak et al. (1999). This neuron had optimal orientation and spatial frequency like the stimuli in our experiments, and the simulated gratings had these optimal parameters. The responses of all neurons in the whole array of millions were calculated, and only the response of this one neuron in the “center” of the array was investigated. 
Figure 6A shows how that “neuron” responded to sinusoidal gratings as a function of contrast. The response curve (which is affected by four power parameters and two weights) is sigmoidal, as is typical of cortical neurons (Albrecht & Hamilton, 1982; Tolhurst, Movshon, & Thompson, 1981). We fitted a simple Naka-Rushton (hyperbolic ratio) with different numerator and denominator powers to this curve (see inset to Figure 6A); the fitted powers are a little steeper than the average for single neurons (Albrecht & Hamilton, 1982) but are within the range observed; they are also a little steeper than those in Foley's (1994) and Legge and Foley's (1980) nonlinear transducers. 
Figure 6
 
Simulations of the behavior of a single neuron in the best-fit model to sinusoidal gratings. (A) Magnitude of response to gratings of optimal orientation, frequency (one cycle in 16 pixels) and phase as a function of contrast. The line is the Naka-Rushton equation inset. (B) Responses to disks of optimal grating centered on the receptive field of the neuron, as a function of the radius of the disk. Responses at two very low contrasts are shown. Responses are normalized to the biggest response at C = 0.009. The red line is the form of response expected if there had been no surround suppression. (C) Similar to (B) but at two slightly higher contrasts. Responses are normalized within each curve. GSF and surround radius are taken after Cavanaugh et al. (2002).
Figure 6B and C show how the responses of this “neuron” depended on the radius of a disk of the grating at four different contrasts. At the lowest contrast (Figure 6B, open circles), the response rises to a maximum value and then plateaus at a radius consistent with the hypothesis that there is no surround suppression (form of the red curve). However, at slightly higher contrasts (Figure 6B, filled circles), the response falls after this radius is exceeded, implying surround suppression. Figure 6C shows that, as the stimulus contrast is raised further, the amount of the suppression increases and it begins to become apparent at a smaller radius: grating summation field (GSF; Cavanaugh et al., 2002) moves to the left. Thus, the threshold contrast for suppression is higher than the general response threshold and, importantly and surprisingly, overall receptive field size seems to get smaller as contrast is raised, as is found in real neurons (Sceniak et al., 1999; Tailby, Solomon, Pierce, & Metha, 2007). In the model neuron, the ratio of surround radius to GSF is in the range between two and three depending on contrast, compatible with many studies of V1 neurons (Cavanaugh et al., 2002; Levitt & Lund, 2002; Sceniak et al., 1999). 
Discussion
The purpose of this study was to design computational models of how human observers discriminate the contrast of sinusoidal gratings masked by gratings of various geometries. These models (involving very many neurons) implemented known physiological mechanisms that are involved in visual processing up to primary visual cortex. Our analyses revealed that both contrast normalization and surround suppression are important: A model without surround suppression (Figure 4C) gave a very poor fit. One key question was the order in which the two nonlinearities (normalization and surround suppression) were applied. We compared the performance of a sequential model that applied normalization before surround suppression to that of a parallel model that applied both inhibitions simultaneously. The former (sequential) model is the more consistent with single neuron studies of visual cortex (DeAngelis et al., 1994; DeAngelis et al., 1992; Durand et al., 2007; Henry et al., 2013; Li & Freeman, 2011; Li et al., 2006). We found that, despite the extra parameter in the sequential model, its predictions were generally superior, giving it a lower AIC than the parallel model (Figure 4A). However, the fits of the two models did not differ greatly. 
Petrov et al. (2005) reported that they could find no evidence of masking with a surround annulus in foveal vision. On the other hand, Meese et al. (2009) did report some low levels of surround suppression at high contrast, which is consistent with our finding that there is some masking with the in-phase annulus, but only at the highest masking contrasts. This seems to conflict with our modeling, which implies that there is surround suppression in foveal vision (Figure 4). However, when comparing data from the two annulus experiments (Figure 5), clear elevations of threshold were found when the mask was out of phase. Now, our models all had phase-invariant surround suppression mechanisms, yet the models predicted that the surrounding mask annulus would give phase-variant results. This suggests that the surrounding annulus has two opposing effects: 
  1. A phase-specific linear summation of the test and surround gratings driven by neurons near the edges of the patch (see e.g., neuron c in Figure 1). This facilitates detection for in-phase stimuli, while raising thresholds for out-of-phase gratings.
  2. A phase-invariant nonlinear suppression driven by neurons at the center of the patch.
A surrounding masking grating does not purely cause surround suppression at the receptive-field level. Whether an annular mask raises the threshold or not would depend upon the balance between these two effects on individual neurons and on how the many neurons are differentially distributed within and at the borders of the test grating. The interaction between two effects would be consistent with Meese et al.'s (2009) suggestion that there could be a slight facilitation in the mid region of the dipper functions. To accommodate for these two types of opposing processes, we allowed the receptive field aspect ratio to change freely in relation to suppressive surround size. In our past models (To et al., 2011; To et al., 2010), we assumed that receptive fields were elongated by a factor of 1.5 along the long axis of the sinusoid carrier stripes, based on one neurophysiological study in cat visual cortex (Tolhurst & Thompson, 1981). Foley et al. (2007) found that slightly elongated fields proved a better fit to their psychophysical results than did the usual Gabor model with a circularly symmetric Gaussian envelope, reporting elongation factors more like 1.2 than our 1.5. Because of this disagreement, we decided to set receptive field aspect ratio as an extra free parameter in the present models. When we ran variants of the sequential model and allowed the aspect ratio to change, ratios in the region of 1.20 and 1.21 (see Table 2), closely matching Foley et al.'s (2007) values, yielded the best fits. 
We noted that there was a clear and consistent systematic error in the fit to the LL condition—that is, a large test grating masked by a large grating (blue circles)—the dipper did not go nearly deep enough and the magnitude of the difference in the unmasked thresholds between the S Gabor and the L grating was not well modeled. The only thing that distinguished the LL dippers from the others was the fact that the test stimulus was big; in the other four dippers, the test was a small diameter Gabor patch. It was possible that in the experiments where the targets were small, the discrimination task could be accomplished by focusing attention on the activity change within a small patch of neurons processing the center of the stimulus. However, in the separate experiment with a large target, one could direct attention to focus on differences in contrast over the whole larger area. Attentional demands in the two cases would therefore be different and the models should consider the responses of neurons in the center of the stimulus when the test stimuli were small Gabors, but should allow the extra contribution of neurons over the full extent of the large test stimulus. We did not explicitly model an attentional focus, but our original models (To et al., 2011; To et al., 2010) included a sensitivity weighting for eccentricity in which the responses of neurons were favored in the center of the stimulus, for all stimuli including the large one. By removing this weighting, we gave more weight to the more peripheral neurons responsive to the large stimulus, an approximation to the attentional focus we needed, allowing all neurons to contribute equally. After removing eccentricity weighting, the model fits (especially the sequential model) improved considerably. 
Our models contained very many neurons whose properties were governed by rather few parameters: 8 to 10 were free to vary in the best fits. We ended (Figure 5E and F) with a model that described five dipper experiments well, and we are now faced with trying to understand the numerical values finally assigned to those parameters. It is not easy to see the effect of any particular value for most parameters on the eventual fit and it is not easy to see whether these values are “reasonable.” We approached this issue by looking within the best-fit model at the behavior of a single “neuron,” as if we had modeled a classical single-neuron neurophysiology experiment. The models began as attempts to capture some behaviors of real single neurons; after the parameters were allowed to vary freely to fit the psychophysical experiments, the result turned out to be a good replication of some other behaviors of real neurons. For instance, the combination of parameters generated a response/contrast transducer function that was sigmoidal like that of real neurons. Furthermore, like real cortical neurons, the extent of the suppressive surround was about 2–3 times the diameter of the measured summation field (the net result of linear excitation and the nonlinear surround suppression) and the central summation field got smaller as contrast rose. Neither of these was explicitly included in the model steps but both were like the behavior of real neurons. This agreement with real neurophysiology is particularly gratifying since the surround radius parameter in our model had a very small best-fit value (half the size of the linear fields, rather than twice the size!). It must be remembered that the neurophysiological estimates were based on the overall behavior of the neuron, and could not explicitly separate “center” from “surround.” The model may specify separate component mechanisms, but an experiment will tap their combination. 
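The sigmoidal response/contrast function mentioned above is summarized in Figure 6A by a Naka-Rushton equation. As a minimal sketch, the standard form of that equation can be written as follows. The function name and the parameter values (`r_max`, `c50`, `n`) are illustrative assumptions and are not the fitted values from the model.

```python
import numpy as np

def naka_rushton(contrast, r_max=1.0, c50=0.1, n=2.0):
    """Standard Naka-Rushton contrast-response function (illustrative parameters).

    contrast : stimulus contrast, scalar or array (0 to 1)
    r_max    : saturating response level
    c50      : semisaturation contrast, where the response is r_max / 2
    n        : exponent controlling the steepness of the function
    """
    c = np.asarray(contrast, dtype=float)
    return r_max * c**n / (c**n + c50**n)

contrasts = np.logspace(-3, 0, 7)       # contrasts from 0.001 to 1.0
responses = naka_rushton(contrasts)     # rises monotonically toward r_max
```

With an exponent greater than 1, the function accelerates at contrasts well below `c50` and saturates above it, which gives the sigmoidal shape on a logarithmic contrast axis.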
Most of the parameters in our models can be related back to the behavior of single V1 neurons directly or indirectly, but one parameter cannot: the Minkowski exponent. Our models generated millions of numbers representing the response magnitudes of millions of single neurons, and it was necessary to reduce these to just one decision variable in order to model a 2AFC psychophysical experiment! The decision rule that we adopted was hypothetical and has no corroboration in neurophysiology. We computed, for each neuron, the difference between the responses in the two intervals of the forced choice and then we combined the many differences with Minkowski summation. There could be other plausible combination rules, but Minkowski summation with a power of about 3–4 was very successful in a number of such V1-based studies on gratings and natural images (Foley et al., 2007; Rohaly et al., 1997; To et al., 2008, 2010; Tolhurst et al., 2010; Watson & Ahumada, 2005; Watson & Solomon, 1997). The arithmetic of Minkowski summation (with power 3 or 4) was originally part of a specific hypothesis—that is, pooling of neuronal responses takes place by probability summation and the Minkowski exponent represents the steepness of the particular “Quick” formula for the psychometric function (Graham, 1977; Quick, 1974; Robson & Graham, 1981); using the Minkowski rule with integer power was, computationally, an extremely efficient way of performing probability summation in an era of limited computer power (e.g., it replaced multiple multiplications with much quicker additions). Whether the probability summation hypothesis was correct or not, Minkowski summation generally seemed to work pragmatically. However, in our best-fit models, the Minkowski exponent turned out to be 8–10, very different from the usual values of 3–4. This raised the possibility that another combination rule would make our present models more consistent with their precursors. 
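The decision rule described above (a per-neuron response difference between the two intervals, pooled by Minkowski summation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `minkowski_pool` and its arguments are hypothetical, and the absolute value of each difference is taken so that non-integer exponents are well defined.

```python
import numpy as np

def minkowski_pool(resp_a, resp_b, m=4.0):
    """Pool per-neuron response differences into a single decision variable.

    resp_a, resp_b : model responses of the same neurons in the two 2AFC intervals
    m              : Minkowski exponent (about 3-4 in earlier studies, 8-10 in the
                     best fits discussed here)
    """
    diff = np.abs(np.asarray(resp_a, dtype=float) - np.asarray(resp_b, dtype=float))
    peak = diff.max() if diff.size else 0.0
    if peak == 0.0:
        # No neuron changed its response, so there is no cue to the target interval.
        return 0.0
    # Factoring out the largest difference avoids underflow or overflow
    # when m is large; the result is mathematically unchanged.
    return float(peak * np.sum((diff / peak) ** m) ** (1.0 / m))

rng = np.random.default_rng(0)
interval_1 = rng.random(1000)                       # hypothetical responses, interval 1
interval_2 = interval_1 + 0.01 * rng.random(1000)   # slightly larger in interval 2
cue_low_m = minkowski_pool(interval_1, interval_2, m=3.0)
cue_high_m = minkowski_pool(interval_1, interval_2, m=10.0)
```

As the exponent grows, the pooled value approaches the single largest difference, so an exponent of 8–10 weights the most strongly changing neurons more heavily than an exponent of 3–4 does.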
All we can say with certainty is that, if no neuron changes its response during the trial, then the observer has no cue to which interval contains the target! It is still an open question how we should deal with the usual case that some neurons do change their responses. 
Following the neurophysiology (e.g., DeAngelis et al., 1994; DeAngelis et al., 1992), we have modeled the within-field normalization as nonspecific for orientation and spatial frequency, but the surround suppression as being specific for both stimulus parameters. However, the stimuli in our present experiments were narrow band, so they could not constrain this differential aspect of the models. Future experiments could make the mask stimuli differ from the tests in terms of orientation and frequency. It would also be an interesting challenge to devise a psychophysical stimulus geometry that could act as a more critical test between the parallel model (our original) and our newer sequential model, which seems more consistent with neurophysiology. 
Acknowledgments
The research was funded by grants EP/E037097/1 and EP/E037372/1 from the Engineering and Physical Sciences Research Council (UK)/Defence Science and Technology Laboratory (Dstl) under the Joint Grants Scheme. MPST was employed on those grants. MC was initially supported by an MRC studentship, and later by a postdoctoral fellowship from the Australian Research Council. 
Commercial relationships: none. 
Corresponding author: Michelle P. S. To. 
Address: Department of Psychology, Lancaster University, Lancaster, UK. 
References
Albrecht, D. G., & Hamilton, D. B. (1982). Striate cortex of monkey and cat: Contrast response function. Journal of Neurophysiology, 48, 217–237.
Baker, D. H., Meese, T. S., & Summers, R. J. (2007). Psychophysical evidence for two routes to suppression before binocular summation of signals in human vision. Neuroscience, 146 (1), 435–448.
Blakemore, C., & Tobin, E. A. (1972). Lateral inhibition between orientation detectors in the cat's visual cortex. Experimental Brain Research, 15 (4), 439–440.
Campbell, F. W., & Kulikowski, J. J. (1966). Orientational selectivity of the human visual system. Journal of Physiology, 187 (2), 437–445.
Carandini, M., Heeger, D. J., & Movshon, J. A. (1997). Linearity and normalization in simple cells of the macaque primary visual cortex. Journal of Neuroscience, 17 (21), 8621–8644.
Cavanaugh, J. R., Bair, W., & Movshon, J. A. (2002). Nature and interaction of signals from the receptive field center and surround in macaque V1 neurons. Journal of Neurophysiology, 88 (5), 2530–2546, doi:10.1152/jn.00692.2001.
Chirimuuta, M., & Tolhurst, D. J. (2005). Does a Bayesian model of V1 contrast coding offer a neurophysiological account of human contrast discrimination? Vision Research, 45 (23), 2943–2959, doi:10.1016/j.visres.2005.06.022.
Clatworthy, P. L., Chirimuuta, M., Lauritzen, J. S., & Tolhurst, D. J. (2003). Coding of the contrasts in natural images by populations of neurons in primary visual cortex (V1). Vision Research, 43, 1983–2001.
DeAngelis, G. C., Freeman, R. D., & Ohzawa, I. (1994). Length and width tuning of neurons in the cat's primary visual cortex. Journal of Neurophysiology, 71 (1), 347–374.
DeAngelis, G. C., Robson, J. G., Ohzawa, I., & Freeman, R. D. (1992). Organization of suppression in receptive fields of neurons in cat visual cortex. Journal of Neurophysiology, 68 (1), 144–163.
Durand, S., Freeman, T. C., & Carandini, M. (2007). Temporal properties of surround suppression in cat primary visual cortex. Visual Neuroscience, 24 (5), 679–690, doi:10.1017/S0952523807070563.
Foley, J. M. (1994). Human luminance pattern-vision mechanisms: Masking experiments require a new model. Journal of the Optical Society of America A: Optics, Image Science, and Vision, 11 (6), 1710–1719.
Foley, J. M., Varadharajan, S., Koh, C. C., & Farias, M. C. (2007). Detection of Gabor patterns of different sizes, shapes, phases, and eccentricities. Vision Research, 47 (1), 85–107.
Graham, N. V. (1977). Visual detection of aperiodic spatial stimuli by probability summation among narrowband channels. Vision Research, 17, 637–652.
Heeger, D. J. (1992). Normalization of cell responses in cat striate cortex. Visual Neuroscience, 9 (2), 181–197.
Henry, C. A., Joshi, S., Xing, D., Shapley, R. M., & Hawken, M. J. (2013). Functional characterization of extraclassical receptive field in macaque V1: Contrast, orientation and temporal dynamics. Journal of Neuroscience, 33, 6230–6242.
Lauritzen, J. S., Pelah, A., & Tolhurst, D. J. (1999). Perceptual rules for watermarking images: A psychophysical study of the visual basis for digital pattern encryption. Proceedings of the SPIE: Human Vision and Electronic Imaging, 3644, 392–402, doi:10.1117/12.348460.
Legge, G. E., & Foley, J. M. (1980). Contrast masking in human vision. Journal of the Optical Society of America, 70 (12), 1458–1471.
Levitt, J. B., & Lund, J. S. (2002). The spatial extent over which neurons in macaque striate cortex pool visual signals. Visual Neuroscience, 19, 439–452.
Li, B., & Freeman, R. D. (2011). Neurometabolic coupling differs for suppression within and beyond the classical receptive field in visual cortex. Journal of Physiology, 589 (Pt 13), 3175–3190, doi:10.1113/jphysiol.2011.205039.
Li, B., Thompson, J. K., Duong, T., Peterson, M. R., & Freeman, R. D. (2006). Origins of cross-orientation suppression in the visual cortex. Journal of Neurophysiology, 96 (4), 1755–1764, doi:10.1152/jn.00425.2006.
May, K. A., & Solomon, J. A. (2015). Connecting psychophysical performance to neuronal response properties: I. Discrimination of suprathreshold stimuli. Journal of Vision, 15 (6): 8, 1–26, doi:10.1167/15.6.8.
Meese, T. S. (2004). Area summation and masking. Journal of Vision, 4 (10): 8, 930–943, doi:10.1167/4.10.8.
Meese, T. S., Challinor, K. L., Summers, R. J., & Baker, D. H. (2009). Suppression pathways saturate with contrast for parallel surrounds but not superimposed cross-oriented masks. Vision Research, 49 (24), 2927–2935, doi:10.1016/j.visres.2009.09.006.
Petrov, Y., Carandini, M., & McKee, S. (2005). Two distinct mechanisms of suppression in human vision. Journal of Neuroscience, 25 (38), 8704–8707, doi:10.1523/JNEUROSCI.2871-05.2005.
Quick, R. F. (1974). A vector-magnitude model of contrast detection. Biological Cybernetics, 16 (2), 65–67, doi:10.1007/BF00271628.
Robson, J. G., & Graham, N. (1981). Probability summation and regional variation in contrast sensitivity across the visual field. Vision Research, 21 (3), 409–418.
Rohaly, A. M., Ahumada, A. J., Jr., & Watson, A. B. (1997). Object detection in natural backgrounds predicted by discrimination performance and models. Vision Research, 37 (23), 3225–3235.
Sceniak, M. P., Ringach, D. L., Hawken, M. J., & Shapley, R. (1999). Contrast's effect on spatial summation by macaque V1 neurons. Nature Neuroscience, 2, 733–739.
Schallmo, M. P., & Murray, S. O. (2016). Identifying separate components of surround suppression. Journal of Vision, 16 (1): 2, 1–12, doi:10.1167/16.1.2.
Tailby, C., Solomon, S. G., Pierce, J. W., & Metha, A. B. (2007). Two expressions of “surround suppression” in V1 that arise independent of cortical mechanisms of suppression. Visual Neuroscience, 24, 99–109, doi:10.1017/S0952523807070022.
To, M. P. S., Chirimuuta, M., Turnham, E., & Tolhurst, D. J. (2012). Modelling grating and bandpass natural-scene contrast-discrimination dippers. Perception, 41 ECVP Abstract Supplement, 222.
To, M. P. S., Gilchrist, I. D., & Tolhurst, D. J. (2015). Perception of differences in naturalistic dynamic scenes, and a V1-based model. Journal of Vision, 15 (1): 19, 1–13, doi:10.1167/15.1.19.
To, M. P. S., Gilchrist, I. D., Troscianko, T., & Tolhurst, D. J. (2011). Discrimination of natural scenes in central and peripheral vision. Vision Research, 51, 1686–1698.
To, M., Lovell, P. G., Troscianko, T., & Tolhurst, D. J. (2008). Summation of perceptual cues in natural visual scenes. Proceedings of Biological Sciences, 275 (1649), 2299–2308, doi:10.1098/rspb.2008.0692.
To, M. P., Lovell, P. G., Troscianko, T., & Tolhurst, D. J. (2010). Perception of suprathreshold naturalistic changes in colored natural images. Journal of Vision, 10 (4): 12, 1–22, doi:10.1167/10.4.12.
Tolhurst, D. J., & Barfield, L. P. (1978). Interactions between spatial frequency channels. Vision Research, 18 (8), 951–958.
Tolhurst, D. J., Movshon, J. A., & Thompson, I. D. (1981). The dependence of response amplitude and variance of cat visual cortical neurones on stimulus contrast. Experimental Brain Research, 41, 414–419.
Tolhurst, D. J., & Thompson, I. D. (1981). On the variety of spatial frequency selectivities shown by neurons in area 17 of the cat. Proceedings of the Royal Society of London B: Biological Sciences, 213 (1191), 183–199.
Tolhurst, D. J., To, M. P., Chirimuuta, M., Troscianko, T., Chua, P. Y., & Lovell, P. G. (2010). Magnitude of perceived change in natural images may be linearly proportional to differences in neuronal firing rates. Seeing and Perceiving, 23 (4), 349–372.
Watson, A. B., & Ahumada, A. J., Jr. (2005). A standard model for foveal detection of spatial contrast. Journal of Vision, 5 (9): 6, 717–740, doi:10.1167/5.9.6.
Watson, A. B., & Solomon, J. A. (1997). Model of visual contrast gain control and pattern masking. Journal of the Optical Society of America A: Optics, Image Science, and Vision, 14 (9), 2379–2391.
Xing, J., & Heeger, D. J. (2000). Center-surround interactions in foveal and peripheral vision. Vision Research, 40 (22), 3065–3072.
Figure 1
 
Panel (A) schematically presents the structure of a V1 receptive field. The central classical field (green) is a direct summation field that increases neuronal activity when appropriately stimulated, while the surrounding area (red) is a suppressive zone. Stimuli presented here do not affect neuronal firing when presented alone, but they do suppress the responses elicited by simultaneous activation of the classical center. The so-called “surround” probably overlaps with the classical center so that there is an ambiguous overlap zone, which may preclude exact measurement of the dimensions of the classical field. Panel (B) shows a central target patch and a surround mask, both of which are stimuli that are commonly used in psychophysical contrast discrimination experiments. Panel (C) demonstrates how the target and mask are projected onto various small neuronal receptive fields a, b, c, d, and e.
Figure 2
 
Examples of stimulus templates used in the experiments. (A) the small S Gabor target or mask; (B) the large L grating target or mask; (C) an SL example with the L mask at 25% contrast and the added S test at 50%; (D) the in-phase annular mask; (E) the in-phase annular mask at 25% contrast with the added S test at 50%; (F) as in (E) but the annular mask is 180° out of phase.
Figure 3
 
(A) Gray-level representation of the form of a Gabor modeled cosine receptive field optimally responsive to vertical gratings of 2.67 c/°. The field is elongated by a factor of 1.5 along the direction of the sinusoidal stripes. The field is shown as a 256 × 256 pixel “image,” representing 6° square in the model. (B) At the same spatial scale, a gray-level representation of a surround suppressive annulus (Equation 2) for a neuron like (A) with optimal spatial frequency 2.67 c/°; the radius (radf) of the annulus in this example is 1 period (i.e., 1/2.67°) similar to the best-fitting versions of the models.
Figure 4
 
Contrast discrimination dipper data for SL (red circles), SS (green circles), and LL (blue circles) grating configurations, like Meese (2004). The averages of the normalized data from the two observers are shown in the three panels. The contrast is expressed as dB attenuation from the highest contrast available. Given that the mask and test were presented on alternate frames, this highest contrast is a Michelson contrast of 50%. The points shown at −64 dB masking contrast were actually measured without any masker present. (A) The curves are the best-fit parallel version of the V1 model to all 42 data points, including a model of surround suppression (similar to the model of To et al., 2010). Details of all the model fits and the parameter values are given in Table 1. (B) The curves are the best-fit sequential version of the V1 model (our new model) to all 42 data points, including a model of surround suppression. (C) The curves are the best-fit version of a V1 model without any surround suppression.
Figure 5
 
This figure shows all five dipper experiments for grating stimuli, the thresholds and the overall best-fit graphs. (A, C, and E) Contrast dipper data for SL (red circles), SS (green circles), and LL (blue circles) grating configurations replotted from Figure 4. (B, D, and F) A Gabor S test with in-phase masking annulus (orange circles), and with 180° out-of-phase masking annulus (black circles). The curves show models that were best fit to all five dippers (70 data points) at once; they were different versions of the sequential model and they incorporated surround suppression. (A–B) The model had a fixed aspect ratio of 1.5 and used eccentricity weighting. (C–D) The aspect ratio became an extra model parameter, free to vary (it became 1.20; Table 2), and eccentricity weighting was still used. (E–F) Eccentricity weighting was not applied, and aspect ratio was free to vary (it became 1.21).
Figure 6
 
Simulations of the behavior of a single neuron in the best-fit model to sinusoidal gratings. (A) Magnitude of response to gratings of optimal orientation, frequency (one cycle in 16 pixels) and phase as a function of contrast. The line is the Naka-Rushton equation inset. (B) Responses to disks of optimal grating centered on the receptive field of the neuron, as a function of the radius of the disk. Responses at two very low contrasts are shown. Responses are normalized to the biggest response at C = 0.009. The red line is the form of response expected if there had been no surround suppression. (C) Similar to (B) but at two slightly higher contrasts. Responses are normalized within each curve. GSF and surround radius are taken after Cavanaugh et al. (2002).
Table 1
 
The parameters of the least-squares best-fitting V1 models for the graphs shown in Figure 4. Notes: The parameters are found in the equations in the Methods. Contrast sensitivity function (CSF) weight is a parameter that adjusts the sensitivity of all the neurons so that the model dippers slide along the diagonal, in case the observer's CSF is not quite the same as the “typical” weights we have assigned. The number of threshold data points contributing to each fit, the residual sum-of-squares (SSQ) error, and the Akaike Information Criterion (AIC; Equation 7) for each fit are also shown. These three fits were carried out with the omission of the eccentricity weighting step (see text). The fit for Figure 4C has one fewer parameter than Figure 4B because radf was fixed at a value near zero.
Table 2
 
The parameters of the least-squares best-fitting V1 models for the graphs shown in Figure 5. Notes: The number of threshold data points contributing to each fit, the residual sum-of-squares (SSQ) error, and the Akaike Information Criterion (AIC; Equation 7) for each fit are also shown. The fit for Figure 5A and B has one fewer parameter because aspect ratio was fixed at 1.5.