As Watson et al. (1997) point out, visual masking has traditionally been considered from two perspectives. Masking using visual noise is assumed to arise because the noise increases the variance at the decision variable (output stage) of the detection process (Pelli, 1981; Pelli & Farrell, 1999). Thus, the masking is late acting and within channel. Masking using narrowband gratings, on the other hand, has been interpreted in terms of a contrast gain control (Heeger, 1992) in which populations of neurons tuned to different spatial characteristics inhibit each other. Here, masking occurs between channels, perhaps at a much earlier stage, or stages (Baker, Meese, & Summers, 2007; Freeman, Durand, Kiper, & Carandini, 2002). The gain control approach has been used extensively in grating detection (Foley, 1994), image processing (Rohaly, Ahumada, & Watson, 1997), and neural coding (Chirimuuta & Tolhurst, 2005) paradigms.
Because grating stimuli that are distant in the Fourier domain are not believed to activate common detecting mechanisms, the noise paradigm cannot accommodate cross-channel masking data: a distant grating mask will not affect the decision variable. However, noise masks and grating masks can both be incorporated into the gain control framework. Narrowband gratings strongly activate a single inhibitory mechanism, whereas noise masks activate many mechanisms, each more weakly, causing suppression after a linear pooling process (Holmes & Meese, 2004).
The two main effects described by the LAM can be easily accommodated by a widely used gain control equation (Foley, 1994), in which C is the input (signal) contrast, M is the mask contrast (be it noise or a grating), Z is a saturation constant, w is a weight determining the impact of the mask, and p and q determine the properties of the nonlinear transducer function. Detection threshold is reached when the increase in model response caused by adding the test exceeds a criterion value, given by an additional parameter k. Although the model has several parameters (p, q, Z, w, and k), in practice some of these can be fixed at commonly used values. Here, we constrain the exponents to values used by Legge and Foley (1980), who first proposed a model of this form (p = 2.4, q = 2).
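The threshold rule described above can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: the equation itself is not reproduced in this text, so the code assumes the commonly used form resp = C^p / (Z + C^q + wM^q), which is consistent with the parameter definitions given. The parameter values (Z = 5, w = 0.1, k = 0.2, contrasts in percent) and the function names are illustrative assumptions, not fitted values.

```python
# Assumed form of the gain control equation (not reproduced in the text):
#   resp = C**p / (Z + C**q + w * M**q)
P_EXP, Q_EXP = 2.4, 2.0  # exponents fixed as in the text (Legge & Foley, 1980)


def response(c, m, z, w):
    """Model response to test contrast c in the presence of mask contrast m."""
    return c ** P_EXP / (z + c ** Q_EXP + w * m ** Q_EXP)


def detection_threshold(m, z, w, k, lo=1e-6, hi=1e6, iters=200):
    """Test contrast at which the response reaches the criterion k.

    In this assumed form the response to the mask alone is zero, so the
    increase caused by adding the test equals response(c, m, z, w). The
    response rises monotonically with c for these exponents, so bisection
    on a logarithmic contrast axis converges on the threshold.
    """
    for _ in range(iters):
        mid = (lo * hi) ** 0.5  # geometric midpoint
        if response(mid, m, z, w) < k:
            lo = mid
        else:
            hi = mid
    return hi


# Illustrative (not fitted) parameter values; contrasts in percent.
Z, W, K = 5.0, 0.1, 0.2
unmasked = detection_threshold(0.0, Z, W, K)
masked = detection_threshold(20.0, Z, W, K)
print(f"threshold without mask: {unmasked:.2f}%, with a 20% mask: {masked:.2f}%")
```

Bisection is used here only to keep the sketch dependency-free; any bracketing root finder would serve equally well.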
With 3 free parameters (Z, w, and k), the model provides a good fit to the foveal data (Figure 9). Increasing Z results in an increase in detection threshold and converging masking functions, as seen for the peripheral data. This is the same behavior caused by an increase in N_int in the LAM. However, rather than reflecting internal noise (as in the LAM), Z most likely reflects a physiological property of the detecting neurons: it corresponds loosely to the semisaturation constant in the Naka–Rushton equation (Naka & Rushton, 1966), much favored by single cell physiologists.
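The convergence produced by increasing Z can be illustrated numerically. The sketch below assumes the gain control form resp = C^p / (Z + C^q + wM^q), which is not reproduced in this text, and uses hypothetical values throughout (Z = 5 and Z = 50 standing in for two viewing conditions, w = 0.1, k = 0.2, contrasts in percent). It reports the ratio of the two thresholds at several mask contrasts.

```python
P_EXP, Q_EXP = 2.4, 2.0  # exponents fixed as in the text


def threshold(m, z, w=0.1, k=0.2, lo=1e-6, hi=1e6, iters=200):
    """Test contrast at which the assumed response C**p / (Z + C**q + w*M**q)
    reaches the criterion k, found by bisection on a log contrast axis."""
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        resp = mid ** P_EXP / (z + mid ** Q_EXP + w * m ** Q_EXP)
        if resp < k:
            lo = mid
        else:
            hi = mid
    return hi


Z_LOW, Z_HIGH = 5.0, 50.0  # hypothetical saturation constants
ratios = {m: threshold(m, Z_HIGH) / threshold(m, Z_LOW) for m in (0.0, 10.0, 100.0)}
for m, r in ratios.items():
    print(f"mask {m:5.1f}%: threshold ratio (high Z / low Z) = {r:.2f}")
```

Under these assumptions the ratio is largest with no mask and approaches 1 at high mask contrasts, where the wM^q term dominates Z in the denominator, which is the convergence described in the text.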
Decreasing the sensitivity parameter (B) in the LAM raises thresholds and produces parallel, not convergent, masking functions. Interestingly, a comparable effect can also be achieved in the gain control model by varying the threshold criterion, k. It is noteworthy that k is thought to be proportional to the variance of late additive (internal) noise in the gain control model, yet produces very different behavior from the internal noise parameter (N_int) in the LAM.
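The near-parallel shift produced by changing k can be checked in the same way. This sketch again assumes the form resp = C^p / (Z + C^q + wM^q), which is not reproduced in this text, and uses hypothetical values (Z = 5, w = 0.1, k = 0.2 versus k = 0.4, contrasts in percent). On log axes, a threshold ratio that is roughly constant across mask contrasts corresponds to a roughly parallel vertical shift.

```python
P_EXP, Q_EXP = 2.4, 2.0  # exponents fixed as in the text


def threshold(m, k, z=5.0, w=0.1, lo=1e-6, hi=1e6, iters=200):
    """Test contrast at which the assumed response C**p / (Z + C**q + w*M**q)
    reaches the criterion k, found by bisection on a log contrast axis."""
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        resp = mid ** P_EXP / (z + mid ** Q_EXP + w * m ** Q_EXP)
        if resp < k:
            lo = mid
        else:
            hi = mid
    return hi


K_LOW, K_HIGH = 0.2, 0.4  # hypothetical criterion values
k_ratios = {m: threshold(m, K_HIGH) / threshold(m, K_LOW) for m in (0.0, 10.0, 100.0)}
for m, r in k_ratios.items():
    print(f"mask {m:5.1f}%: threshold ratio (high k / low k) = {r:.2f}")
```

With these values the ratio varies only slightly with mask contrast, in contrast to the strongly mask-dependent ratio obtained when Z is changed.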
Finally, by varying w, the masking function can be shifted laterally on the contrast axis. This corresponds roughly to a “mismatched perceptual template” in more elaborate forms of the LAM (Huang, Tao, Zhou, & Lu, 2007; Lu & Dosher, 1999).
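The lateral shift follows directly if the mask enters the model only through the product wM^q, as it does in the assumed form resp = C^p / (Z + C^q + wM^q) (not reproduced in this text). Multiplying w by a factor a is then equivalent to multiplying every mask contrast by a^(1/q). The sketch below checks this identity numerically with hypothetical parameter values.

```python
P_EXP, Q_EXP = 2.4, 2.0  # exponents fixed as in the text


def threshold(m, w, z=5.0, k=0.2, lo=1e-6, hi=1e6, iters=200):
    """Test contrast at which the assumed response C**p / (Z + C**q + w*M**q)
    reaches the criterion k, found by bisection on a log contrast axis."""
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        resp = mid ** P_EXP / (z + mid ** Q_EXP + w * m ** Q_EXP)
        if resp < k:
            lo = mid
        else:
            hi = mid
    return hi


W_BASE, FACTOR = 0.1, 4.0          # hypothetical weight and scaling factor
shift = FACTOR ** (1.0 / Q_EXP)    # equivalent scaling of the mask contrast axis
pairs = [(threshold(m, W_BASE * FACTOR), threshold(m * shift, W_BASE))
         for m in (5.0, 20.0, 80.0)]
for scaled_w, scaled_m in pairs:
    print(f"w scaled: {scaled_w:.4f}%   mask scaled: {scaled_m:.4f}%")
```

Each pair of thresholds agrees to within numerical precision, so the whole masking function slides along the (log) mask contrast axis without changing shape.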
It is clear that models of the gain control form are well able to describe noise masking data. Furthermore, they offer a more accurate representation of the contrast transducer that produces the familiar within-channel dipper function for contrast discrimination (Legge & Foley, 1980). Dipper functions are shifted upwards and to the right by cross-channel grating masks (Foley, 1994), and the same behavior has also been shown using broadband noise masks (Henning & Wichmann, 2007; Pelli, 1981, 1985). The gain control model accommodates both of these findings.
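A dipper function can be generated from the same transducer by asking what increment must be added to a pedestal for the response to rise by k. The sketch below assumes the form resp = C^p / (Z + C^q + wM^q), which is not reproduced in this text, treats the pedestal as part of the within-channel contrast C, and uses hypothetical parameter values (Z = 5, k = 0.2, contrasts in percent, no cross-channel mask).

```python
P_EXP, Q_EXP = 2.4, 2.0  # exponents fixed as in the text


def response(c, z=5.0, w=0.1, m=0.0):
    """Assumed gain control response: C**p / (Z + C**q + w * M**q)."""
    return c ** P_EXP / (z + c ** Q_EXP + w * m ** Q_EXP)


def increment_threshold(pedestal, k=0.2, lo=1e-6, hi=1e6, iters=200, **model):
    """Contrast increment that raises the response to the pedestal by k."""
    base = response(pedestal, **model)
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        if response(pedestal + mid, **model) - base < k:
            lo = mid
        else:
            hi = mid
    return hi


detection = increment_threshold(0.0)   # no pedestal: plain detection
dip = increment_threshold(1.0)         # pedestal near detection threshold
handle = increment_threshold(30.0)     # high-contrast pedestal
print(f"detection {detection:.2f}%, dip {dip:.2f}%, handle {handle:.2f}%")
```

With these values the increment threshold falls below the detection threshold for a low pedestal (facilitation) and rises above it for a high pedestal (masking), which is the dipper shape. Passing a nonzero `m` adds a cross-channel mask to the denominator.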
The main practical use of the equivalent noise approach, and models derived from it, has been to ascribe a difference in thresholds to either a difference in sensitivity, or a difference in internal noise. From this analysis, we have concluded that the difference in thresholds with peripheral viewing is due to a change in internal noise level. However, in the gain control model, we can produce similar behavior by changing the value of Z. As this is merely a model parameter, can it be said to have any useful meaning?
The answer is yes. Let us assume that the value of Z increases with eccentricity. This will produce an increase in thresholds, as seen in Part I (Figure 2). However, due to the compressive nature of the nonlinearity at high contrasts, there will be little change in output with eccentricity at suprathreshold levels, as shown by the model response functions in Figure 10A at low and high input contrasts (dashed line vs. solid line at the starred location).
This reflects an empirically well-established phenomenon known as contrast constancy, which occurs over changes in both spatial frequency and eccentricity (Cannon, 1985; Georgeson & Sullivan, 1975). The gain control model can thus describe both the increase in detection threshold and the fidelity of high contrast stimuli at different eccentricities or spatial frequencies by varying a single parameter, Z (for spatial frequency, this most likely only applies to lower frequencies, where attenuation from optical factors is unimportant, i.e., below 10 cpd; Williams, Brainard, McMahon, & Navarro, 1994).
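The contrast constancy argument can be made concrete by comparing model responses for two values of Z at a low and a high input contrast. This is a minimal sketch assuming the form resp = C^p / (Z + C^q) with no mask, which is consistent with the definitions in the text but not reproduced here; the Z values and contrasts (in percent) are hypothetical.

```python
P_EXP, Q_EXP = 2.4, 2.0  # exponents fixed as in the text


def response(c, z):
    """Assumed unmasked gain control response: C**p / (Z + C**q)."""
    return c ** P_EXP / (z + c ** Q_EXP)


Z_LOW, Z_HIGH = 5.0, 50.0          # hypothetical: e.g., fovea vs. periphery
LOW_C, HIGH_C = 2.0, 50.0          # hypothetical input contrasts (percent)

# Response at the larger Z relative to the smaller Z, at each contrast.
ratio_low = response(LOW_C, Z_HIGH) / response(LOW_C, Z_LOW)
ratio_high = response(HIGH_C, Z_HIGH) / response(HIGH_C, Z_LOW)
print(f"relative response at {LOW_C}%: {ratio_low:.2f}")
print(f"relative response at {HIGH_C}%: {ratio_high:.2f}")
```

At low contrast the larger Z strongly reduces the response, whereas at high contrast the C^q term dominates the denominator and the two responses are nearly equal.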
Figure 10B demonstrates this property of the model, which was originally described by Cannon and Fullenkamp (1991). The CSF was determined by estimating values of Z that produce detection thresholds at a range of spatial frequencies (based on the data of MAG from Georgeson & Sullivan, 1975). We then used the model to generate responses at 7 contrast levels for a single Z value and calculated the matching contrasts at each spatial frequency (estimated Z value) that produced the same magnitude of response. This produced the familiar flattening of matching functions at high contrasts, observed for both spatial frequency and peripheral viewing (Cannon, 1985).
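The matching procedure can be sketched as follows. The code assumes the unmasked form resp = C^p / (Z + C^q), which is not reproduced in this text, and uses hypothetical values for the reference Z, the comparison Z values, and the 7 reference contrasts (in percent); it does not use the Z values estimated from the Georgeson and Sullivan (1975) data.

```python
P_EXP, Q_EXP = 2.4, 2.0  # exponents fixed as in the text


def response(c, z):
    """Assumed unmasked gain control response: C**p / (Z + C**q)."""
    return c ** P_EXP / (z + c ** Q_EXP)


def matching_contrast(c_ref, z_ref, z, lo=1e-6, hi=1e6, iters=200):
    """Contrast that, with saturation constant z, produces the same response
    as c_ref does with z_ref (bisection on a log contrast axis)."""
    target = response(c_ref, z_ref)
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        if response(mid, z) < target:
            lo = mid
        else:
            hi = mid
    return hi


Z_REF = 5.0                                   # hypothetical reference condition
Z_VALUES = (5.0, 20.0, 50.0)                  # hypothetical comparison conditions
REF_CONTRASTS = (2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 90.0)  # 7 hypothetical levels

matches = {c: [matching_contrast(c, Z_REF, z) for z in Z_VALUES]
           for c in REF_CONTRASTS}
for c, row in matches.items():
    print(f"ref {c:5.1f}%: " + "  ".join(f"{v:6.2f}" for v in row))
```

Under these assumptions the matching contrasts differ widely across Z values at the lowest reference contrast but are nearly identical at the highest, reproducing the flattening of the matching functions described above.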
This account also gives some insight into the behavior of 2nd order mechanisms. If the output of 1st order mechanisms, which is governed by Equation 3, forms the input to 2nd order mechanisms, then performance should be largely unaffected by changes in the detectability of the carrier at low contrasts, once it is above threshold and in the compressive region of the transducer response curve (the star in Figure 10A). Thus, second order sensitivity could be determined by constraints other than that of absolute sensitivity to the carrier, as we have found here empirically at fine spatial scales.