The oldest illusion known to lightness perception is called simultaneous lightness contrast. When two identical gray squares are placed on white and black backgrounds, respectively, the square on the white background appears darker than the one on the black background. The illusion was studied by Chevreul (1839) and described by both Alhazen (1083/1989) and the ancient Greeks.
In the modern era, simultaneous contrast gained significance by defying the early assumption that visual experience corresponds to local stimulation and by showing the role of context. Since the time of Hering (1874/1964; see also Mach, 1922/1959), the illusion has been attributed to “reciprocal interaction in the somatic visual field,” or what was later called lateral inhibition. Recording from a single facet of the Limulus eye, Hartline, Wagner, and Ratliff (1956) showed that when the intensity of light stimulating a facet is held constant, its rate of firing decreases as the intensity of light stimulating adjacent facets increases. The application of this finding to simultaneous contrast is straightforward. Without lateral inhibition, the two gray squares would produce equal rates of firing in the corresponding retinal areas. However, the bright light from the white background is thought to inhibit the firing rates of cells corresponding to the surrounded gray square, causing it to appear darker. This account of simultaneous contrast, which is almost universally presented in textbooks, is featured in two well-known models, those of Cornsweet (1970) and of Jameson and Hurvich (1964). We will refer to these models as contrast theories.
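The textbook account can be illustrated with a minimal sketch of a single unit under subtractive lateral inhibition. The function name `facet_response`, the inhibition weight `k`, and the intensity values are illustrative assumptions, not parameters taken from Hartline et al. or from either contrast theory.

```python
def facet_response(own_intensity, neighbor_intensities, k=0.25):
    """Response of one unit: its own input minus k times the mean
    input of its neighbors, floored at zero (hypothetical rule)."""
    inhibition = k * sum(neighbor_intensities) / len(neighbor_intensities)
    return max(0.0, own_intensity - inhibition)

# Illustrative intensities on an arbitrary 0..1 scale.
GRAY, WHITE, BLACK = 0.5, 1.0, 0.0

# The same gray input, surrounded by bright or by dark neighbors.
gray_on_white = facet_response(GRAY, [WHITE] * 6)
gray_on_black = facet_response(GRAY, [BLACK] * 6)

print(gray_on_white, gray_on_black)
```

With identical gray inputs, the unit with white neighbors responds less than the unit with black neighbors, which is the direction of the illusion.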
A glitch in this story immediately arises because lateral inhibition has a limited spatial reach. Thus, the square on the white background should not appear homogeneous. Its darkening should be most pronounced near its border with the white background. To solve this problem, different writers have suggested that either (1) the square does not really appear homogeneous (Cornsweet, 1970; Davidson, 1968), (2) the rates of firing are averaged within borders, or (3) the lateral inhibition only creates edge signals and the regions between edges are filled in by higher processes.
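The limited-reach problem, and proposal (2) as one remedy, can be sketched on a one-dimensional display. This is a toy illustration under assumed values: the inhibition rule, its `radius` and weight `k`, and the helper names are hypothetical, and borders are simply taken to be the points where luminance changes.

```python
def lateral_inhibition(luminance, radius=3, k=0.5):
    """Each unit's input minus k times the mean input of the units
    within `radius` positions on either side (hypothetical rule)."""
    n = len(luminance)
    out = []
    for i in range(n):
        idx = [j for j in range(max(0, i - radius), min(n, i + radius + 1)) if j != i]
        out.append(luminance[i] - k * sum(luminance[j] for j in idx) / len(idx))
    return out

def average_within_borders(luminance, response):
    """Replace each response by the mean response of its region,
    where a region is a run of constant luminance."""
    out = list(response)
    start = 0
    for i in range(1, len(luminance) + 1):
        if i == len(luminance) or luminance[i] != luminance[start]:
            mean = sum(response[start:i]) / (i - start)
            out[start:i] = [mean] * (i - start)
            start = i
    return out

WHITE, GRAY, BLACK = 1.0, 0.5, 0.0
on_white = [WHITE] * 10 + [GRAY] * 10 + [WHITE] * 10
on_black = [BLACK] * 10 + [GRAY] * 10 + [BLACK] * 10

raw_white = lateral_inhibition(on_white)
raw_black = lateral_inhibition(on_black)

# Units at the border of the gray patch differ between displays,
# but units at its center lie beyond the reach of the background.
edge_white, edge_black = raw_white[10], raw_black[10]
center_white, center_black = raw_white[15], raw_black[15]

# Averaging within borders gives each patch a single, homogeneous value.
avg_white = average_within_borders(on_white, raw_white)[15]
avg_black = average_within_borders(on_black, raw_black)[15]
```

In this sketch the raw responses at the patch centers are equal, so the contrast effect is confined to the borders; only after averaging does the whole patch on white come out lower than the whole patch on black.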
With the advent of the computer came more sophisticated theories of lightness, based on perceptual decomposition of the retinal image (Bergström, 1977; Gilchrist, 1979). These models gave a much stronger account of lightness constancy. However, they were not able to address simultaneous lightness contrast. According to Gilchrist's intrinsic image model (Gilchrist, 1979; Gilchrist, Delman, & Jacobsen, 1983), for example, the two targets in the contrast display should appear identical.
Models based on lateral inhibition have also grown more sophisticated in recent decades. These are not lightness models per se. They are called brightness models because they claim to model the human response to luminance. They generally fail to model images that contain either illumination edges or depth edges. These models incorporate the more modern view that lateral inhibition in humans takes the form of receptive fields, either with on-centers and off-surrounds or vice versa. There are more than a dozen such models. All of them start by taking a difference of Gaussians at multiple scales. The models vary in how the outputs from the filters are combined. Watt and Morgan (1985) combine the outputs from all scales and then apply interpretation rules, whereas Kingdom and Moulden (1992) apply the interpretation rules to each scale before combining scales. Morrone and Burr (1988) use even- and odd-symmetric filters. Kingdom and Moulden use on-center cells only, whereas other models (Pessoa, Mingolla, & Neumann, 1995) use both on- and off-center cells. Some of the models predict illusory brightness scallops near boundaries, even in cases where observers fail to see them. Grossberg and Todorović (1988) and Pessoa et al. (1995) solve this problem by filling in homogeneous regions based on edge signals, whereas Heinemann and Chase (1995) average all activation within boundaries. McArthur and Moulden (1999) have published one of the few 2D models although, as they admit, it has serious failures.
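The shared first stage of these models, a difference of Gaussians applied at several scales, can be sketched in one dimension as follows. The scale values, the 2:1 surround-to-center ratio, and the plain sum across scales are assumptions chosen for illustration; they do not reproduce the combination rule of any of the models cited above.

```python
import math

def gaussian(sigma, half_width):
    """Sampled Gaussian, scaled so that its samples sum to one."""
    g = [math.exp(-(x * x) / (2 * sigma * sigma))
         for x in range(-half_width, half_width + 1)]
    total = sum(g)
    return [v / total for v in g]

def dog_kernel(sigma_center, ratio=2.0):
    """On-center kernel: narrow center Gaussian minus broad surround Gaussian."""
    half_width = int(math.ceil(3 * sigma_center * ratio))
    center = gaussian(sigma_center, half_width)
    surround = gaussian(sigma_center * ratio, half_width)
    return [c - s for c, s in zip(center, surround)]

def filter_signal(signal, kernel):
    """Apply the (symmetric) kernel at every position, replicating end samples."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

# A luminance step: dark region followed by a light region.
step = [0.2] * 64 + [0.8] * 64
scales = [1.0, 2.0, 4.0]  # assumed center sigmas, in samples
responses = {s: filter_signal(step, dog_kernel(s)) for s in scales}

# One hypothetical combination rule: a plain sum across scales.
combined = [sum(responses[s][i] for s in scales) for i in range(len(step))]
```

Because each kernel sums to zero, uniform regions produce no response at any scale and the output is confined to the neighborhood of the edge. This is why such models need a further stage, such as filling-in or averaging within boundaries, to assign a value to homogeneous regions.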
In this paper, we will consider only the oriented difference of Gaussians (ODOG) model of Blakeslee and McCourt (2003), as it seems to be regarded as the best exemplar of this class of models. The ODOG model is 2D and, in addition to the conventional multiple scales, it uses oriented filters with multiple orientations. Each filter consists of a central excitatory region flanked by a pair of inhibitory regions. For each point in the stimulus, the output of every filter, including all scales and all orientations, is summed, with one important qualification: the outputs of each orientation are normalized to the same maximum value. This latter feature allows the model to account, at least qualitatively, for White's (1981) illusion, a kind of reverse simultaneous contrast illusion.
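The processing steps just described can be sketched as follows. This is a simplified illustration of the verbal description only, not the published implementation: the filter shape (an isotropic center Gaussian minus a surround Gaussian elongated 2:1), the two scales, the four orientations, the peak-based normalization, and the unweighted sums are all assumptions, and the function names are hypothetical.

```python
import math

def oriented_dog(sigma, theta, ratio=2.0):
    """Isotropic center Gaussian minus a surround Gaussian elongated along one axis."""
    half = int(math.ceil(3 * sigma * ratio))
    c, s = math.cos(theta), math.sin(theta)
    center, surround = [], []
    for y in range(-half, half + 1):
        crow, srow = [], []
        for x in range(-half, half + 1):
            u = x * c + y * s     # axis along which the inhibitory flanks lie
            v = -x * s + y * c    # axis along which the excitatory center extends
            crow.append(math.exp(-(u * u + v * v) / (2 * sigma ** 2)))
            srow.append(math.exp(-(u * u) / (2 * (ratio * sigma) ** 2)
                                 - (v * v) / (2 * sigma ** 2)))
        center.append(crow)
        surround.append(srow)
    csum = sum(map(sum, center))
    ssum = sum(map(sum, surround))
    # Each Gaussian is scaled to unit volume, so the kernel sums to zero.
    return [[cv / csum - sv / ssum for cv, sv in zip(cr, sr)]
            for cr, sr in zip(center, surround)]

def filter_image(image, kernel):
    """Apply the kernel at every pixel, replicating border pixels."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                row = image[min(max(y + ky - half, 0), h - 1)]
                for kx, wgt in enumerate(krow):
                    acc += wgt * row[min(max(x + kx - half, 0), w - 1)]
            out[y][x] = acc
    return out

def odog_sketch(image, sigmas=(1.0, 1.5), n_orientations=4):
    """Sum scales within each orientation, normalize each orientation
    to a common peak magnitude, then sum across orientations."""
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    for k in range(n_orientations):
        theta = math.pi * k / n_orientations
        orient = [[0.0] * w for _ in range(h)]
        for sigma in sigmas:
            resp = filter_image(image, oriented_dog(sigma, theta))
            for y in range(h):
                for x in range(w):
                    orient[y][x] += resp[y][x]
        peak = max(abs(v) for row in orient for v in row)
        if peak > 1e-12:  # skip orientations with no response beyond rounding noise
            for y in range(h):
                for x in range(w):
                    total[y][x] += orient[y][x] / peak
    return total

# A small test image: vertical luminance step, dark left half and light right half.
step_image = [[0.2] * 8 + [0.8] * 8 for _ in range(12)]
odog_out = odog_sketch(step_image)
```

The normalization step is what distinguishes this scheme from a plain sum of filter outputs: an orientation that responds weakly to the stimulus is scaled up to the same peak as one that responds strongly, so the relative influence of the orientations depends on the image as a whole rather than on local contrast alone.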