In classical assimilation effects, intermediate-luminance patches appear lighter when their immediate surround consists of white patches and darker when their immediate surround consists of dark patches. With patches either darker or lighter than both inducing patches, the direction of the brightness effect is reversed, and the effect is termed the “inverted assimilation effect.” Several explanations and models have been suggested: some are relevant only to a specific stimulus geometry, some rely on anchoring theory, and others involve high-level cortical processing (such as scission). None of these studies predicted the various types of assimilation effects together with their inverted effects. We suggest here a compound brightness model based on contrast–contrast induction (a second-order adaptation mechanism). The suggested model predicts the various types of brightness assimilation effects and their inverted effects. The model is composed of three main stages: (1) composing post-retinal second-order opponent receptive fields, (2) calculating local and remote contrast, and (3) adaptation of the second order (contrast–contrast induction). We also utilize a variation of the Jacobi iteration process to enable elegant edge integration in order to evaluate the model's performance.

- Anderson's (1997) suggestion that the mechanism of visual “scission” treats the gray lines as separate transparent layers;

$f_c(x - x_0,\, y - y_0)$ (Shapley & Enroth-Cugell, 1984):

$L_{\text{photo-r}}$ at locations $x_0, y_0$ represents the response of the center area of the SORF, which is centered at location $x_0, y_0$. The following equations are similarly expressed, but for the sake of simplicity, $x_0$ and $y_0$ are set to zero in them: $x_0 = y_0 = 0$.

$k$ is an index that represents the specific spatial resolution, where $k = 1$ is the finest resolution.

$f_c$ is defined as:

$\rho$ represents the radius of the center region of the receptive field. The “surround” signals are similarly defined, with a spatial weight function three times larger in diameter than that of the “center.”

$f_s$ is defined as a decaying Gaussian over the surround region:

$f_c$ and $f_s$ is 1.

$W^{k}$ is a non-linear spatial weight function that also depends on the local opponent response, $L_{\text{sorf}}^{k}$ (Equation 7). Because opponent signals (Equation 5) perform edge detection at different resolutions, they yield a signal only at locations adjacent to edges. A plain local average of such signals would therefore give an erroneously low value of average contrast, since it would always include the low values that come from locations not adjacent to edges. To overcome this obstacle, we propose that within this local area, higher SORF responses receive higher weights when their contribution to the local contrast is calculated.
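The argument for a response-dependent weight can be illustrated with a minimal one-dimensional sketch. The exact form of $W^{k}$ (Equation 7) is not reproduced in this text, so the weight used here (a spatial Gaussian multiplied by the response magnitude) is only an assumption, and the function name `local_contrast` and the toy signal are hypothetical.

```python
import math

def local_contrast(sorf, center, rho, weighted=True):
    """Average |SORF| response around `center` in a 1-D signal.

    weighted=False: plain Gaussian-weighted average.
    weighted=True:  the Gaussian weight is multiplied by the response
                    magnitude (an ASSUMED stand-in for the paper's W^k).
    """
    num = 0.0
    den = 0.0
    for i, r in enumerate(sorf):
        g = math.exp(-((i - center) / rho) ** 2)   # spatial Gaussian decay
        w = g * abs(r) if weighted else g          # assumed non-linear weight
        num += w * abs(r)
        den += w
    return num / den if den > 0 else 0.0

# Edge-detector-like signal: non-zero only next to two edges.
sorf = [0.0] * 21
sorf[5], sorf[6] = 0.5, -0.5
sorf[14], sorf[15] = 0.5, -0.5

plain = local_contrast(sorf, center=10, rho=5.0, weighted=False)
weighted = local_contrast(sorf, center=10, rho=5.0, weighted=True)
print(plain, weighted)
```

In this toy case every edge response has magnitude 0.5, so the response-weighted average recovers 0.5, while the plain average is pulled well below that by the many zero-response locations between the edges.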

$C_{\text{remote}}$) represents the average contrast in this remote area. The remote signal is defined as an inner product of the local contrast response ($C_{\text{local}}$, Equation 6) at each location of the remote area with a non-linear decaying Gaussian weight function ($W_c$) (or as a convolution for all spatial locations). This weight function obtains higher values for higher local contrast values (as was also explained for the local contrast calculations, Equation 6).

$\beta$ is a constant that determines the degree of gain. The response of each SORF cell after adaptation is therefore:

$k = 1$), this central area is simulated by a single pixel. Accordingly, the “center” size for the finest grid was chosen as 1 pixel, i.e., $L_{\text{cen}}$ (Equation 1) is equal to the original image luminance (at each spatial location). For other $k$ values, the center size was chosen as a diameter of $k$ pixels. The “surround” diameter (Equations 3 and 4) was chosen as 3 times larger than the center diameter (Shapley & Enroth-Cugell, 1984). This ratio was applied for all $k$ resolutions. The “center” and “surround” calculations were applied, and the $L_{\text{sorf}}^{k}$ calculation (Equation 5) was carried out as a convolution of the original luminance values with a filter that represents the subtraction of the center and surround contributions. For example, when $k = 1$ the convolution kernel is:

$k = 1$ is presented in Figure 2. As expected, responses are obtained only adjacent to the edges, where the background stripes (black and white in the original image; Figure 1) yield a higher “gradient” response (blue and red stripes in Figure 2) than the responses to the stripes along the location of the “test” stripes (cyan and yellow in Figure 2).
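The edge-only character of the opponent response can be checked with a small one-dimensional sketch. It assumes Gaussian center and surround weights that each sum to 1, a surround three times the center diameter, and replicated borders; the helper names (`gaussian_weights`, `sorf_kernel`, `convolve`) and the resulting kernel values are illustrative assumptions, not the kernel used in the paper.

```python
import math

def gaussian_weights(diameter):
    """Normalized Gaussian weights over `diameter` pixels (diameter odd)."""
    half = diameter // 2
    rho = diameter / 2.0
    w = [math.exp(-(d / rho) ** 2) for d in range(-half, half + 1)]
    total = sum(w)
    return [v / total for v in w]              # weights sum to 1

def sorf_kernel(k):
    """Center-minus-surround kernel; surround diameter is 3x the center's."""
    center = gaussian_weights(k)
    surround = gaussian_weights(3 * k)
    pad = (len(surround) - len(center)) // 2
    center = [0.0] * pad + center + [0.0] * pad
    return [c - s for c, s in zip(center, surround)]

def convolve(signal, kernel):
    """Same-size filtering with replicated borders (kernel is symmetric)."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - half, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

luminance = [0.0] * 8 + [1.0] * 8              # a single dark-to-light edge
response = convolve(luminance, sorf_kernel(1))
edge_locations = [i for i, r in enumerate(response) if abs(r) > 1e-9]
print(edge_locations)
```

Because the kernel weights sum to zero, uniform regions give no response, and only the two pixels flanking the edge carry a signal, with opposite signs.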

$C_{\text{local}}$ (Equation 6) was performed with a $\rho_{\text{local}}$ diameter of 10 pixels for all resolutions. In order to enable the influence of more distant edges, the cutoff of this calculation was taken at a diameter of $5\rho_{\text{local}}$. This large spatial range of cutoff is required because of the significant weight given to high contrast values even at distant locations (Equation 7). The resolutions were calculated with $k = 1, 3, 5, 7$ (model, Equations 1–5).

$C_{\text{local}}$) of the White's effect is demonstrated in Figure 3. The figure clearly shows the different local contrast regions. The white background represents the high background contrast, while the two gray squares represent the lower-contrast regions that contain the gray stripes. Note that the contrasts of the two squares differ from each other because the value of the gray stripe is not the median of the values of the white and the black stripes.

$C_{\text{remote}}$ (Equation 8) was performed with a $\rho_{\text{remote}}$ diameter of 10 pixels for all resolutions. The cutoff of this calculation was taken as a diameter of $15\rho_{\text{remote}}$. This chosen “remote” size is also within the range of the reported electrophysiological findings (Creutzfeldt, Crook, et al., 1991; Creutzfeldt, Kastner, et al., 1991). The intermediate result of this remote contrast is demonstrated in Figure 4. It can be seen that the borders of the squares no longer exist, owing to the larger spatial weight function. Note that the remote contrast is calculated over the local contrast response (Equation 6).

$C_{\text{remote}}$ signals have a smaller response range than the local signals (Figure 4) because of the large integration area of the $C_{\text{remote}}$ signal. Since the $C_{\text{remote}}$ calculation in the “test” region also includes the background contrast (the white area in Figure 4), the resulting $C_{\text{remote}}$ signal is an average of the contrasts of both the “test” and “background” regions. Consequently, the $C_{\text{remote}}$ signal in the “test” area obtains higher values than the local contrast in the same “test” area. In other cases (such as in the inverted White's effect; Figure 11), the $C_{\text{remote}}$ signal of the “test” area can obtain lower values than the local contrast signal in the same “test” area.

$L_{\text{sorf-adapted}}$, are then calculated (Equations 10–11) with $\beta = 0.4$. A smaller value of $\beta$ causes a stronger contrast–contrast induction.
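The role of $\beta$ can be illustrated numerically. Equations 10 and 11 are not reproduced in this text, so the sketch below assumes a ratio-type gain, $(C_{\text{local}} + \beta) / (C_{\text{remote}} + \beta)$, applied multiplicatively to the SORF response. This form is an assumption chosen because it is consistent with the behavior described here (suppression when the remote contrast exceeds the local contrast, facilitation in the opposite case, and a stronger effect for smaller $\beta$); the function names and the contrast values are hypothetical.

```python
def gain_factor(c_local, c_remote, beta):
    """ASSUMED gain form: (C_local + beta) / (C_remote + beta)."""
    return (c_local + beta) / (c_remote + beta)

def adapt(l_sorf, c_local, c_remote, beta=0.4):
    """Adapted SORF response under the assumed multiplicative gain."""
    return gain_factor(c_local, c_remote, beta) * l_sorf

# "Test" area where the remote contrast exceeds the local contrast:
suppress_default = gain_factor(0.2, 0.5, beta=0.4)   # gain below 1
suppress_small = gain_factor(0.2, 0.5, beta=0.1)     # further below 1

# Opposite case, remote contrast lower than local contrast:
facilitate = gain_factor(0.5, 0.2, beta=0.4)         # gain above 1

print(suppress_default, suppress_small, facilitate)
```

Under this assumed form, a gain below 1 corresponds to contrast suppression and a gain above 1 to contrast facilitation, and lowering $\beta$ moves the gain further from 1.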

$k = 1$, along a vertical line located at $x = 300$ (yellow vertical line in Figure 7A) in Figure 5. The adapted SORF, $L_{\text{sorf-adapted}}$, presents the perceived local DOG (Equation 11). Therefore, suppression or facilitation of these responses, in comparison to the original DOG ($L_{\text{op}}$), presents contrast suppression (assimilation effects) or contrast facilitation (inverted assimilation effects), respectively. Figure 5 demonstrates the contrast suppression in the “test” area, which predicts the assimilation effect. In order to transform these perceived values, $L_{\text{sorf-adapted}}$, into perceived luminance values, $L_{\text{per}}$, a further inverse function has to be performed.

$L_{\text{per}}$) to reflect the perceived luminance values. These values are presented as perceived images (Figures 7–17, results). Since the gain factor (Equation 10) is uniform over all resolutions, it is plausible to demonstrate the inverse function only through the finest resolution ($k = 1$). Since this stage is presented mainly for evaluation of the model, different numerical approaches could be applied at this stage.

$L_{\text{sorf-adapted}}$) is approximately a convolution of the above kernel (Equation 13) with $L_{\text{per}}$; therefore,

$n, m$ represent the location index. An extraction of $L_{\text{per}}$ would yield an iterative function:

$i$ represents the $i$th iteration. This is, in fact, a variation of the Jacobi method (Quarteroni & Valli, 1994).

$L_{\text{per}}^{(0)}$, in the inverse function according to the variation of the Jacobi method, was chosen as the original image at all locations, for this example. After applying Equation 15, the estimated perceived luminance iterations $L_{\text{per}}^{(1)}$, $L_{\text{per}}^{(100)}$, and $L_{\text{per}}^{(1000)}$ are obtained and shown in Figure 6. After 100 iterations (Figure 6B), the process starts to converge. Iteration 1000 (Figure 6C) shows the predicted luminance image. At this stage, we emphasize the feasibility of the model and not the efficiency of this inverse procedure.
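The idea of inverting a center-surround response by Jacobi-type iteration can be sketched in one dimension. Equations 13 and 15 are not reproduced in this text, so the kernel below (center weight 1, two neighbors at −0.5) and the fixed end values are assumptions made for illustration only; `jacobi_invert` and the toy signal are hypothetical names and data.

```python
def jacobi_invert(b, x0, k_center, k_side, iterations):
    """Jacobi iteration for k_center*x[n] + k_side*(x[n-1] + x[n+1]) = b[n].

    Only interior samples are updated; the two end samples keep the values
    given in x0 (an assumed boundary condition).
    """
    x = list(x0)
    for _ in range(iterations):
        new = list(x)
        for n in range(1, len(x) - 1):
            # Solve the n-th equation for x[n] using the previous iterate.
            new[n] = (b[n] - k_side * (x[n - 1] + x[n + 1])) / k_center
        x = new
    return x

K_CENTER, K_SIDE = 1.0, -0.5                   # assumed 1-D kernel
target = [0.2] * 4 + [0.6] * 4 + [0.2] * 4     # luminance to be recovered

# Forward step: center-surround response of the target (ends unused).
b = [0.0] * len(target)
for n in range(1, len(target) - 1):
    b[n] = K_CENTER * target[n] + K_SIDE * (target[n - 1] + target[n + 1])

# Initial guess: correct end values, flat gray elsewhere.
guess = [target[0]] + [0.5] * (len(target) - 2) + [target[-1]]

def max_error(x):
    return max(abs(a - t) for a, t in zip(x, target))

err_1 = max_error(jacobi_invert(b, guess, K_CENTER, K_SIDE, 1))
err_100 = max_error(jacobi_invert(b, guess, K_CENTER, K_SIDE, 100))
err_1000 = max_error(jacobi_invert(b, guess, K_CENTER, K_SIDE, 1000))
print(err_1, err_100, err_1000)
```

In this toy problem the error shrinks slowly at first and becomes negligible by 1000 iterations, which mirrors the qualitative progression described for Figure 6. Convergence here relies on the fixed end values, which is part of the assumed setup.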

$L_{\text{sorf-adapted}}$ (Equation 11) to a luminance image. This method can be compared to the filling-in methods previously suggested as mechanisms of assimilation effects (Grossberg, 2003; Spehar et al., 2002). As far as we know, the previously used filling-in methods, which relate to modeling psychophysical effects (such as assimilation and Kanizsa effects), cannot be applied to real images. The modified Jacobi method (Quarteroni & Valli, 1994) used here (Equation 15) enables this application, and with fewer arbitrary constraints. It should be mentioned that previous models that applied filling-in components used these components as part of the model. In the presented model, however, the filling-in component is used only as part of the inverse function, in order to transform the “gradient” image into a luminance image.

$L_{\text{sorf-adapted}}$ in Equation 15).

*Spatial vision*. New York: Oxford University Press.