As stated in the Introduction, we want to derive a method that matches induction effects among screens of different size, not a method that estimates induction effects and their appearance. This distinction is very relevant, and it is analogous to the way colorimetry and color spaces allow us to determine quite accurately when two colors are perceived as different or the same, while being unable to tell us the perceived appearance of said colors, since many external factors play a role in appearance. We must remark, though, that this approach contrasts with that of works like (Bertalmío et al., 2020), where the output of the algorithm explicitly simulated the appearance.
Let’s say that we have a color appearance model \(M\) that is invertible and capable of reproducing induction effects. We consider two viewing scenarios A and B in which the same image stimulus \(I\) is presented on a display, with identical viewing conditions except that the screen in A has a different size than the screen in B. In this study, we isolate the viewing angle and its effect on color perception; it is well known that other viewing parameters, such as ambient illumination, screen luminance and dynamic range, display color gamut, and so on, may have a significant impact on perception, but the usual practice in the literature, given the challenges in modeling vision, is to vary one of these elements while keeping the others fixed. The model \(M\) predicts for image \(I\) an appearance \(M_A(I)\) in scenario A and an appearance \(M_B(I)\) in scenario B. These appearances will be different because \(M_A\) and \(M_B\) are two instances of model \(M\) that will generally have different parameter values. The reason for this is that, as mentioned in a previous section, neural processes adapt to the scene statistics, where by scene we mean the whole field of view, a part of which is the screen where the image stimulus is displayed. Therefore, different viewing angles will result in different scenes, consequently yielding different adaptation processes. In fact, in linear-nonlinear (L+NL) models of vision (and the model \(M\) we will be proposing shortly will be of this kind), adaptation is actually defined as the change of the model parameters when the input changes, and the full-view scenes in A and B provide different inputs to the visual system because the viewing angle of the screen is different.
Then, our induction matching goal can be expressed as determining the parameters for the compensation method \(C= M_B^{-1} \cdot M_A\), because when the preprocessed image \(C(I)\) is shown on screen B its appearance, including induction effects, will be \(M_B(C(I))=M_B \cdot M_B^{-1} \cdot M_A (I)=M_A(I)\), that is, the same as if the image were seen on screen A. In short, having an invertible appearance model \(M\) for induction gives us an explicit analytical expression for \(C\), whose parameter values can then be found so that they match psychophysical data. Furthermore, and very importantly, we do not need to optimize \(M\) so that it accurately predicts induction effects in image appearance, which is a very challenging open problem: we just need to optimize \(C\) so that the induction effects match in the two conditions.
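As an illustration, the composition above can be written as a minimal Python sketch; `M_A_forward` and `M_B_inverse` are hypothetical placeholders for the two model instances, which are made concrete at the end of this section:

```python
def compensate(I, M_A_forward, M_B_inverse):
    """Induction compensation C = M_B^{-1} . M_A (a sketch).

    M_A_forward : callable, forward appearance model instance for scenario A.
    M_B_inverse : callable, inverse appearance model instance for scenario B.
    """
    appearance_A = M_A_forward(I)      # M_A(I): the target appearance on screen A
    return M_B_inverse(appearance_A)   # C(I): shown on screen B, it appears as M_A(I)
```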
This implies, however, that neither the LHEI model nor any of the color induction models in the literature (e.g., Otazu et al., 2010; Song et al., 2019) can be used for our induction compensation goal, as they are not invertible. In what follows, we show how to modify the LHEI model so as to make it invertible.
In Kim, Batard, and Bertalmío (2016), the authors went back to the retinal models that were updated and analyzed in Yeonan-Kim and Bertalmío (2016b), studied their most essential elements, and produced the simplest possible form of equations modeling the retinal feedback system that are nonetheless capable of predicting a number of significant contrast perception phenomena, such as brightness induction (assimilation and contrast) and the band-pass form of the contrast sensitivity function. These equations form a system of partial differential equations that minimize an energy functional closely related to that of the LHE method of Bertalmío et al. (2007), but where the absolute value function in the second term of Equation 1 is raised to the power of two. This has the effect of regularizing the functional, making it convex, so that its minimum can be computed with a single convolution, whereas the functional in Bertalmío et al. (2007) is non-convex and, as a consequence, its minimum has to be found by iterating the gradient descent equation. If we follow this approach to modify the contrast term of the energy functional associated with the LHEI method (Equation 6), we obtain:
\begin{eqnarray}
\int _{\Omega ^2} K_c(x,y)(J(x)-J(y))^2dxdy , \quad
\end{eqnarray}
where, as usual, \(\Omega\) is the rectangular domain of the image that is displayed (i.e., not the whole field of view).
With this modification, the gradient descent equation previously shown in Equation 5 now becomes:
\begin{eqnarray}
J_t(x)&=&-\,\alpha (J(x)-K_m*J(x)) \nonumber\\
&&+\,\gamma \int _\Omega K_c(x,y)(J(x)-J(y))dy
\nonumber\\
&&-\,\beta (J(x)-J_0(x))\end{eqnarray}
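For concreteness, one explicit Euler step of this equation could be discretized as in the following sketch, which assumes that \(K_m\) and \(K_c\) are normalized to unit sum (so that the contrast integral reduces to \(J - K_c*J\)) and that convolutions are computed in the Fourier domain; as shown next, however, the convexity of the modified functional makes this iteration unnecessary:

```python
from numpy.fft import fft2, ifft2

def gradient_descent_step(J, J0, Km_hat, Kc_hat, alpha, beta, gamma, dt):
    """One explicit Euler step of the gradient descent equation above.

    Km_hat, Kc_hat: 2-D FFTs of K_m and K_c (same shape as J), assumed
    normalized to unit sum.
    """
    Km_J = ifft2(fft2(J) * Km_hat).real   # K_m * J
    Kc_J = ifft2(fft2(J) * Kc_hat).real   # K_c * J
    Jt = (-alpha * (J - Km_J)             # local mean term
          + gamma * (J - Kc_J)            # contrast term
          - beta * (J - J0))              # attachment to the original image
    return J + dt * Jt
```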
Now the minimum can be computed directly by convolving the input image \(J_0\) with a kernel \(S\):
\begin{equation}
S = \mathcal{F}^{-1}\left( \frac{\beta}{\alpha + \beta - \gamma - \alpha \mathcal{F}(K_m) + \gamma \mathcal{F}(K_c)} \right),
\end{equation}
where \(\mathcal{F}\) represents the Fourier transform. The kernel \(S\) clearly has an inverse kernel \(S^{-1}\) such that \(S * S^{-1} = \delta\):
\begin{equation}
S^{-1} = \mathcal{F}^{-1}\left( \frac{\alpha + \beta - \gamma - \alpha \mathcal{F}(K_m) + \gamma \mathcal{F}(K_c)}{\beta} \right).
\end{equation}
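In practice, both kernels can be built directly in the Fourier domain. The following sketch does so under the assumption that \(K_m\) and \(K_c\) are supplied centered at pixel \((0,0)\) and normalized to unit sum:

```python
from numpy.fft import fft2

def build_kernels_hat(Km, Kc, alpha, beta, gamma):
    """Fourier-domain versions of S and S^{-1} from the two equations above.

    Km, Kc: spatial kernels K_m and K_c, centered at pixel (0, 0) and
    normalized to unit sum; alpha, beta, gamma as in the functional.
    """
    denom = alpha + beta - gamma - alpha * fft2(Km) + gamma * fft2(Kc)
    S_hat = beta / denom       # F(S)
    S_inv_hat = denom / beta   # F(S^{-1}); their product is 1, so S * S^{-1} = delta
    return S_hat, S_inv_hat
```

By the convolution theorem, the minimum of the functional is then obtained in a single step as \(J = \mathcal{F}^{-1}(\mathcal{F}(S)\,\mathcal{F}(J_0))\).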
We propose the following modified version of the LHEI model, also consisting of two stages: first, the input image is passed through the Naka-Rushton equation \(NR\); second, the result is convolved with the kernel \(S\) defined above. Let’s call this model \(M\).
The output \(O=M(I)\) can be expressed as \(M(I)=S*NR(I)\). The inverse of the Naka-Rushton equation is
\begin{equation}
NR^{-1}(J)=I_s\cdot \left(\frac{J}{1-J}\right)^{\frac{1}{n}},
\end{equation}
and the inverse kernel \(S^{-1}\) was defined in Equation 10. Therefore, the inverse of \(M\) can be expressed as \(M^{-1}(O)=NR^{-1}(S^{-1}*O)\).
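Putting the pieces together, a minimal sketch of the model \(M\) and its inverse could read as follows; the forward Naka-Rushton form \(NR(I)=I^n/(I^n+I_s^n)\) is the one whose inverse is given above, and the clipping of \(J\) to the open interval \((0,1)\) is a numerical safeguard rather than part of the model:

```python
import numpy as np
from numpy.fft import fft2, ifft2

def NR(I, I_s, n):
    """Naka-Rushton nonlinearity; its inverse is the equation above."""
    return I**n / (I**n + I_s**n)

def NR_inv(J, I_s, n):
    """NR^{-1}(J) = I_s * (J / (1 - J))**(1/n)."""
    J = np.clip(J, 1e-6, 1.0 - 1e-6)  # keep J strictly inside (0, 1)
    return I_s * (J / (1.0 - J))**(1.0 / n)

def M(I, S_hat, I_s, n):
    """Forward model M(I) = S * NR(I); convolution via the Fourier domain."""
    return ifft2(fft2(NR(I, I_s, n)) * S_hat).real

def M_inv(O, S_inv_hat, I_s, n):
    """Inverse model M^{-1}(O) = NR^{-1}(S^{-1} * O)."""
    return NR_inv(ifft2(fft2(O) * S_inv_hat).real, I_s, n)
```

With these two functions, the compensation method amounts to chaining `M` (with the parameter values of scenario A) and `M_inv` (with those of scenario B), as in the `compensate` sketch given earlier in this section.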