We restrict our model to static, gray-scale luminance images projected onto a single cyclopean fovea. This excludes temporal changes of any kind beyond the coarse separation by presentation duration we made in our model. Truly processing stimuli over time would go beyond our current computational capabilities. Nonetheless, it is worth highlighting that temporal processing has been investigated and appears to be explainable by two or at most three temporal channels (Watson, 1986; Watson & Nachmias, 1977). However, we are not aware of any combination of these temporal models with masking or discrimination models. Furthermore, luminance images exclude color processing, which requires considerably more complex models of the optics, to capture chromatic aberrations (Bedford & Wyszecki,
1957; Charman & Jennings,
1976) and of the retinal sampling, adaptation and processing, which differ between color channels (Brainard,
2015). Additionally, cortical processing of color is less completely understood (Gegenfurtner,
2003). Finally, luminance images contain no depth information, which spares us from explicitly modelling 3D scenes, the optical effects on objects outside the focal plane, and binocular vision. Modelling binocular vision is possible, but results in considerably more complex psychophysical models (Baker, Meese, & Georgeson,
2007; Georgeson, Wallis, Meese, & Baker,
2016; Legge,
1984a,
1984b; Meese, Georgeson, & Baker,
2006). The additional complexity arises because human observers not only combine the two eyes' inputs nontrivially into a single image, but can also perceive disparity (spatial shifts between the eyes) and luster (contrast differences between the eyes). Under dichoptic presentation, these additional channels can lead to interesting, unintuitive results (e.g., May & Zhaoping, 2016).
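To give a sense of the added complexity, binocular models of the family cited above typically pass each eye's contrast through interocular gain control before and after binocular summation. A schematic two-stage form in the spirit of these models (notation and parameter names are ours and purely illustrative, not taken from the cited papers) is:

$$
r_L = \frac{C_L^{\,m}}{S + C_L + C_R}, \qquad
r_R = \frac{C_R^{\,m}}{S + C_L + C_R}, \qquad
\mathrm{resp} = \frac{(r_L + r_R)^{\,p}}{Z + (r_L + r_R)^{\,q}},
$$

where $C_L$ and $C_R$ denote the contrasts presented to the left and right eye, and $S$, $Z$, $m$, $p$, $q$ are free parameters. Compared with a monocular contrast-response function, such a model must be fit jointly to monocular, binocular, and dichoptic conditions, which is one source of the complexity noted above.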