**We present a computational-observer model of the human spatial contrast-sensitivity function based on the Image Systems Engineering Toolbox for Biology (ISETBio) simulation framework. We demonstrate that ISETBio-derived contrast-sensitivity functions agree well with ones derived using traditional ideal-observer approaches, when the mosaic, optics, and inference engine are matched. Further simulations extend earlier work by considering more realistic cone mosaics, more recent measurements of human physiological optics, and the effect of varying the inference engine used to link visual representations to psychophysical performance. Relative to earlier calculations, our simulations show that the spatial structure of realistic cone mosaics reduces the upper bounds on performance at low spatial frequencies, whereas realistic optics derived from modern wave-front measurements lead to increased upper bounds at high spatial frequencies. Finally, we demonstrate that the type of inference engine used has a substantial effect on the absolute level of predicted performance. Indeed, the performance gap between an ideal observer with exact knowledge of the relevant signals and human observers is greatly reduced when the inference engine has to learn aspects of the visual task. ISETBio-derived estimates of stimulus representations at various stages along the visual pathway provide a powerful tool for computing the limits of human performance.**

*Image-computable* means that the calculations begin with a quantitative description of the visual image. An important special case supported by ISETBio is planar images presented on a computer display or the printed page. ISETBio also supports a more general case, in which the input is defined using a three-dimensional description that includes the location and shape of scene elements as well as the spectral properties of each scene element. In this more general case, the retinal image is derived from the scene representation using ray-tracing methods (Pharr & Humphreys, 2010).

We use the term *computational-observer analysis* to describe image-computable methods combined with an inference engine (Lopez, Loew, Murray, & Goodenough, 1992).

We use the term *inference engine* to describe methods that link the computed visual-system responses to psychophysical performance. Inference-engine methods make decisions in a simulated visual task based on the stimulus representation at different processing stages along the visual pathway. Figures 2A and 2B depict the retinal-image contrasts seen by the L-, M-, and S-cones for, respectively, the null stimulus and for a 16-c/° grating test stimulus with 100% Michelson contrast. Note that for this stimulus, aberrations reduce the retinal-image L- and M-cone contrast by a factor of 2 relative to the stimulus image, whereas the retinal S-cone contrast is reduced by a factor of 10. Aberrations also shift the spatial phase of the retinal S-cone contrast with respect to those of the L- and M-cone contrasts (Figure 2B).

The probability of a correct response (*P*_{correct}) was estimated for each grating contrast and spatial frequency separately. We simulated a two-alternative forced-choice task, using an ideal-observer classifier that selects which of the two alternative stimulus sequences (test–null or null–test) was more likely to have generated the observed cone excitations. The test stimulus was a spatial grating of known contrast, frequency, and position; the null stimulus was a spatially uniform field. The simulated duration of the test and null stimulus was 100 ms on each trial, with the stimuli presented in random order. The ideal observer's performance was calculated analytically given knowledge of the sequence of mean numbers of excitations during the 100-ms intervals and the assumption of Poisson noise. Responses were binned using 5-ms bins. This choice is irrelevant in the present simulations, which model static stimuli in the absence of eye movements, and the results are indeed the same if a single 100-ms bin is used for each interval. The choice of 5-ms bins allows direct comparison with future work that will include fixational eye movements (Cottaris, Rieke, Wandell, & Brainard, 2018). The psychometric function (*P*_{correct} as a function of stimulus contrast) for each spatial pattern was fitted with a cumulative Weibull function (Kingdom & Prins, 2010; http://www.palamedestoolbox.org), and threshold was computed as the stimulus contrast corresponding to *P*_{correct} = 0.7071. Contrast sensitivity is the reciprocal of threshold contrast.
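The ideal-observer decision rule described above can be made concrete with a small numerical sketch. This is a toy example, not the ISETBio implementation: two Poisson "cone" channels with known mean counts for test and null, and a maximum-likelihood choice between the two interval orders. The function names and the two-channel setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_loglik(counts, means):
    # Log-likelihood of observed counts under independent Poisson bins.
    # The log(counts!) term is omitted: the observed counts are the same
    # under both hypotheses, so it cancels in the comparison.
    means = np.clip(means, 1e-12, None)
    return np.sum(counts * np.log(means) - means)

def simulate_2afc(mean_test, mean_null, n_trials=2000):
    """Fraction correct for an observer that picks the more likely of the
    two interval orders (test-null vs. null-test)."""
    n_correct = 0
    for _ in range(n_trials):
        test_first = rng.random() < 0.5
        first = rng.poisson(mean_test if test_first else mean_null)
        second = rng.poisson(mean_null if test_first else mean_test)
        ll_test_first = poisson_loglik(first, mean_test) + poisson_loglik(second, mean_null)
        ll_null_first = poisson_loglik(first, mean_null) + poisson_loglik(second, mean_test)
        n_correct += ((ll_test_first > ll_null_first) == test_first)
    return n_correct / n_trials
```

With a test that modulates the mean counts away from the null (e.g., means [10, 3] versus a uniform [5, 5]), performance is well above chance; with identical means it sits near 0.5.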

^{2} and a 3-mm pupil. The optical PSF matched the one used by Banks et al.

Two stimulus descriptions, one specifying the test stimulus *S*_{t} and one specifying the null stimulus *S*_{n}, are processed through the ISETBio simulation pipeline. A total of *N* response instances are computed for each of the test and null stimuli, and these responses are used to derive the psychometric function (*P*_{correct} as a function of stimulus contrast). A cumulative Weibull function is fitted to the data, and the contrast level at *P*_{correct} = 0.7071 is taken as the threshold. We compared the sensitivity of the ideal observer to that of different SVM-based computational-observer inference engines.
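The threshold-extraction step can be sketched directly. This is a minimal example under common conventions (0.5 guess rate for two-alternative forced choice); the parameter names `alpha` and `beta` are illustrative, and the fitted values would come from the Weibull fit described above.

```python
import numpy as np

def weibull_2afc(contrast, alpha, beta):
    """Cumulative Weibull psychometric function for 2AFC (guess rate 0.5)."""
    return 0.5 + 0.5 * (1.0 - np.exp(-(contrast / alpha) ** beta))

def contrast_at(p_correct, alpha, beta):
    """Invert the fitted function to find the contrast yielding p_correct."""
    # p = 0.5 + 0.5*(1 - exp(-(c/alpha)**beta))  =>  solve for c
    return alpha * (-np.log(2.0 * (1.0 - p_correct))) ** (1.0 / beta)

# Threshold and sensitivity at the P_correct = 0.7071 criterion,
# for illustrative fitted parameters alpha = 0.01, beta = 3.0
threshold = contrast_at(0.7071, alpha=0.01, beta=3.0)
sensitivity = 1.0 / threshold
```

Contrast sensitivity is then simply the reciprocal of the threshold contrast, as in the text.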

The stimuli were specified colorimetrically (*x*, *y* chromaticity and luminance). A spectral representation is necessary to model chromatic aberration, inert pigments, and absorption of light by the three classes of cone photoreceptors. To promote the colorimetric specification to spectral radiance, we simulated the scenes as arising from a typical color cathode-ray tube from the era when the Banks et al. article was published. The critical display information is the R-, G-, and B-channel spectral power distributions, the RGB display quantization, and the pixel spatial sampling. Because our interest here is not the effect of display properties per se, we modeled a cathode-ray tube with 18-bit linear control of the R, G, and B primary intensities. We also set the pixel spatial sampling to be inversely proportional to the stimulus spatial frequency; consequently, all stimuli were represented on a 512 × 512 spatial grid mapped onto the corresponding retinal region in a manner that took the stimulus size into account. The spectra we model differ somewhat from those in the Banks et al. experiment, which was performed using a monochrome cathode-ray tube with a P4 phosphor.
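The promotion from linear RGB to spectral radiance is a weighted sum of the three primaries' spectral power distributions, after quantizing to the display's bit depth. A sketch follows, with made-up Gaussian primaries standing in for measured phosphor spectra; the 18-bit depth follows the text, but everything else (wavelength sampling, peak wavelengths, function names) is an illustrative assumption.

```python
import numpy as np

wave = np.arange(400.0, 701.0, 10.0)  # wavelength samples, nm

def gaussian_spd(peak_nm, width_nm):
    # Hypothetical primary spectrum; a real model would use measured
    # phosphor spectral power distributions.
    return np.exp(-0.5 * ((wave - peak_nm) / width_nm) ** 2)

# Rows: R, G, B primary spectral power distributions
primaries = np.stack([gaussian_spd(620, 15),
                      gaussian_spd(545, 25),
                      gaussian_spd(450, 20)])

def rgb_to_radiance(rgb_linear, bits=18):
    """Quantize linear RGB to the display bit depth, then mix primaries."""
    levels = 2 ** bits - 1
    quantized = np.round(np.asarray(rgb_linear) * levels) / levels
    return quantized @ primaries  # spectral radiance (arbitrary units)
```

At 18 bits per channel, the quantization error is far below one part in 10^5 of the primary intensity, which is why display quantization plays essentially no role in these simulations.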

To model longitudinal chromatic aberration, we added a wavelength-dependent defocus term *d*(*λ*) to the Zernike polynomials according to the formula given by Howarth and Bradley (1986), with in-focus wavelength *λ*_{focus} = 550 nm. The computed PSF is translated in space so that its center of mass at the in-focus wavelength is centered at the origin. This eliminates performance differences due to off-centered PSFs.
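The PSF-centering step (translating the PSF so that its center of mass lands at the origin) can be sketched as follows. This is a minimal version at integer-pixel precision; the function name is ours, and a production implementation would interpolate to subpixel precision.

```python
import numpy as np

def recenter_psf(psf):
    """Shift a sampled PSF so its center of mass lies on the central pixel."""
    ny, nx = psf.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    total = psf.sum()
    cy = int(round(float((yy * psf).sum() / total)))
    cx = int(round(float((xx * psf).sum() / total)))
    # circular shift; adequate when the PSF is compact relative to its support
    return np.roll(psf, (ny // 2 - cy, nx // 2 - cx), axis=(0, 1))
```

Centering by center of mass rather than by peak avoids ambiguity when aberrations make the PSF multi-lobed.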

We use the term *subject* to refer to a specific eye of a particular individual. The subjects are referred to as Subjects 1, 2, 3, 4, and 5; we selected Subject 3 as typical. The Zernike coefficients for these five subjects are provided in Table 1.

The main factors determining how the retinal irradiance image *RI*(*x*, *y*, *λ*) is transformed into a pattern of cone photoisomerization rates are the macular pigment, which differentially absorbs short-wavelength photons; the spectral quantal efficiency, or spectral absorptance, of the cone photopigment, which controls the proportion of incident photons absorbed by the photopigment; the cone aperture diameter, which determines the photon-collecting area of a cone and also acts as a spatial low-pass filter; and the cone lattice, which controls the spatial sampling of the retinal irradiance image.

The macular pigment transmittance, *T*_{macular}(*λ*), is depicted in Figure 11A. Minimum transmittance is 0.45 at 460 nm. The foveal spectral quantal efficiencies (absorptances) of the different cone classes, *q*_{c}(*λ*), with *c* = {*L*, *M*, *S*}, are depicted in Figure 11B and are computed from the Stockman–Sharpe normalized absorbance values *SS*_{c}(*λ*) (Stockman et al., 1999; Stockman & Sharpe, 2000) as

*q*_{c}(*λ*) = *q*_{peak} × (1 − 10^(−*OD*_{c} · *SS*_{c}(*λ*))) / (1 − 10^(−*OD*_{c})),

where *q*_{peak} is the peak cone quantal efficiency (0.667 for all cone types) and *OD*_{c} is the photopigment optical density of cone class *c* (0.5 for L- and M-cones and 0.4 for S-cones). These values are within the range of optical densities reported (0.29–0.91 for L-cones, 0.36–0.97 for M-cones; Renner, Knau, Neitz, Neitz, & Werner, 2004).
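The relation between a normalized absorbance spectrum, an optical density, and the resulting absorptance follows Beer–Lambert. The sketch below applies it to produce a quantal-efficiency spectrum; rescaling so that the peak equals 0.667 is our assumption about how the peak efficiency is applied, and the function name is ours.

```python
import numpy as np

def cone_quantal_efficiency(ss_normalized, optical_density, peak_efficiency=0.667):
    """Spectral quantal efficiency from a normalized absorbance spectrum.
    Beer-Lambert absorptance: 1 - 10**(-density * normalized_absorbance),
    rescaled so the peak equals peak_efficiency."""
    absorptance = 1.0 - 10.0 ** (-optical_density * np.asarray(ss_normalized))
    return peak_efficiency * absorptance / absorptance.max()
```

Because the absorptance is a saturating function of optical density, raising the density broadens the spectrum (self-screening) as well as raising overall absorption.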

The aperture of cone *k* is modeled as a spatial filter *A*_{k}(*x*, *y*), whose diameter corresponds to the inner-segment diameter and whose volume is 1. In the Banks et al. (1987) mosaic, the inner-segment diameter is 3 μm, whereas in the ISETBio mosaics it is 1.6 μm in the fovea. The aperture filters and corresponding MTFs for these mosaics are depicted in Figure 11C. Note that the MTF at 60 c/° is 0.63 for the Banks et al. mosaic and 0.89 for the eccentricity-varying cone mosaics. Although we varied the inner-segment diameter with eccentricity when computing cone excitations, we used a constant inner-segment diameter (the foveal value) when computing blur by the cone apertures, for reasons of computational efficiency. We have verified, however, that using the mean aperture value across all cones in a mosaic produces essentially indistinguishable contrast-sensitivity curves, as the effects of the optical PSF dominate the effects of the aperture.

To compute the cone excitation rate *CER*_{c}(*x*, *y*) for each cone class *c*, the retinal image *RI*(*x*, *y*, *λ*) was first filtered with the macular pigment transmittance *T*_{macular}(*λ*), multiplied by the corresponding spectral quantal efficiency *q*_{c}(*λ*), and integrated numerically over wavelength. The result was then spatially convolved with the cone aperture *A*(*x*, *y*):

*CER*_{c}(*x*, *y*) = *A*(*x*, *y*) ∗ ∫ *RI*(*x*, *y*, *λ*) *T*_{macular}(*λ*) *q*_{c}(*λ*) d*λ*. (Equation 2)

Cone excitations are computed in temporal bins of *τ* = 5 ms. Specifically, for a cone *k* of class *c* located at coordinates (*x*_{k}, *y*_{k}), the mean count of cone excitation events, *n̄*_{k}, during *τ* ms is computed by spatially sampling the continuous function *CER*_{c}(*x*, *y*) at (*x* = *x*_{k}, *y* = *y*_{k}) and multiplying by the cone inner-segment area *α*_{k} and the time interval *τ*:

*n̄*_{k} = *CER*_{c}(*x*_{k}, *y*_{k}) · *α*_{k} · *τ*.
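The chain from retinal image to mean excitation counts — spectral integration, aperture blur, spatial sampling at cone positions, scaling by aperture area and bin duration, then Poisson draws — can be sketched end to end. This is a toy sketch; the array shapes and all names are illustrative, not the ISETBio API.

```python
import numpy as np

rng = np.random.default_rng(1)

def convolve2d_same(img, kernel):
    # Direct same-size correlation; equals convolution for a symmetric aperture.
    ky, kx = kernel.shape
    py, px = ky // 2, kx // 2
    padded = np.pad(img, ((py, py), (px, px)))
    out = np.zeros_like(img)
    for i in range(ky):
        for j in range(kx):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def mean_excitation_counts(retinal_image, t_macular, q_c, d_lambda, aperture,
                           cone_rows, cone_cols, alpha, tau):
    """Mean excitation count per cone per time bin.
    retinal_image: (ny, nx, n_wave) photon irradiance; t_macular, q_c: (n_wave,)
    spectra; aperture: 2-D kernel with unit volume; alpha: aperture area;
    tau: bin duration in seconds."""
    # integrate irradiance x macular transmittance x quantal efficiency over wavelength
    rate = (retinal_image * (t_macular * q_c)).sum(axis=2) * d_lambda
    rate = convolve2d_same(rate, aperture)            # blur by the cone aperture
    return rate[cone_rows, cone_cols] * alpha * tau   # sample at cone positions

# Poisson response instances then follow directly:
# counts = rng.poisson(mean_excitation_counts(...))
```

Because Poisson noise is applied to the per-bin mean counts, summing bins and then drawing, or drawing per bin and then summing, give statistically identical totals — the reason bin width is irrelevant for static stimuli, as noted above.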

To model the eccentricity-dependent variation in macular pigment density, we replace *T*_{macular}(*λ*) in Equation 2 with

*T*_{macular}(*x*, *y*, *λ*) = 10^(−*OD*_{macular}(*x*, *y*) · *SS*_{macular}(*λ*)),

where *SS*_{macular}(*λ*) is the spectral sensitivity of the macular pigment and *OD*_{macular}(*x*, *y*) is a factor describing the eccentricity-dependent variation in the optical density of the macular pigment, modeled as by Putnam and Bland (2014). Here *OD*_{macular}(0, 0), the optical density of the macular pigment at the fovea, is set to 0.35 (Putnam & Bland, 2014).
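The transmittance implied by an optical density and a normalized pigment spectrum follows directly from Beer–Lambert, and is consistent with the 0.45 minimum transmittance quoted earlier for the foveal density of 0.35. A sketch (function and argument names are ours):

```python
import numpy as np

def macular_transmittance(ss_macular, optical_density):
    """Beer-Lambert transmittance from the macular pigment's normalized
    spectral density and a (possibly eccentricity-dependent) optical density."""
    return 10.0 ** (-optical_density * np.asarray(ss_macular))
```

At the wavelength of peak pigment density (normalized density 1), a foveal density of 0.35 gives transmittance 10^−0.35 ≈ 0.447, matching the ~0.45 minimum near 460 nm.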

Eccentricity-dependent variations in cone properties are incorporated via a correction factor *b*_{k} for the *k*th cone located at (*x*_{k}, *y*_{k}), defined as the product of two quantities (Equation 7). The first is the ratio of the cone aperture area *α*(*x*_{k}, *y*_{k}) at location (*x*_{k}, *y*_{k}) relative to its foveal value *α*(0, 0). The second describes the change in the quantal efficiency of cone class *c* at location (*x*_{k}, *y*_{k}) relative to its foveal value, and is computed as the mean value of the efficiency ratio across *λ*. These variations are applied by multiplying the mean count of excitation events for each cone by the *b*_{k} correction factor (Equation 7).

The peak cone density of the model mosaic is near the high end of the cone-density range in human subjects (Curcio et al., 1990). In practice, the eccentricity-based mosaics used in the primary calculations, which are synthesized stochastically as described later, had peak cone densities in the range of 270,353–290,448 cones/mm^{2}. The ratio of cone diameter to inner-segment aperture was 0.79 across all eccentricities, close to the 0.82 suggested by Miller and Bernard (1983; see also Curcio et al., 1990).

Mosaic generation begins with a regular hexagonal lattice with foveal cone spacing *σ*_{0,0} = 2 μm (Figure 12A1 and 12B1). In the first iteration, the lattice is subsampled, and a node located at (*x*, *y*) is eliminated with an eccentricity-dependent probability determined by the desired cone spacing *σ*_{x,y} at (*x*, *y*), taken from Curcio et al. (1990). The subsampled spatial mosaic approximates the desired eccentricity-dependent cone density, but cone coverage is nonuniform, with regions without any cones (Figure 12A2 and 12B2). To improve the uniformity of the mosaic, an iterative procedure is used (Persson, 2005). In this approach, a cone and its neighboring cones are subjected to simulated movement driven by mutually repulsive forces. The magnitude of the repulsive force between a target cone *i* and a neighboring cone *j* is positive when their spacing is less than the desired spacing scaled by a constant *k*, which is set to a value >1, and is zero otherwise. Therefore, when the spacing between two cones is smaller than the desired spacing, a positive force is generated that tends to push these cones apart, whereas when the spacing is larger than the desired one, no force acts between them. The mutually repulsive forces spread cones around the mosaic, filling in regions with no cones (Figure 12A2–12A5 and 12B2–12B5). Cones moved outside of the mosaic boundary are forced back inside the boundary. To avoid irregularities at the mosaic edges, the extent of the boundary is usually 20% larger than the width of the mosaic to be generated.

Each cone's position is then updated according to the net force exerted by its *K* neighbors; the *K* neighboring cones are determined by Delaunay triangulation. The update rule displaces each cone by *δ* times the net force acting on it, where the update step *δ* is set to 0.2 × *σ*_{0,0}. The iterative position-adjustment process is terminated when nodes move less than a threshold value. Snapshots of the mosaic at iterations 10, 100, and 1,055 are depicted in Figure 12A3–12A5, along with the isodensity contour lines of the achieved and the desired cone-density profiles.
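The repulsive-force relaxation can be sketched in a few lines. This toy version uses all-pairs forces instead of Delaunay neighbors, a fixed desired spacing instead of the eccentricity-dependent *σ*_{x,y}, and a distmesh-style rule in which the force is positive only when two points are closer than *k* times the desired spacing (*k* > 1); all names are illustrative.

```python
import numpy as np

def relax_points(points, desired_spacing, k=1.2, delta=0.2, n_iter=200):
    """Push points apart until neighbor distances approach k * desired_spacing."""
    pts = points.astype(float).copy()
    for _ in range(n_iter):
        disp = pts[:, None, :] - pts[None, :, :]     # pairwise displacements
        dist = np.linalg.norm(disp, axis=-1)
        np.fill_diagonal(dist, np.inf)               # no self-force
        magnitude = np.maximum(k * desired_spacing - dist, 0.0)
        force = disp / dist[..., None] * magnitude[..., None]
        pts += delta * force.sum(axis=1)             # move along the net force
    return pts
```

For example, two points placed 0.05 apart with desired spacing 0.2 and *k* = 1.2 settle at distance 0.24 = *k* × 0.2; with many points the same rule fills in empty regions, as in Figure 12.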

The spatial-pooling inference engine computes a weighted sum of the responses of all *M* cones in the mosaic. The spatial profile of the pooling template *V*(*k*) was derived from the spatial contrast modulation of the test stimulus: The weight associated with cone *k* was the spatial contrast of the test stimulus at the location corresponding to the spatial position of that cone (Figure 13). Note that use of a stimulus-matched template of this sort would be optimal if mean cone responses were perturbed only by independent, identically distributed Gaussian noise. For the Poisson-noise model considered here, there is no single spatial-pooling template that is optimal across stimulus contrasts.

The pooled response for each trial is computed from the responses, across *N* instances and *T* time bins, of each cone *k*, after subtracting the mean response of that cone to the null stimulus, weighted by the *V*(*k*) template.

The quadrature-energy inference engine employs two spatial-pooling kernels, *V*(*k*) and *V*_{Q}(*k*). The *V*_{Q}(*k*) kernel was derived from the contrast modulation of the spatial-quadrature version of the test stimulus. The responses of these spatial-pooling mechanisms were squared and summed to yield an energy response.
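The quadrature-energy computation — project the cone responses onto the matched template and its 90°-phase-shifted counterpart, square, and sum — can be sketched as follows. The 1-D cone lattice and all names are illustrative.

```python
import numpy as np

def energy_response(cone_responses, template, template_q):
    """Quadrature-energy pooling of cone responses."""
    a = template @ cone_responses      # pooled response, matched template
    b = template_q @ cone_responses    # pooled response, quadrature template
    return a * a + b * b

# 1-D demo: cone positions spanning an integer number of grating cycles
x = np.arange(100) / 100.0
f = 2.0                                # grating frequency, cycles per unit
template = np.cos(2 * np.pi * f * x)
template_q = np.sin(2 * np.pi * f * x)
```

Responses to the same grating at different spatial phases yield the same energy, which is what makes this observer insensitive to uncertainty about stimulus phase.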

*Journal of Vision*, 2 (1): 8, 121–131, https://doi.org/10.1167/2.1.8. [PubMed] [Article]

*The Journal of Physiology*, 287, 163–176. [PubMed]

*Annual Review of Vision Science*, 1 (1), 1–17, https://doi.org/10.1146/annurev-vision-082114-035905. [PubMed]

*Journal of the Optical Society of America A*, 8, 1775–1787, https://doi.org/10.1364/JOSAA.8.001775. [PubMed]

*Photophysiology, Vol. 2* (pp. 163–202). New York: Academic Press.

*Journal of Vision*, 14 (12): 22, 1–22, https://doi.org/10.1167/14.12.22. [PubMed] [Article]

*Physiology of the retina and the visual pathway*. London, UK: Arnold.

*Computational observer modeling of the limits of human pattern resolution*. In *OSA Fall Vision Meeting*, October, Reno, NV.

*The Journal of Comparative Neurology*, 292 (4), 497–523, https://doi.org/10.1002/cne.902920402. [PubMed]

*Physica*, 10 (7), 553–564. [Article]

*Psychological Science*, 15, 431–436, https://doi.org/10.1111/j.0956-7976.2004.00697.x. [PubMed]

*SID Symposium Digest of Technical Papers*, 45 (1), 352–356. [Article]

*Psychological Review*, 96 (2), 267–314. [PubMed]

*Handbook of optics: Volume 1. Fundamentals, techniques, and design* (pp. 25.1–25.55). New York: McGraw Hill.

*Journal of Neural Engineering*, 16 (2): 025003, https://doi.org/10.1101/206409.

*Introduction to Fourier optics* (3rd ed.). Greenwood Village, CO: Roberts & Co.

*CCD arrays, cameras and displays* (2nd ed.). Bellingham, WA: SPIE Optical Engineering Press.

*Electronic Imaging, Human Vision and Electronic Imaging* (pp. 177–181). http://biorxiv.org/content/early/2017/02/17/109405

*SID Symposium Digest of Technical Papers*, 46 (1), 231–233. [Article]

*Color in business, science, and industry*. New York: John Wiley and Sons.

*PLoS Computational Biology*, 10 (11), 1–28, https://doi.org/10.1371/journal.pcbi.1003915. [PubMed]

*Psychophysics: A practical introduction*. San Diego, CA: Academic Press.

*Annual Review of Vision Science*, 1 (1), 417–446, https://doi.org/10.1146/annurev-vision-082114-035447. [PubMed]

*IS&T Electronic Imaging Conference*, San Francisco, CA.

*Introduction to information retrieval*. Cambridge, UK: Cambridge University Press.

*Journal of the Optical Society of America A*, 11 (12), 3113–3122. [Article]

*Nature Reviews Neuroscience*, 5, 229–240, https://doi.org/10.1038/nrn1348. [PubMed]

*Journal of Vision*, 11 (5): 2, 1–25, https://doi.org/10.1167/11.5.2. [PubMed] [Article]

*Vision: Coding and efficiency* (pp. 3–24). Cambridge, UK: Cambridge University Press.

Massachusetts Institute of Technology.

*Physically based rendering: From theory to implementation* (2nd ed.). San Francisco: Morgan Kaufmann.

*Optica*, 2 (2), 124–134, https://doi.org/10.1364/OPTICA.2.000124.

*Handbook of biological physics, Vol. 3: Molecular mechanisms of visual transduction* (pp. 183–255). Amsterdam, the Netherlands: Elsevier.

*Journal of the Optical Society of America A*, 56, 1141–1142. [Article]

*The first steps in seeing*. Sunderland, MA: Sinauer.

*Learning with kernels*. Cambridge, MA: MIT Press.

*Nature*, 292 (5823), 543–545, https://doi.org/10.1038/292543a0. [PubMed]

*Proceedings of the Royal Society of London B: Biological Sciences*, 112 (778), 428–450. [Article]

*The Optical Society of America handbook of optics, Volume 3: Vision and vision optics* (pp. 1.11–11.104). New York: McGraw Hill.

*Transactions of the IRE Profession Group in Information Theory*, 4 (4), 213–221. [Article]

*Journal of Vision*, 18 (8): 6, 1–18, https://doi.org/10.1167/18.8.6. [PubMed] [Article]

*Clinical and Experimental Optometry*, 86 (6), 363–370, https://doi.org/10.1111/j.1444-0938.2003.tb03080.x. [PubMed]

*Foundations of vision*. Sunderland, MA: Sinauer.

*Journal of Vision*, 15 (2): 26, 1–25, https://doi.org/10.1167/15.2.26. [PubMed] [Article]

*Journal of Vision*, 4 (8): 51, https://doi.org/10.1167/4.8.51. [Abstract]

*Journal of Vision*, 5 (9): 6, 717–740, https://doi.org/10.1167/5.9.6. [PubMed] [Article]

*Journal of Vision*, 12 (10): 12, 1–16, https://doi.org/10.1167/12.10.12. [PubMed] [Article]

*Progress in sensory physiology* (pp. 1–30). Berlin, Germany: Springer.

*Color science: Concepts and methods, quantitative data and formulas*. New York: John Wiley & Sons.