**Abstract**:

Watson and Ahumada (2008) described a template model of visual acuity based on an ideal observer limited by optical filtering, neural filtering, and noise. They computed predictions for selected optotypes and optical aberrations. Here we compare this model's predictions to acuity data for six human observers, each viewing seven different optotype sets, consisting of one set of Sloan letters and six sets of Chinese characters differing in complexity (Zhang, Zhang, Xue, Liu, & Yu, 2007). Since optical aberrations for the six observers were unknown, we constructed 200 model observers using aberrations collected from 200 normal human eyes (Thibos, Hong, Bradley, & Cheng, 2002). For each condition (observer, optotype set, model observer), we estimated the model noise required to match the data. Expressed as efficiency, performance for Chinese characters was 1.4 to 2.7 times lower than for Sloan letters. Efficiency was weakly and inversely related to the perimetric complexity of the optotype set. We also compared confusion matrices for human and model observers. Correlations for off-diagonal elements ranged from 0.5 to 0.8 for different sets, and the average correlation for the template model was superior to that of a geometrical moment model with a comparable number of parameters (Liu, Klein, Xue, Zhang, & Yu, 2009). The template model performed well overall. Estimated psychometric function slopes matched the data, and noise estimates agreed roughly with those obtained independently from contrast sensitivity to Gabor targets. For optotypes of low complexity, the model accurately predicted relative performance. This suggests the model may be used to compare acuities measured with different sets of simple optotypes.

*σ*, added to the neural image. The noise power spectral density *N* is a key parameter of the model and is defined as

$$N = \sigma^2 A,$$

where *A* is the area in square degrees of a single pixel in the model. We usually express *N* in the logarithmic unit of dBB, as explained in the Appendix.
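The relation between the per-pixel noise *σ* and the spectral density *N* can be sketched numerically. The sketch below assumes that *N* is the product of the noise variance and the pixel area, and that dBB means decibels relative to 10^{-6} deg^{2}; the second point is an assumption, since the Appendix definition is not reproduced here. The function names are illustrative only. The pixel size is the value given for the simulation in the Appendix.

```python
import math

def noise_psd(sigma, dx_deg, dy_deg):
    """Noise power spectral density: variance times pixel area (deg^2)."""
    return sigma ** 2 * dx_deg * dy_deg

def to_dbb(n, reference=1e-6):
    """Decibels relative to `reference` deg^2 (assumed definition of dBB)."""
    return 10.0 * math.log10(n / reference)

pixel_deg = 0.00264993          # pixel width and height used in the simulation
n = noise_psd(0.24, pixel_deg, pixel_deg)
print(round(to_dbb(n), 3))
```

Under these assumptions, *σ* = 0.24 maps to about −3.931 dBB, which agrees with the value quoted for the fixed-noise simulation.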

^{2}. Pupil diameter was not specified or controlled. Based on the mean age of the observers (22.8 years) and the relationship between luminance, age, display size, and pupil diameter (Stanley & Davies, 1995; Watson & Yellott, 2012; Winn et al., 1994), we assumed a diameter of 6 mm. Further details on our selection of pupil size are provided in the Appendix.

**C** of cross-correlations among the neural images (Equation 2), we accelerated computation of the model by precomputing the matrix **C** for each optotype set and size. Each matrix was 10 × 10, and there were seven optotype sets, 32 sizes for each set, and 200 eyes, so the result was a data structure with dimensions 200 × 7 × 32 × 10 × 10.
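A minimal sketch of this precomputation follows, using NumPy and toy dimensions. The random arrays stand in for the filtered neural images, which in the actual model come from the optical and neural transfer functions; the array layout and names are assumptions made for illustration.

```python
import numpy as np

def cross_correlation_matrix(images):
    """images: array of shape (K, H, W). Returns the K x K matrix of dot products."""
    flat = images.reshape(images.shape[0], -1)
    return flat @ flat.T

rng = np.random.default_rng(0)
# Toy dimensions; the paper uses 200 eyes, 7 sets, 32 sizes, and K = 10 optotypes.
n_eyes, n_sets, n_sizes, K, H, W = 3, 2, 4, 10, 16, 16
neural = rng.normal(size=(n_eyes, n_sets, n_sizes, K, H, W))  # placeholder images

# Batched version: flatten each image, then multiply by its own transpose.
flat = neural.reshape(n_eyes, n_sets, n_sizes, K, H * W)
C = flat @ np.swapaxes(flat, -1, -2)
print(C.shape)
```

Once the stack is stored, each simulated trial needs only one K × K slice, so the image-sized arrays never have to be touched inside the Monte Carlo loop.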

*K*, and let *s*_{k} be the neural image for the optotype indexed by *k*. Let **C** be the *K* × *K* matrix of cross-correlations among the *K* neural images,

$$C_{jk} = s_j \odot s_k, \tag{2}$$

where ⊙ indicates the sum of the pixel-by-pixel product of the two images (the dot product of the two images regarded as vectors). Let **e** be the vector consisting of the diagonal of this matrix, corresponding to the energies of the neural images. Let *σ* be the standard deviation of the Gaussian noise added to each pixel of the neural image. Let *k* be the index of the letter presented. Then we consider the vector

$$\mathbf{g} = \mathbf{C}_k - \frac{\mathbf{e}}{2} + \mathbf{m},$$

where **C**_{k} is the *k*th column of **C** and **m** is a random vector of length *K*, constructed as described in Watson and Ahumada (2008). The observer locates the largest entry of **g**, and returns its index *j* as the index of the optotype identified. This algorithm corresponds to the behavior of an ideal observer of a signal known exactly.
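The decision rule can be sketched as a short Monte Carlo simulation. The sketch assumes that **m** is a zero-mean Gaussian vector with covariance *σ*^{2}**C**, which is what the term-by-term noise products imply for independent pixel noise; the specific construction in Watson and Ahumada (2008) may differ. The random templates are hypothetical stand-ins for neural images.

```python
import numpy as np

def simulate_trial(C, k, sigma, rng):
    """Return the index chosen by the template observer when optotype k is shown."""
    e = np.diag(C)                                   # template energies
    # Assumed noise construction: correlated Gaussian with covariance sigma^2 * C.
    m = rng.multivariate_normal(np.zeros(len(e)), sigma ** 2 * C)
    g = C[:, k] - e / 2.0 + m                        # discriminant vector
    return int(np.argmax(g))

def confusion_matrix(C, sigma, trials, rng):
    """Rows index the optotype presented, columns the optotype reported."""
    K = C.shape[0]
    counts = np.zeros((K, K), dtype=int)
    for k in range(K):
        for _ in range(trials):
            counts[k, simulate_trial(C, k, sigma, rng)] += 1
    return counts

rng = np.random.default_rng(1)
templates = rng.normal(size=(10, 64))                # hypothetical neural images
C = templates @ templates.T
counts = confusion_matrix(C, sigma=0.5, trials=50, rng=rng)
print(counts.trace() / counts.sum())                 # proportion correct
```

Drawing the length-*K* vector directly keeps the cost of a trial independent of the number of pixels in the neural image.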

*σ* and conduct *T* trials for each optotype. The optotype size in a given presentation is controlled by an adaptive Quest procedure (Watson & Pelli, 1983). This procedure analyzes past trials and sets the current size to the current estimate of acuity. The procedure is customized so that *K* trials, one per optotype, are presented at a given size before a new size is selected. The Quest method provides a highly efficient way of estimating acuity from the model. To illustrate the simulation, we provide a demonstration in the Appendix in which the reader can select a set of optotypes and a noise level.

*σ*, and for each letter size used for that observer, we conduct *T* trials for each optotype. The result consists of a list of confusion matrices, indexed by letter size, for each observer.

*σ* at 0.24 (*N* = −3.931 dBB) and used a Quest adaptive procedure based on 1,024 trials/letter to locate the threshold size for each optotype set (see the Appendix for details). From earlier simulations, we determined that this noise value would approximate the human data for Sloan letters. We repeated this simulation for each of the 200 eyes of the IAS. We also provide a demonstration in the Appendix to illustrate the process of estimating acuity for a given set of optotypes.

*N* = −3.931 dBB) was chosen to approximate the data for optotype set 1 (Sloan letters). The figure shows that, with one fixed noise value set to agree for set 1, set 2 is also accounted for, but the data move above the predictions for the more complex sets. In other words, the model accounts for some, but not all, of the rise in threshold size with set number (or stroke frequency). Indeed, the model actually predicts a decline in threshold from set 2 to set 3, in spite of an increase in stroke frequency. The separation between the two curves is a measure of the portion of the rise in threshold size with set number that is not accounted for by the model.

*N* for each set by fitting the model to the proportion correct data. To do this, for each set, observer, and eye, we selected a range of *N* values. For each, we simulated 1,000 trials for each letter at each of the sizes used by Zhang et al. (2007) for that set. The error between model and data, defined as the log of the likelihood ratio, was computed for each noise value (a small constant was substituted for zero values of *p*_{model} to prevent overflow). An interpolating function was then used to estimate the *N* value yielding the minimum error. The accuracy of this method was confirmed by generating simulated data from the model with a known *N* and then estimating the value of *N*. This yielded a total of 6 × 200 = 1,200 noise estimates for each set. The average of these 1,200 values is shown in Figure 9.
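The fitting step can be illustrated with a small sketch. Two assumptions are made here: the error is taken to be a binomial log-likelihood ratio summed over letter sizes, and the interpolating function is replaced by a parabola through the three sampled points nearest the minimum. Neither is confirmed as the form used in the study; the function names and the made-up error curve are illustrative.

```python
import math

def log_likelihood_ratio(p_data, p_model, n_trials, eps=1e-6):
    """Assumed error measure: binomial log-likelihood ratio summed over sizes."""
    total = 0.0
    for pd, pm in zip(p_data, p_model):
        pm = min(max(pm, eps), 1.0 - eps)    # small constant replaces 0 (and 1)
        for obs, pred in ((pd, pm), (1.0 - pd, 1.0 - pm)):
            if obs > 0.0:
                total += n_trials * obs * math.log(obs / pred)
    return total

def parabolic_minimum(xs, ys):
    """Vertex of the parabola through the sampled minimum and its two neighbors."""
    i = min(range(len(ys)), key=ys.__getitem__)
    i = min(max(i, 1), len(ys) - 2)          # keep a neighbor on each side
    x0, x1, x2 = xs[i - 1:i + 2]
    y0, y1, y2 = ys[i - 1:i + 2]
    denom = (x0 - x1) * (x0 - x2) * (x1 - x2)
    a = (x2 * (y1 - y0) + x1 * (y0 - y2) + x0 * (y2 - y1)) / denom
    b = (x2 ** 2 * (y0 - y1) + x1 ** 2 * (y2 - y0) + x0 ** 2 * (y1 - y2)) / denom
    return -b / (2.0 * a)

noise_levels = [-6.0, -5.0, -4.0, -3.0, -2.0]            # candidate N values (dBB)
errors = [(n + 3.4) ** 2 + 1.0 for n in noise_levels]    # made-up error curve
best = parabolic_minimum(noise_levels, errors)
print(best)
```

In practice the `errors` list would be filled by simulating the model at each candidate noise level and comparing its proportions correct with the human data.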

*N* obtained from the proportion correct, shown in Figure 9.

*N*_{1}/*N*_{k}, where *N*_{k} is the estimated noise for set *k*. The inverse of efficiency is plotted as black points in Figure 12. The normalized values range from 1 (for the Sloan letters, by definition) to between 1.4 and 2.7 for the Chinese characters.
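Because the noise estimates are usually reported in decibels, the efficiency ratio reduces to a difference of dB values. A minimal sketch, in which the function name and the example noise values are hypothetical:

```python
def relative_efficiency(n1_dbb, nk_dbb):
    """Efficiency N_1 / N_k computed from two noise estimates given in dBB."""
    return 10.0 ** ((n1_dbb - nk_dbb) / 10.0)

# Hypothetical example: set k needs 3 dB more noise than the Sloan set.
efficiency = relative_efficiency(-3.3, -0.3)
print(efficiency, 1.0 / efficiency)
```

On this scale, the reported inverse efficiencies of 1.4 to 2.7 correspond to noise estimates roughly 1.5 to 4.3 dB above the Sloan value.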

*r* = 0.956). However, this unfiltered complexity does not match the measured efficiencies, which are nearly constant for the more complicated Chinese characters. But this flattening at higher set numbers is somewhat mirrored by the filtered complexities (Figure 12, red line).

*N* that maximized the correlation between model and data. This corresponds to a model with seven parameters, one for each optotype set. The average correlation for this optimized model is plotted against number of parameters in Figure 14 as the single black point. Liu et al. (2009) considered several variants of their model, differing in number of parameters. We plot their average correlations against number of parameters in Figure 14. It is evident that our model lies above the red curve, and thus fits better than a geometric moment model with a comparable number of parameters.
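The comparison statistic, a correlation restricted to the off-diagonal (error) cells of two confusion matrices, can be sketched as follows. A Pearson correlation is assumed, and the small matrices are made-up examples rather than data from the study.

```python
import numpy as np

def off_diagonal_correlation(human, model):
    """Pearson correlation between the off-diagonal cells of two confusion matrices."""
    human = np.asarray(human, dtype=float)
    model = np.asarray(model, dtype=float)
    mask = ~np.eye(human.shape[0], dtype=bool)   # True everywhere except the diagonal
    return float(np.corrcoef(human[mask], model[mask])[0, 1])

human = [[40, 6, 4], [3, 42, 5], [8, 2, 40]]     # hypothetical counts
model = [[38, 8, 4], [4, 40, 6], [9, 3, 38]]     # hypothetical counts
print(off_diagonal_correlation(human, model))
```

Excluding the diagonal keeps the statistic sensitive to the pattern of confusions rather than to overall proportion correct.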

*Φ* is the cumulative distribution function of the standard normal density and *E* is the energy of the neural image. We can compute values of *E* using average threshold contrasts for a Gabor target from the ModelFest project (Watson & Ahumada, 2005), combined with the neural and optical transfer functions employed above. Using the ModelFest value of *P*(*c*) = 0.84, we can estimate the corresponding value of *N*. We can repeat this exercise for each of the 200 eyes in the IAS and produce a distribution of estimates of *N*. This is shown in Figure 15 for Gabor functions of 4, 8, and 16 cycles/deg, each of constant one-octave bandwidth. These are stimuli 12, 13, and 14 from the ModelFest experiment (Watson & Ahumada, 2005). We also show as a red arrow the mean value estimated for the Sloan letters.

*N* values estimated in these two very different ways and provides further support for the acuity model proposed here.

*Journal of the Optical Society of America. A, Optics and Image Science*, 7 (8), 1374–1381.

*American Journal of Optometry and Physiological Optics*, 57 (6), 378.

*Vision Research*, 36 (22), 3723–3733.

*Spatial Vision*, 3 (3), 199–224.

*Contact Lens and Anterior Eye*, 28 (2), 75–92.

*Journal of Vision*, 4(4):7, 310–321, http://www.journalofvision.org/content/4/4/7, doi:10.1167/4.4.7.

*Journal of the Optical Society of America. A, Optics and Image Science*, 25 (8), 2078–2087.

*Journal of Vision*, 9(7):12, 1–16, http://www.journalofvision.org/content/9/7/12, doi:10.1167/9.7.12.

*Optometry and Vision Science*, 80 (9), 650–654.

*Ophthalmology*, 103 (1), 181.

*Journal of Experimental Psychology*, 10 (5), 655–666.

*Perception and Psychophysics*, 14 (3), 471–482.

*A Basic Program on Reading* (pp. 1–20). Cooperative Research Program No. 639, Office of Education.

*Optometry and Vision Science*, 71 (1), 6–13.

*Proceedings, Meeting of American Academy of Optometry, 95521*.

*Journal of Vision*, 9(1):26, 1–18, http://journalofvision.org/9/1/26/, doi:10.1167/9.1.26.

*Journal of Experimental Psychology: Human Perception and Performance*, 16 (1), 106.

*Vision Research*, 42 (9), 1165–1184.

*Vision Research*, 39 (26), 4309–4323.

*Vision Research*, 39 (2), 367–372.

*Ophthalmologica*, 222 (3), 173–177.

*Journal of the Optical Society of America. A, Optics and Image Science*, 10 (2), 201–212.

*Journal of the Optical Society of America. A, Optics and Image Science*, 20 (7), 1371–1381.

*Perception*, 40 (ECVP Abstract Supplement), 15.

*Journal of the Optical Society of America. A, Optics and Image Science*, 2 (9), 1508–1532.

*Vision Research*, 46 (28), 4646–4674.

*Journal of Optometry*, 1 (2), 65–70.

*Journal of the Optical Society of America. A, Optics and Image Science*, 25 (10), 2395–2407.

*Ophthalmic & Physiological Optics: The Journal of the British College of Ophthalmic Opticians*, 15 (6), 601–603.

*Transactions of the IRE, PGIT-4*, 213–221.

*Journal of the Optical Society of America. A, Optics and Image Science*, 19 (12), 2329–2348.

*Applied Optics*, 31 (19), 3594–3600.

*Mathematica Journal, 14*, http://www.mathematica-journal.com/2012/02/perimetric-complexity-of-binary-digital-images/.

*Journal of Vision*, 5(9):6, 717–740, http://journalofvision.org/5/9/6/, doi:10.1167/5.9.6.

*Journal of Vision*, 8(4):17, 1–19, http://journalofvision.org/8/4/17/, doi:10.1167/8.4.17.

*Nature*, 302 (5907), 419–422.

*Society for Information Display Digest of Technical Papers*, 20, 360–363.

*Perception & Psychophysics*, 33 (2), 113–120.

*SPIE Proceedings*, 3016, 2–12.

*Journal of Vision*, 12(10):12, 1–16, http://journalofvision.org/12/10/12/, doi:10.1167/12.10.12.

*Investigative Ophthalmology & Visual Science*, 49 (10), 4321–4327, http://www.iovs.org/cgi/content/abstract/49/10/4321.

*Investigative Ophthalmology & Visual Science*, 35 (3), 1132–1137, http://www.iovs.org/cgi/content/abstract/35/3/1132.

*Perception and Psychophysics*, 36 (3), 225–233.

*Investigative Ophthalmology & Visual Science*, 48 (5), 2383–2390, http://www.iovs.org/cgi/content/abstract/48/5/2383.

^{1}Pelli et al. (2006) used a set of 26 Chinese characters, but only 10 Sloan letters. It is unknown how the set size might affect efficiency.

*K*. Let *k* be the index of the optotype presented. The resulting neural image *s*_{k} is corrupted by the addition of a noise image *n* with standard deviation *σ*. The ideal observer considers which of the template neural images *s*_{j} is closest to the signal plus noise image *s*_{k} + *n*; that is, it seeks the index *j* for which the following is minimized:

$$\lVert s_k + n - s_j \rVert^2 .$$

Expanding this expression for the distance, we have

$$\lVert s_k + n - s_j \rVert^2 = s_j \odot s_j - 2\, s_j \odot (s_k + n) + (s_k + n) \odot (s_k + n),$$

where ⊙ indicates the sum of the pixel-by-pixel product of the two images (the dot product of the two images regarded as vectors). Note that the last term is just an additive constant that does not depend on the index *j*. Thus minimizing the distance is equivalent to maximizing the quantity

$$g_j = s_j \odot s_k + s_j \odot n - \tfrac{1}{2}\, s_j \odot s_j .$$

We call the quantity *g*_{j} the discriminant. It is convenient to define **C** as the *K* × *K* matrix of cross-correlations among the *K* neural images,

$$C_{jk} = s_j \odot s_k .$$

Then we can rewrite the discriminant as

$$g_j = C_{jk} - \tfrac{1}{2}\, C_{jj} + s_j \odot n .$$

Note that the noise term *s*_{j} ⊙ *n* is a vector of length *K*. When conducting Monte Carlo simulations, rather than constructing on each trial a new random image *n* with possibly millions of pixels, it is sufficient to directly construct a vector of length *K*, provided that its elements have the correct correlation. A method for constructing such a vector from the matrix **C** and the noise standard deviation *σ* is described in Watson and Ahumada (2008). We write that vector **m**(*σ*, **C**). Finally, the discriminant can be written

$$\mathbf{g} = \mathbf{C}_k - \tfrac{1}{2}\, \mathbf{e} + \mathbf{m}(\sigma, \mathbf{C}),$$

where **C**_{k} is the *k*th column of **C** and **e** is the vector formed by its diagonal. The observer locates the largest entry of **g**, and returns its index *j* as the index of the optotype identified. This algorithm corresponds to the behavior of an ideal observer of a signal known exactly.
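The equivalence between minimizing the distance and maximizing the discriminant can be checked numerically. The sketch below uses random vectors as hypothetical templates and an explicit noise image, so it does not rely on any particular construction of the reduced noise vector.

```python
import numpy as np

rng = np.random.default_rng(2)
K, n_pixels, sigma, k = 10, 400, 2.0, 3
s = rng.normal(size=(K, n_pixels))                 # hypothetical template images
n = rng.normal(scale=sigma, size=n_pixels)         # one noise image

# Squared distance from the noisy signal to every template.
distances = np.sum((s[k] + n - s) ** 2, axis=1)

# Discriminant built from the cross-correlation matrix and the noise products.
C = s @ s.T
g = C[:, k] - np.diag(C) / 2.0 + s @ n

# distance_j + 2 * g_j equals the constant (s_k + n) . (s_k + n) for every j.
constant = np.dot(s[k] + n, s[k] + n)
print(np.argmin(distances), np.argmax(g))
```

Because the two quantities differ only by a constant and a factor of −2, the template with the smallest distance is always the one with the largest discriminant.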

^{2}. The display was viewed binocularly in a “dimly lit room.” Letter size was varied by changing the viewing distance, which ranged from 4.1 to 9.6 m. Display pixels were square with a width of 0.189 mm, yielding resolutions from 378.6 to 886.5 pixels/deg. The display subtended 2,048 by 1,536 pixels, so adapting field areas varied from 21.9 to 4 deg^{2}. Using these values in our unified formula gives predicted pupil diameters of between 6.5 and 5.6 mm (Watson & Yellott, 2012). Accordingly, we used a pupil diameter of 6 mm in our simulations.
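The resolution and field-area figures follow from simple viewing geometry, as the sketch below shows. It assumes that one degree is measured symmetrically about the line of sight; the function names are illustrative. The pupil formula itself is not reproduced.

```python
import math

def pixels_per_degree(distance_mm, pixel_mm):
    """Number of display pixels spanned by one degree of visual angle."""
    return 2.0 * distance_mm * math.tan(math.radians(0.5)) / pixel_mm

def field_area_deg2(width_px, height_px, ppd):
    """Area of the display in square degrees at a given resolution."""
    return (width_px / ppd) * (height_px / ppd)

for distance_m in (4.1, 9.6):
    ppd = pixels_per_degree(distance_m * 1000.0, 0.189)
    area = field_area_deg2(2048, 1536, ppd)
    print(round(ppd, 1), round(area, 1))
```

With the distances and pixel width given above, this yields about 378.6 and 886.5 pixels/deg and areas of about 21.9 and 4.0 deg^{2}.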

*D* is defocus in diopters, *m* is the wavelength in micrometers, and *p*, *q*, and *c* are parameters (*p* = 1.68524, *q* = 0.63346, *c* = 0.21410). This describes absolute defocus for an eye in focus at ∼589 nm. For an eye in focus at *m*_{0}, the defocus at other wavelengths *m* will be

$$D(m, m_0) = q \left( \frac{1}{m_0 - c} - \frac{1}{m - c} \right).$$

This can be converted to a Zernike defocus coefficient in micrometers by the formula where *d* is the pupil diameter in mm. This function is illustrated in Figure A1 for the case of focus at 555 nm and a 6 mm pupil.
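A sketch of the chromatic defocus computation follows. It assumes the absolute defocus has the form *p* − *q*/(*m* − *c*), which is consistent with the stated parameters and with near-zero defocus at 589 nm, and it uses the common conversion from diopters to a Zernike defocus coefficient, −*D d*^{2}/(16√3). The sign convention in particular is an assumption and may differ from the one used in the paper.

```python
import math

P, Q, C = 1.68524, 0.63346, 0.21410   # parameters quoted in the text

def absolute_defocus(m):
    """Defocus in diopters at wavelength m (micrometers); assumed functional form."""
    return P - Q / (m - C)

def relative_defocus(m, m0):
    """Defocus at wavelength m for an eye in focus at wavelength m0."""
    return absolute_defocus(m) - absolute_defocus(m0)

def zernike_defocus_um(defocus_d, pupil_mm):
    """Zernike defocus coefficient in micrometers (assumed sign convention)."""
    return -defocus_d * pupil_mm ** 2 / (16.0 * math.sqrt(3.0))

for nm in (450, 555, 650):
    d = relative_defocus(nm / 1000.0, 0.555)
    print(nm, round(d, 3), round(zernike_defocus_um(d, 6.0), 3))
```

The loop tabulates the case illustrated in the figure: focus at 555 nm with a 6 mm pupil.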

*f*_{0} = 33.3573, *f*_{1} = 5.37916, *g* = 3.32619, *loss* = 0.923853. We then multiplied this function by the OEF with standard parameters (Watson & Ahumada, 2005). This function was then normalized to a peak value of 1. The final result was then used to compute the NTF component of the NOTF. This function is illustrated in Figure A2.

**C** (Equation 2) for each of the 32 letter sizes used by Zhang et al. (2007). We simulate *K* trials, one per optotype, at the middle size. Based on the model performance, Quest estimates the location of the acuity threshold, expressed as a likelihood function of the 32 letter sizes. The mode of the likelihood function is selected as the new size, and another *K* trials are presented. We again estimate a new location for threshold, and this process continues until *T* trials are completed for each letter. The resulting data are then fit by a normal distribution function of log size, using a maximum likelihood method (Watson, 1979), to yield an estimate of the letter size yielding the target probability correct. The Quest method provides a highly efficient way of estimating acuity from the model.
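The block-wise adaptive loop can be sketched in simplified form. This is not the Quest implementation of Watson and Pelli (1983): the psychometric function (a cumulative normal in log size rising from chance to 1), its spread, the flat prior, and the simulated observer are all assumptions made for illustration, and the final maximum likelihood fit is omitted.

```python
import math
import random

def p_correct(log_size, log_threshold, spread=0.1, chance=0.1):
    """Assumed psychometric function: cumulative normal in log size above chance."""
    z = (log_size - log_threshold) / spread
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return chance + (1.0 - chance) * phi

def quest_like(respond, log_sizes, K=10, T=64):
    """Run T blocks of K trials; after each block move to the most likely threshold."""
    log_lik = [0.0] * len(log_sizes)            # flat prior over candidate thresholds
    current = len(log_sizes) // 2               # start at the middle size
    for _ in range(T):
        for k in range(K):                      # one trial per optotype at this size
            correct = respond(k, log_sizes[current])
            for i, t in enumerate(log_sizes):
                p = p_correct(log_sizes[current], t)
                p = min(max(p, 1e-9), 1.0 - 1e-9)
                log_lik[i] += math.log(p if correct else 1.0 - p)
        current = max(range(len(log_sizes)), key=log_lik.__getitem__)
    return log_sizes[current]

rng = random.Random(0)
true_threshold = 0.05                           # hypothetical log acuity size

def simulated_observer(k, log_size):
    """Stand-in for the template model: responds according to p_correct."""
    return rng.random() < p_correct(log_size, true_threshold)

log_sizes = [-0.3 + 0.03 * i for i in range(32)]
estimate = quest_like(simulated_observer, log_sizes)
print(estimate)
```

Replacing `simulated_observer` with a call to the template model at the requested size would turn this loop into an acuity estimator for a given noise level.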

*T*, the target probability, and the Quest jitter. The last value is the width of a uniform distribution from which a number is drawn that is added to the selected test location at each step, in order to spread out the trials over more of the psychometric function. In actual use, this parameter was set to zero.

*σ* to describe the standard deviation of the independent normally distributed noise samples added to each pixel in the simulated neural image. From the point of view of simulation, this is a convenient parameter. However, because this noise is in the domain of the neural image, the value estimated for a given set of data will depend on (a) the spatial resolution of the simulation, and (b) the normalization of the neural transfer function.

*dx* and *dy* are the width and height of a single pixel. As an example, the estimate of *σ* for Sloan letters derived from confusion data is 0.257908. In the simulation, *dx* = *dy* = 0.00264993 deg (0.158996 arcmin). Thus,