**The contrast sensitivity function (CSF) has shown promise as a functional vision endpoint for monitoring the changes in functional vision that accompany eye disease or its treatment. However, detecting CSF changes with precision and efficiency at both the individual and group levels is very challenging. By exploiting the Bayesian foundation of the quick CSF method (Lesmes, Lu, Baek, & Albright, 2010), we developed and evaluated metrics for detecting CSF changes at both the individual and group levels. A 10-letter identification task was used to assess the systematic changes in the CSF measured in three luminance conditions in 112 naïve normal observers. The data from the large sample allowed us to estimate the test–retest reliability of the quick CSF procedure and evaluate its performance in detecting CSF changes at both the individual and group levels. The test–retest reliability reached 0.974 with 50 trials. In 50 trials, the quick CSF method can detect a medium 0.30 log unit area under log CSF change with 94.0% accuracy at the individual observer level. At the group level, a power analysis based on the empirical distribution of CSF changes from the large sample showed that a very small area under log CSF change (0.025 log unit) could be detected by the quick CSF method with 112 observers and 50 trials. These results make it plausible to apply the method to monitor the progression of visual diseases or treatment effects on individual patients and greatly reduce the time, sample size, and costs in clinical trials at the group level.**

^{2}) to create similar AULCSF changes in this study (0.14, 0.29, and 0.43 log unit; see Results). Similar settings were also used in other studies (Dorr et al., 2013; Kooijman, Stellingwerf, van Schoot, Cornelissen, & van der Wildt, 1994).

^{2}. The resolution was 1920 × 1080 pixels, and the vertical refresh rate was 60 Hz. A bit-stealing algorithm was used to achieve 9-bit grayscale resolution (Tyler, 1997). Participants viewed the display binocularly from a distance of 4 m in a dark room.

^{2}, respectively. Relative to the L condition, the luminance changed 7.8-fold and 36.4-fold in the M and H conditions, respectively. The magnitudes of luminance change were comparable with those used in previous studies (Dorr et al., 2013; Kooijman et al., 1994).

$f$ denotes the radial spatial frequency, $f_0 = 3$ cycles per object is the center frequency of the filter, and $f_{\text{cutoff}} = 2f_0$ was chosen such that the full bandwidth at half height is one octave. The pixel intensity of each filtered image was normalized by the maximum absolute intensity of the image. After normalization, the maximum absolute Michelson contrast of the image is 1.0. Stimuli with different contrasts were obtained by scaling the intensities of the normalized images with corresponding contrast values. Stimuli with different spatial frequencies were generated by resizing (Figure 1b). There were 128 possible contrasts (evenly distributed in log space from 0.002 to 1) and 19 possible spatial frequencies (evenly distributed in log space from 1.19 to 30.95 cycles per degree [cpd]). The narrow-band filtered letters provide an assessment of contrast sensitivity at different center spatial frequencies that is equivalent to that obtained with sinewave gratings (Alexander et al., 1994; McAnany & Alexander, 2006).
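As an illustration of the stimulus space described above, the following sketch builds the two log-spaced axes. The numeric ranges and counts come from the text; the variable names are hypothetical, and NumPy's `geomspace` is used here only as one convenient way to sample evenly in log units.

```python
import numpy as np

# 128 contrasts from 0.002 to 1 and 19 spatial frequencies from 1.19 to 30.95 cpd,
# both evenly spaced on a logarithmic axis (values taken from the text).
contrasts = np.geomspace(0.002, 1.0, num=128)
frequencies = np.geomspace(1.19, 30.95, num=19)

# "Evenly distributed in log space" means a constant ratio between neighbours,
# i.e. a constant step after taking the logarithm.
contrast_step_log10 = np.diff(np.log10(contrasts))
frequency_step_octaves = np.diff(np.log2(frequencies))
```

Each candidate stimulus is then one (contrast, frequency) pair from the 128 × 19 grid.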

**Figure 1**


$g_{\max}$, peak spatial frequency $f_{\max}$, bandwidth at half height $\beta$ (in octaves), and low-frequency truncation level $\delta$ (Lesmes et al., 2010; Watson & Ahumada, 2005). In the rest of the article, we use *CSF parameters* interchangeably with *truncated log parabola parameters* unless otherwise stated. Unlike many conventional methods that select stimuli adaptively only in the contrast space, the quick CSF method selects optimal stimuli in the two-dimensional contrast and spatial frequency space (Figure A1) by maximizing the information gain about the to-be-measured CSF in each trial. Using a Bayesian adaptive algorithm to select the optimal test stimulus prior to each trial and update the posterior probabilities of CSF curve parameters following the observer's response, the quick CSF method directly estimates the entire CSF curve instead of contrast thresholds or contrast sensitivities at discrete spatial frequencies (see Appendix A for more details).

*H1* and *H2* in the rest of the article. In each test block, the quick CSF procedure with a 10-alternative forced-choice letter identification task was used to measure the CSF in 50 trials. Each observer finished one experimental session, which included six distinct quick CSF runs, in approximately 70 min.

*incorrect*. No feedback was provided. All three responses were used to update the posterior distribution of the CSF parameters (see Appendix A). A new trial started 500 ms after the responses.

$p_t(\theta)$, and used to construct 1,000 CSF curves. Each CSF curve was represented by contrast sensitivities sampled at 19 spatial frequencies ranging from 1.19 to 30.95 cpd, evenly distributed in log space. We then obtained the empirical distributions of the CSF from these 1,000 CSF curves.

$p_t(\theta)$ in the same way. The AULCSF is a summary measure of spatial vision (Applegate, Howland, Sharp, Cottingham, & Yee, 1998; Lesmes et al., 2010; Oshika, Okamoto, Samejima, Tokunaga, & Miyata, 2006) and was calculated as the area under the log CSF curve (and above zero) in the spatial frequency range of 1.5 to 18 cpd (American National Standards Institute, 2001; Montes-Mico & Charman, 2001; Pesudovs et al., 2004).
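A minimal sketch of the AULCSF computation is given below. It assumes that the area is integrated over log10 spatial frequency using the trapezoid rule on an interpolated curve; the text does not specify the integration variable or the numerical scheme, so both are assumptions, and the function name `aulcsf` is hypothetical.

```python
import numpy as np

def aulcsf(freqs_cpd, log10_sensitivity, f_lo=1.5, f_hi=18.0, n_points=200):
    """Area under the log CSF curve, above zero, between f_lo and f_hi (cpd)."""
    # Integration runs over log10 frequency (an assumption of this sketch).
    x = np.linspace(np.log10(f_lo), np.log10(f_hi), n_points)
    y = np.interp(x, np.log10(freqs_cpd), log10_sensitivity)
    y = np.clip(y, 0.0, None)  # only the area above zero is counted
    # Trapezoid rule written out explicitly.
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

# Illustration: a flat log sensitivity of 1.0 gives an area of log10(18 / 1.5).
demo_freqs = np.geomspace(1.19, 30.95, 19)
demo_area = aulcsf(demo_freqs, np.ones(19))
```

Applying the same function to each of the resampled CSF curves would give an empirical distribution of the AULCSF.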

$P^{-1}(\cdot)$ is the empirical inverse cumulative distribution function of the posterior for the CSF or AULCSF. Because the CSF was sampled at 19 spatial frequencies, the average HWCI of the CSF across the 19 spatial frequencies was reported for each individual.
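The half width of a credible interval can be read off posterior samples with the empirical inverse cumulative distribution function. The sketch below is illustrative: the credible level is left as an explicit argument because it is not stated in this passage, and the function name is hypothetical.

```python
import numpy as np

def half_width_ci(samples, level):
    """Half width of the central credible interval at the given level (0-1)."""
    lower, upper = np.quantile(samples, [(1 - level) / 2, (1 + level) / 2])
    return float(upper - lower) / 2.0

# Illustration with evenly spread samples on [0, 1]; level=0.682 is only an
# example value (for a normal posterior it would correspond to about +/- 1 SD).
demo_hwci = half_width_ci(np.linspace(0.0, 1.0, 1001), level=0.682)
```

For the CSF, the same function would be applied at each of the 19 spatial frequencies and the results averaged.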

**Figure 2**


*F*(2, 222) = 774.6, *p* < 0.001. The AULCSFs were significantly different in the L, M, and H conditions, *F*(2, 222) = 1517.2, *p* < 0.001, but the CSFs and AULCSFs in the H1 and H2 conditions overlapped and were not significantly different: *F*(1, 111) = 0.670, *p* = 0.415; *F*(1, 111) = 0.646, *p* = 0.423. These results show that the quick CSF method captured the CSF differences induced by the luminance manipulation and suggest that the method has high test–retest reliability.

**Figure 3**


**Figure 4**


$\bar{D}$ (see Equation B1 in Appendix B for definition) between the two repeated measures of the CSF in the H1 and H2 conditions for all 112 observers. The mean of the $\bar{D}$ distribution was 0.07 ± 0.03 log unit after 50 trials. The distribution of $\bar{D}$ (Figure B2) and its relationship with trial number and HWCI (Figure B3) are presented in Appendix B. Generally, HWCI and $\bar{D}$ are quite comparable.

**Figure 5**


$a$ and $\Delta a$ represent AULCSF and AULCSF difference, respectively; $p_{\text{difference}}(\cdot)$ is the distribution of the AULCSF difference; and $p_1(\cdot)$ and $p_2(\cdot)$ are the posterior distributions of AULCSF in the two conditions. This posterior inference should be regarded as conservative (i.e., with greater variance) because it does not reflect the possible a priori correlation of the two AULCSFs. The posterior distributions of the AULCSF difference for H1–H2, H–L, M–L, and H–M after 50 trials of observers S14, S26, S86, and S107 are shown in Figure 6.
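One way to approximate the distribution of the AULCSF difference is by Monte Carlo: draw independently from the two AULCSF posteriors and subtract. This is a sketch under the independence assumption discussed above, not the authors' implementation; the function names, the sign convention (condition 2 minus condition 1), and the demonstration values are all hypothetical.

```python
import numpy as np

def aulcsf_difference_samples(samples_1, samples_2, n_draws=10_000, seed=0):
    """Samples of the AULCSF difference from two independent posteriors."""
    rng = np.random.default_rng(seed)
    a1 = rng.choice(samples_1, size=n_draws, replace=True)
    a2 = rng.choice(samples_2, size=n_draws, replace=True)
    return a2 - a1  # independent draws, so no correlation is modelled

def change_detected(diff_samples, level=0.95):
    """Report a change when the central credible interval excludes zero."""
    lower, upper = np.quantile(diff_samples, [(1 - level) / 2, (1 + level) / 2])
    return bool(lower > 0 or upper < 0)

# Hypothetical posterior samples for two conditions (illustration only).
_rng = np.random.default_rng(42)
demo_low = _rng.normal(1.0, 0.05, size=1000)
demo_high = _rng.normal(1.3, 0.05, size=1000)
demo_diff = aulcsf_difference_samples(demo_low, demo_high)
```

Because the two sets of draws are independent, the variance of the difference is the sum of the two posterior variances, which is why this inference is on the conservative side when the two measurements are positively correlated.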

**Figure 6**


*Sensitivity* is defined as the probability of reporting a change when there is a real condition change. *Specificity* is defined as the probability of declaring no change when there is no change (i.e., H1–H2). By definition, the specificity corresponding to the 95% credible interval criterion is 95%.
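The two definitions above can be written as a short computation over per-observer decisions. In this sketch, each flag is `True` when a change was reported for that observer; the function name and the example decisions are hypothetical.

```python
def sensitivity_specificity(flags_real_change, flags_no_change):
    """Sensitivity and specificity from boolean 'change reported' flags.

    flags_real_change: decisions for comparisons with a real change (e.g., H-L).
    flags_no_change:   decisions for comparisons without a change (e.g., H1-H2).
    """
    sensitivity = sum(flags_real_change) / len(flags_real_change)
    specificity = 1 - sum(flags_no_change) / len(flags_no_change)
    return sensitivity, specificity

# Hypothetical decisions: 3 of 4 real changes reported, 1 false alarm in 20.
demo_sens, demo_spec = sensitivity_specificity(
    [True, True, True, False], [False] * 19 + [True]
)
```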

**Figure 7**


**Figure 8**


*N* (*N* = 2, 3, ..., 112), we randomly selected *N* observers from the total sample of 112 observers with replacement and performed power analyses for a paired *t* test based on the observed standard deviation of the AULCSF of the subset of observers. The average effect sizes of AULCSF change over the 112 observers (0.43, 0.29, and 0.14 for H–L, M–L, and H–M, respectively) were taken as the true effect sizes. The statistical power for detecting the AULCSF change for H–L, M–L, and H–M with $\alpha$ = 0.05 was calculated. The procedure was repeated 1,000 times, and the average power was taken as the estimated power of the quick CSF method in detecting the respective group mean changes. Figure 9 shows the estimated power as a function of the number of observers and the number of quick CSF trials as heat maps for different changes.
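The resampling procedure can be sketched as follows. Two simplifications are assumed and should be read as such: power is computed with a normal approximation rather than the noncentral *t* distribution that an exact paired *t*-test power analysis would use, and the standard deviation is taken from the per-observer AULCSF differences. Function names and the demonstration data are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def paired_power(effect, sd_diff, n, alpha=0.05):
    """Two-sided power of a paired test under a normal approximation."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = float(abs(effect) / (sd_diff / np.sqrt(n)))
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def bootstrap_power(differences, true_effect, n, n_boot=1000, seed=0):
    """Average power over subsets of n observers drawn with replacement."""
    rng = np.random.default_rng(seed)
    powers = []
    for _ in range(n_boot):
        subset = rng.choice(differences, size=n, replace=True)
        sd = subset.std(ddof=1)
        if sd > 0:  # skip degenerate resamples (one observer drawn n times)
            powers.append(paired_power(true_effect, sd, n))
    return float(np.mean(powers))

# Hypothetical per-observer AULCSF differences, for illustration only.
demo_differences = np.random.default_rng(1).normal(0.05, 0.07, size=112)
power_small_group = bootstrap_power(demo_differences, true_effect=0.05, n=5)
power_large_group = bootstrap_power(demo_differences, true_effect=0.05, n=30)
```

Repeating this for every *N* and every trial count would yield a power surface of the kind summarized in a heat map.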

**Figure 9**


*p* > 0.05). We used the Brown–Forsythe test for equality of variance on the distributions of AULCSF differences in the M–L, H–M, and H1–H2 comparisons. No difference was found in the standard deviation of the AULCSF differences among H1–H2, M–L, and H–M (all *p* > 0.05 except for trial 1; Figure 11a). This suggests that the standard deviation of the AULCSF difference was independent of the size of the AULCSF difference observed between conditions. We calculated the average standard deviation across the three comparisons and plotted it against trial number in Figure 11b. This value was used in the following analyses.

**Figure 10**


**Figure 11**


$\alpha$ = 0.05 and power = 0.95 was computed for a given number of observers, *N* (*N* = 2, 3, ..., 112), based on the estimated standard deviation in Figure 11b. In Figure 11c, the effect size is plotted as a function of the number of observers and trials. Generally, with more observers and trials, the effect size that is required for the quick CSF method to detect an AULCSF change shrinks. To detect 0.2, 0.1, and 0.05 log unit of AULCSF changes with 20 quick CSF trials, we would need 8, 25, and 93 observers, respectively. To detect the same amounts of difference in 50 trials, only 4, 9, and 28 observers would be needed. With 20 observers, we would need 7, 23, and more than 50 trials to detect 0.2, 0.1, and 0.05 log unit of AULCSF changes, respectively. With 112 observers, we needed only 1, 6, and 15 trials to detect the same amounts of difference, respectively. In fact, we could detect a difference of less than 0.025 log unit (6% difference) with 112 observers and 50 trials.
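The relationship between the number of observers and the smallest detectable change can be illustrated with the standard normal-approximation formula for a paired design, in which the detectable effect is $(z_{1-\alpha/2} + z_{\text{power}}) \cdot SD / \sqrt{N}$. This is an assumption of the sketch: a *t*-based calculation would require somewhat larger effects at small *N*, and the standard deviation passed in is a placeholder rather than a value from Figure 11b.

```python
from math import sqrt
from statistics import NormalDist

def detectable_effect(sd_diff, n, alpha=0.05, power=0.95):
    """Smallest mean change detectable with n observers (normal approximation)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sd_diff / sqrt(n)

# With a unit standard deviation and one observer, the multiplier is
# z(0.975) + z(0.95), roughly 1.96 + 1.64.
demo_multiplier = detectable_effect(sd_diff=1.0, n=1)
```

Under this approximation the detectable effect scales with $1/\sqrt{N}$, so halving the effect size to be detected requires about four times as many observers.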

**Figure 12**


*Journal of the Optical Society of America A: Optics, Image Science, and Vision*, 11, 2375–2382.

*American National Standard for Ophthalmics—Multifocal intraocular lenses*. Fairfax, VA: Optical Laboratories Association.

*Journal of Refractive Surgery*, 14, 397–407.

*ARVO Meeting Abstracts*, 56, 3901-D0043.

*Eye (London)*, 18, 809–813.

*Graefe's Archive for Clinical and Experimental Ophthalmology*, 241, 968–974.

*Ophthalmic and Physiological Optics*, 11, 218–226.

*Optometry & Vision Science*, 83, 290–298.

*Vision Research*, 42, 2137–2152.

*Statistical models in epidemiology*. Oxford, UK: Oxford University Press.

*Acta Ophthalmologica (Copenhagen)*, 57, 679–690.

*Proceedings of the National Academy of Sciences, USA*, 110, 4368–4373.

*Clinical applications of visual psychophysics*(pp. 70–106). Cambridge, UK: Cambridge University Press.

*International Ophthalmology Clinics*, 43 (2), 5–15.

*Spatial Vision*, 11, 121–128.

*Journal of General Physiology*, 20, 831–850.

*Clinical applications of visual psychophysics*(pp. 11–41). Cambridge, UK: Cambridge University Press.

*Vision Research*, 17, 1049–1055.

*Graefe's Archive for Clinical and Experimental Ophthalmology*, 245, 1805–1814.

*Vision Research*, 47, 22–34.

*ARVO Meeting Abstracts*, 55, 762.

*Journal of Cataract & Refractive Surgery*, 15, 141–148.

*Proceedings of the National Academy of Sciences, USA*, 111, 2035–2039.

*Perception & Psychophysics*, 14, 313–318.

*Archives of Ophthalmology*, 118, 1187–1194.

*Vision Research*, 37, 1595–1604.

*Perception*, 36, 14.

*Archives of Ophthalmology*, 106, 55–57.

*Low vision*(pp. 101–110). Lansdale, PA: IOS Press.

*Emotion*, 14, 978–984.

*Investigative Ophthalmology & Visual Science*, 54, 2762. [Abstract]

*ARVO Meeting Abstracts*, 53, 4358.

*Optometry & Vision Science*, 91, 956–965.

*Statistical theories of mental test scores*. Reading, MA: Addison-Wesley.

*Annals of Ophthalmology*, 13, 1069–1071.

*British Journal of Ophthalmology*, 70, 553–559.

*Ophthalmology*, 95, 139–143.

*Evidence-based laboratory medicine: Principles, practice and outcomes*(2nd ed., pp. 53–66). Washington, DC: AACC Press.

*Vision Research*, 46, 1574–1584.

*Journal of Refractive Surgery*, 17, 646–651.

*Documenta Ophthalmologica*, 103, 175–186.

*PLoS ONE*, 7 (3), e34441.

*International Ophthalmology*, 28, 407–412.

*Ophthalmology*, 113, 1807–1812.

*Ophthalmology Clinics of North America*, 16, 171–177.

*Vision Research*, 23, 689–699.

*British Journal of Ophthalmology*, 88, 11–16.

*ARVO Meeting Abstracts*, 56, 2225-B0078.

*Spatial vision*(pp. 239–249). Boca Raton, FL: CRC Press.

*Journal of Vision*, 14 (10): 1428, doi:10.1167/14.10.1428. [Abstract]

*Archives of Ophthalmology*, 128, 1576–1582.

*ARVO Meeting Abstracts*, 56, 2224-B0077.

*Journal of Ophthalmic & Vision Research*, 5, 175–181.

*Vision Research*, 28, 1235–1246.

*Journal of the Optical Society of America A*, 5, 2181–2190.

*Evaluation of diagnostic systems: Methods from signal detection theory*. New York, NY: Academic Press.

*Spatial Vision*, 10, 369–377.

*Journal of Cataract & Refractive Surgery*, 35, 47–56.

*Eye*, 21, 218–223.

**1. Define a CSF functional form.**

$\tau(f)$ is the reciprocal of contrast sensitivity $S(f)$, which is described by the truncated log parabola with four parameters (Lesmes et al., 2010; Watson & Ahumada, 2005; Figure A1), where $\theta = (g_{\max}, f_{\max}, \beta, \delta)$ represents the four CSF parameters: peak gain ($g_{\max}$), peak spatial frequency ($f_{\max}$), bandwidth at half height ($\beta$, in octaves), and low-frequency truncation level ($\delta$).
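To make the four parameters concrete, the sketch below implements one truncated log parabola that matches the verbal description: a parabola in log-log coordinates that peaks at ($f_{\max}$, $g_{\max}$), falls to half height at $\pm\beta/2$ octaves from the peak, and is floored at $\delta$ log units below the peak on the low-frequency side. The exact parameterization in Lesmes et al. (2010) may differ (for example, in how the bandwidth enters the formula), so this form and the function name are assumptions.

```python
import numpy as np

def log10_csf(f, g_max, f_max, beta, delta):
    """Log10 contrast sensitivity of an illustrative truncated log parabola."""
    f = np.asarray(f, dtype=float)
    # Parabola in log-log space: sensitivity drops by log10(2), i.e. to half
    # height, at +/- beta/2 octaves from the peak frequency.
    octaves_from_peak = np.log2(f / f_max)
    parabola = np.log10(g_max) - np.log10(2) * (octaves_from_peak / (beta / 2)) ** 2
    # Truncation: below the peak frequency, sensitivity never falls more than
    # delta log units under the peak.
    floor = np.log10(g_max) - delta
    return np.where(f < f_max, np.maximum(parabola, floor), parabola)

# Illustration with g_max = 100, f_max = 2 cpd, beta = 3 octaves, delta = 0.5.
demo_peak = float(log10_csf(2.0, 100, 2, 3, 0.5))
demo_half_height = float(log10_csf(2.0 * 2 ** 1.5, 100, 2, 3, 0.5))
demo_truncated = float(log10_csf(0.01, 100, 2, 3, 0.5))
```

The contrast threshold follows as the reciprocal of sensitivity, that is, `10 ** (-log10_csf(...))`.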

**2. Define the stimulus and parameter spaces.** The application of Bayesian adaptive inference requires two basic components: (a) a prior probability distribution, $p(\theta)$, defined over a four-dimensional space of CSF parameters $\theta$, and (b) a two-dimensional space of possible letter stimuli with contrast $c$ and spatial frequency $f$. In our simulation study, the ranges of possible CSF parameters were 2 to 2,000 for peak gain, 0.2 to 20 cpd for peak frequency, 1 to 9 octaves for bandwidth, and 0.02 to 2 for truncation. The ranges for possible grating stimuli were 0.2% to 100% for contrast $c$ and 1.19 to 30.95 cpd for frequency $f$. Both parameter and stimulus spaces were sampled evenly in log units.

**Figure A1**


**3. Priors.** Before the beginning of the experiment, an initial prior, $p_{t=0}(\theta)$, which represents the knowledge about the observer's CSF before any data are collected, was defined by a hyperbolic secant function with the best guess of parameters $\theta_{i,\text{guess}}$ and width $\theta_{i,\text{confidence}}$ for $i$ = 1, 2, 3, and 4 (King-Smith & Rose, 1997; Lesmes et al., 2010), where $\operatorname{sech}(x) = 2/(e^{x} + e^{-x})$; $\theta_i = g_{\max}$, $f_{\max}$, $\beta$, and $\delta$ for $i$ = 1, 2, 3, and 4, respectively; $\theta_{i,\text{guess}}$ = 100, 2, 3, and 0.5 for $i$ = 1, 2, 3, and 4, respectively; and $\theta_{i,\text{confidence}}$ = 2.48, 3.75, 7.8, and 3.12 for $i$ = 1, 2, 3, and 4, respectively.
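A one-dimensional version of such a prior can be sketched as below. The argument of the hyperbolic secant is assumed to be the confidence value times the log10 distance from the best guess, which is a common form for this kind of prior but is not spelled out in this passage; the grid size and function names are hypothetical. The guess (100), confidence (2.48), and range (2 to 2,000) for peak gain are taken from the text.

```python
import numpy as np

def sech(x):
    """Hyperbolic secant, sech(x) = 2 / (e^x + e^-x)."""
    return 2.0 / (np.exp(x) + np.exp(-x))

def sech_prior(grid, guess, confidence):
    """Normalized prior over a log-spaced grid for one CSF parameter."""
    # Assumed form: sech(confidence * (log10(theta) - log10(guess))).
    density = sech(confidence * (np.log10(grid) - np.log10(guess)))
    return density / density.sum()

# Marginal prior for peak gain on a hypothetical 60-point grid.
peak_gain_grid = np.geomspace(2, 2000, 60)
peak_gain_prior = sech_prior(peak_gain_grid, guess=100, confidence=2.48)
```

A joint prior over all four parameters could then be formed as the product of the four marginals on a four-dimensional grid.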

**4. Bayesian adaptive inference.** In trial $t$, the observer makes three responses, $r_{x_i}$ = “correct” or “incorrect,” corresponding to three letters $x_i = (c_i, f)$ with contrast $c_i$ and spatial frequency $f$, where $i$ = 1, 2, and 3 denotes the $i$th letter from left to right. After the observer's three responses are collected in trial $t$, the outcome is incorporated into a Bayesian inference step that updates the knowledge about the CSF parameters held prior to trial $t$, $p_{t-1}(\theta)$:

$$p_t(\theta \mid r_{x_i}) = \frac{p(r_{x_i} \mid \theta)\, p_{t-1}(\theta)}{\sum_{\theta} p(r_{x_i} \mid \theta)\, p_{t-1}(\theta)},$$

where $p_t(\theta \mid r_{x_i})$ is the posterior distribution of the parameter vector $\theta$ after obtaining a response $r_{x_i}$ at trial $t$; $p(r_{x_i} \mid \theta)$ is the percentage correct psychometric function given stimulus $x_i$; and $p_{t-1}(\theta)$ is our prior about $\theta$ before trial $t$, which is also the posterior in trial $t - 1$.
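The update step is an application of Bayes' rule on a discrete parameter grid. The sketch below is illustrative: it takes as given the probability of a correct response under each candidate parameter vector (the psychometric function itself is not reproduced here) and treats the three letter responses as conditionally independent given $\theta$, which is an assumption. All names and numbers are hypothetical.

```python
import numpy as np

def bayes_update(prior, p_correct, correct):
    """Posterior over the parameter grid after one letter response.

    prior:     probabilities over candidate parameter vectors (sums to 1).
    p_correct: P(correct | theta, stimulus) for each candidate.
    correct:   True if the observer identified the letter correctly.
    """
    likelihood = p_correct if correct else 1.0 - p_correct
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Two hypothetical candidates with equal prior probability.
demo_prior = np.array([0.5, 0.5])
demo_p_correct = np.array([0.9, 0.3])
after_correct = bayes_update(demo_prior, demo_p_correct, correct=True)
after_incorrect = bayes_update(after_correct, demo_p_correct, correct=False)
```

Applying `bayes_update` once per letter, with the posterior from one letter serving as the prior for the next, incorporates all three responses of a trial.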

**5. Stimulus search.** To increase the quality of the evidence obtained on each trial, the quick CSF calculates the expected information gain for all possible stimuli $x$, where $h(p) = -p\log(p) - (1 - p)\log(1 - p)$ is the information entropy of the distribution $p$. Before each trial, we identify the candidate stimuli that correspond to the top 10% of the expected information gain over the entire stimulus space. We then randomly pick one of those candidates as $x_t = (c, f)$ for presentation. In this way, the quick CSF avoids large regions of the stimulus space that are not likely to provide useful information about $\theta$ given the current knowledge. To improve observers' experience in CSF testing, two additional letters with higher contrasts are presented alongside the optimal test letter $x_t$. From left to right, their contrasts are $4c$, $2c$, and $c$, respectively. The maximum contrast is capped at 90%. Their spatial frequency is $f$.
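The stimulus search can be sketched as follows. The expected information gain is written in the standard form for a binary outcome, the entropy of the predicted response minus the expected entropy under each parameter vector; this is assumed to correspond to the quantity described above, whose formula is not reproduced in this text. Function names and the demonstration values are hypothetical.

```python
import numpy as np

def binary_entropy(p):
    """h(p) = -p log(p) - (1 - p) log(1 - p), with clipping to avoid log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log(p) - (1 - p) * np.log(1 - p)

def expected_information_gain(prior, p_correct):
    """prior: (n_theta,); p_correct: (n_theta, n_stimuli) -> gain per stimulus."""
    predictive = prior @ p_correct  # P(correct | stimulus), averaged over theta
    return binary_entropy(predictive) - prior @ binary_entropy(p_correct)

def pick_stimulus(gain, rng, top_fraction=0.10):
    """Pick at random among the stimuli in the top fraction of expected gain."""
    cutoff = np.quantile(gain, 1 - top_fraction)
    return int(rng.choice(np.flatnonzero(gain >= cutoff)))

def letter_contrasts(c, cap=0.90):
    """Contrasts of the three letters, left to right: 4c, 2c, c (capped)."""
    return [min(k * c, cap) for k in (4, 2, 1)]

# Hypothetical two-candidate, two-stimulus example.
demo_gain = expected_information_gain(
    np.array([0.5, 0.5]), np.array([[0.9, 0.5], [0.1, 0.5]])
)
demo_choice = pick_stimulus(np.arange(20.0), np.random.default_rng(0))
```

In the example, the first stimulus separates the two candidates and therefore has positive expected gain, whereas the second stimulus is answered identically under both candidates and carries no information. Sampling from the top 10% rather than always taking the maximum keeps the stimulus sequence from being fully deterministic.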

**6. Reiteration and stopping rule**. The quick CSF procedure reiterates steps 4 and 5 until 50 trials have been run.

**7. Analysis**. After step 6, we obtain the posterior distribution of the CSF parameters, $p_t(\theta)$, in numerical form (see Figure A2 for the marginal prior and posterior distributions for the four CSF parameters). A resampling procedure is used that samples directly from the posterior distribution of the CSF parameters and generates the CSF estimates (i.e., CSF and AULCSF) based on all the CSF samples. The procedure automatically takes into account the covariance structure of the CSF parameters in the posterior distribution and allows us to compute the credible interval of the estimates derived from CSF functions.
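Sampling from a numerical posterior amounts to drawing grid points with probability equal to their posterior mass. The sketch below assumes the grid is stored as one row per candidate parameter vector; the layout, names, and values are hypothetical.

```python
import numpy as np

def sample_parameters(theta_grid, posterior, n_samples=1000, seed=0):
    """Draw whole parameter vectors from a discrete posterior.

    theta_grid: array of shape (n_theta, 4), one row per candidate
                (g_max, f_max, beta, delta).
    posterior:  probabilities of shape (n_theta,), summing to 1.
    """
    rng = np.random.default_rng(seed)
    rows = rng.choice(len(theta_grid), size=n_samples, p=posterior)
    return theta_grid[rows]

# Hypothetical three-point grid with all posterior mass on the middle row.
demo_grid = np.array([[50, 1.5, 2.5, 0.4], [100, 2.0, 3.0, 0.5], [200, 3.0, 4.0, 0.6]])
demo_samples = sample_parameters(demo_grid, np.array([0.0, 1.0, 0.0]))
```

Because each draw is a complete parameter vector rather than four independent marginal draws, the covariance among parameters is preserved in any quantity computed from the samples.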

**Figure A2**


**Figure B1**


$\tau_{i1}$ and $\tau_{i2}$ are the estimated CSFs (i.e., posterior means) at the $i$th ($i$ = 1, 2, ..., 19) spatial frequency in the H1 and H2 conditions, respectively. The coefficient in Equation B1 is a correction that makes the distance essentially the same as the standard deviation of the estimated CSF in two repeated measurements.

$\bar{D}$ between the two repeated measures of the CSF in the H1 and H2 conditions after 5, 10, 20, and 50 trials for all 112 observers. The mean of the distribution of $\bar{D}$ was 0.22, 0.16, 0.11, and 0.07 log unit after 5, 10, 20, and 50 trials, respectively. The width of the distributions narrowed as trial number increased: The standard deviation of the mean distance decreased from 0.23 to 0.03 as trial number increased from 1 to 50. It should be noted that the variability of the estimated standard deviation was greater than that of HWCI. This is likely because there were only two repeated measurements of the CSF in the H condition. Running more repeated tests would reduce this variability.

**Figure B2**


**Figure B3**


$SE$ is the standard error of the estimated CSF, $SD_{\text{sample}}$ is the standard deviation of the pooled CSFs of all observers in the H1 and H2 conditions, and $r$ is the average correlation coefficient between the two repeated measures. The standard error is also plotted as a function of trial number in Figure B3. The standard error is virtually identical to the other two variability measures.
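The three quantities defined above are related in classical test theory by the standard error of measurement, $SE = SD_{\text{sample}}\sqrt{1 - r}$. The sketch below assumes that this textbook relation is the one intended here; the function name and input values are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd_sample, r):
    """Classical test theory: SE = SD_sample * sqrt(1 - r)."""
    return sd_sample * sqrt(1.0 - r)

# Illustration: a test-retest correlation of 0.75 halves the sample SD.
demo_se = standard_error_of_measurement(sd_sample=1.0, r=0.75)
```

Under this relation, a higher test–retest correlation at a fixed sample spread implies a smaller standard error for an individual estimate.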