**Abstract**
In their influential paper, Wichmann and Hill (2001) have shown that the threshold and slope estimates of a psychometric function may be severely biased when it is assumed that the lapse rate equals zero but lapses do, in fact, occur. Based on a large number of simulated experiments, Wichmann and Hill claim that threshold and slope estimates are essentially unbiased when one allows the lapse rate to vary within a rectangular prior during the fitting procedure. Here, I replicate Wichmann and Hill's finding that significant bias in parameter estimates results when one assumes that the lapse rate equals zero but lapses do occur, but fail to replicate their finding that freeing the lapse rate eliminates this bias. Instead, I show that significant and systematic bias remains in both threshold and slope estimates even when one frees the lapse rate according to Wichmann and Hill's suggestion. I explain the mechanisms behind the bias and propose an alternative strategy to incorporate the lapse rate into psychometric function models, which does result in essentially unbiased parameter estimates.

*intensity*, though this may not be an appropriate term in many circumstances (e.g., the variable may be spatial or temporal frequency, orientation offset, etc.). A generic formulation of the psychometric function is given by:

$$ \psi(x; \alpha, \beta, \gamma, \lambda) = \gamma + (1 - \gamma - \lambda)\, F(x; \alpha, \beta) \tag{1a} $$

(e.g., Wichmann & Hill, 2001; Kingdom & Prins, 2010). Though discredited, the classic high-threshold detection model (e.g., Swets, 1961) provides an intuitively appealing interpretation of the parameters of Equation 1a. Under the high-threshold model, F(*x*; *α*, *β*) describes the probability of detection by an underlying sensory mechanism as a function of stimulus intensity *x*, *γ* corresponds to the guess rate (the probability of a correct response when the stimulus is not detected by the underlying sensory mechanism), and *λ* corresponds to the lapse rate (the probability of an incorrect response, which is independent of stimulus intensity). Several forms of F(*x*; *α*, *β*) are in common use, such as the Logistic function, the Weibull function, and the cumulative normal distribution. In this paper, the Weibull function is used exclusively and is given by:

$$ F_W(x; \alpha, \beta) = 1 - \exp\left(-\left(x/\alpha\right)^{\beta}\right) \tag{1b} $$

The parameter *α* of F_{W}(*x*; *α*, *β*) determines the function's location and is commonly referred to as the function's ‘threshold.’ The parameter *β* determines the rate of change of performance as a function of stimulus intensity *x* and is commonly referred to as the ‘slope.’
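As a minimal sketch, the psychometric function can be written out in Python, assuming the standard forms ψ = γ + (1 − γ − λ)F and F_W = 1 − exp(−(x/α)^β). The function names are illustrative and are not taken from the code that accompanies the paper. With *α* = 10 and *β* = 3, these forms give a 50% point of about 8.85 and a gradient of about 0.1175 at that point, which agrees with the reported 0.118 to within rounding.

```python
import math

def weibull(x, alpha, beta):
    """Weibull F_W(x; alpha, beta) = 1 - exp(-(x / alpha) ** beta) (assumed form)."""
    return 1.0 - math.exp(-((x / alpha) ** beta))

def psi(x, alpha, beta, gamma, lam):
    """Probability of a correct response: gamma + (1 - gamma - lam) * F (assumed form)."""
    return gamma + (1.0 - gamma - lam) * weibull(x, alpha, beta)

def weibull_inverse(p, alpha, beta):
    """Stimulus intensity at which F_W evaluates to p."""
    return alpha * (-math.log(1.0 - p)) ** (1.0 / beta)

def weibull_gradient(x, alpha, beta):
    """First derivative of F_W with respect to x."""
    return (beta / alpha) * (x / alpha) ** (beta - 1.0) * math.exp(-((x / alpha) ** beta))

# Threshold and slope metrics for the generating function (alpha = 10, beta = 3).
threshold = weibull_inverse(0.5, 10.0, 3.0)
slope = weibull_gradient(threshold, 10.0, 3.0)
print(threshold, slope)
```

Note that at a very high intensity `psi` approaches 1 − λ, so the lapse rate sets the upper asymptote of the function.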

*α* and *β* were equal to 10 and 3, respectively. The guess rate *γ* was 0.5. The generating lapse rate *λ* was systematically varied from 0 to 0.05 in steps of 0.01. The method of constant stimuli (MOCS) utilizing seven different stimulus placement regimens was used. The seven stimulus placement regimens are shown in Figure 1 (s1 through s7) relative to the generating form of F. The total number of simulated trials (N) in each simulated experiment was evenly distributed among the six stimulus intensities in each of the placement regimens. Each simulated dataset was then fitted with the psychometric function in Equation 1 using a maximum-likelihood criterion. The threshold and slope parameters were free to vary during the fitting process. The lapse rate parameter was either held constant at a fixed value or was allowed to vary within the interval [0, 0.06]. This prior^{1} was placed on the lapse rate parameter to reflect beliefs regarding likely values of the lapse rate parameter. Unless the prior is applied, nonsensical negative estimates of the lapse rate might result, as well as unrealistically high estimates of the lapse rate.
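The simulation and fitting procedure described here can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the reported results: the fit is a brute-force grid search rather than an iterative optimizer, the search ranges for threshold and slope are hypothetical, and the six intensities are placed at evenly spaced values of F between 0.3 and 0.7, which is only a guess at an s1-like regimen. The lapse rate is either fixed (a one-element grid) or restricted to the window [0, 0.06].

```python
import math
import random

GAMMA = 0.5  # guess rate used in the simulations

def weibull(x, alpha, beta):
    return 1.0 - math.exp(-((x / alpha) ** beta))

def psi(x, alpha, beta, lam, gamma=GAMMA):
    return gamma + (1.0 - gamma - lam) * weibull(x, alpha, beta)

def simulate_mocs(intensities, n_per_level, alpha, beta, lam, rng):
    """Number of correct responses at each intensity for one simulated experiment."""
    return [sum(rng.random() < psi(x, alpha, beta, lam) for _ in range(n_per_level))
            for x in intensities]

def neg_log_likelihood(alpha, beta, lam, intensities, n_correct, n_per_level):
    total = 0.0
    for x, k in zip(intensities, n_correct):
        # Clip to keep the logarithms finite.
        p = min(max(psi(x, alpha, beta, lam), 1e-12), 1.0 - 1e-12)
        total -= k * math.log(p) + (n_per_level - k) * math.log(1.0 - p)
    return total

def fit_grid(intensities, n_correct, n_per_level, lapse_grid):
    """Maximum-likelihood fit by exhaustive search; returns (alpha, beta, lam)."""
    alphas = [5.0 + 0.1 * i for i in range(101)]  # hypothetical search range
    betas = [1.0 + 0.1 * i for i in range(71)]    # hypothetical search range
    best = None
    for a in alphas:
        for b in betas:
            for lam in lapse_grid:
                nll = neg_log_likelihood(a, b, lam, intensities, n_correct, n_per_level)
                if best is None or nll < best[0]:
                    best = (nll, a, b, lam)
    return best[1:]

# Assumed s1-like placement: six intensities at evenly spaced values of F.
f_levels = [0.30, 0.38, 0.46, 0.54, 0.62, 0.70]
xs = [10.0 * (-math.log(1.0 - p)) ** (1.0 / 3.0) for p in f_levels]

rng = random.Random(0)
counts = simulate_mocs(xs, 160, 10.0, 3.0, 0.03, rng)  # N = 960 trials in total

free = fit_grid(xs, counts, 160, [0.01 * i for i in range(7)])  # lapse in [0, 0.06]
fixed = fit_grid(xs, counts, 160, [0.0])                        # lapse fixed at 0
print(free, fixed)
```

Because the lapse grid stops at 0 and 0.06, estimates cannot leave the window, which mirrors the way the prior rules out negative or unrealistically high lapse rate estimates.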

*α* and *β*, but rather in terms of $F^{-1}_{0.5}$ [that is, the stimulus intensity at which function F_{W} (Equation 1b) evaluates to 0.5] and $F'_{0.5}$ (that is, the gradient or first derivative of F evaluated at $F^{-1}_{0.5}$). The true, generating, values of these quantities are 8.85 and 0.118, respectively. Figure 2 was taken from Wichmann and Hill (2001). It shows the median threshold and median slope estimates, each derived based on 2,000 simulated experiments. The light symbols show the estimates when the lapse rate was fixed at zero; the darker symbols show the estimates when the lapse rate was allowed to vary. The different shapes of the symbols in the figure indicate stimulus placement regimen and correspond to the symbols used in Figure 1. The true (generating) values of the threshold (in terms of $F^{-1}_{0.5}$) and slope (in terms of $F'_{0.5}$) are indicated by the horizontal lines.

*β* = 1 ($F'_{0.5}$ = 0.050 when *α* = 10) and *β* = 10 ($F'_{0.5}$ = 0.360 when *α* = 10). A uniform prior across these parameter values was used. The range of possible stimulus intensities the psi-method could select from included 21 values spaced logarithmically between $F^{-1}_{0.1}$ (= 4.72) and $F^{-1}_{0.999}$ (= 19.04). The psi-method assumed a Weibull function, with a lapse rate equal to 0.025 and a guess rate equal to 0.5. Note that the choice of assumed lapse rate and its correspondence to the generating value directly affect only the exact stimulus intensities used in the simulations, not the parameter estimates I report here, as these are derived based on a maximum-likelihood criterion in a separate procedure. The psi-method was implemented using the Palamedes toolbox (Prins & Kingdom, 2009). In order to provide a general idea as to the placement of stimuli when the psi-method is used, the stimulus placements combined across 10,000 simulations in which the generating lapse rate equaled 0.03 and the number of trials was 960 are included in Figure 1.
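For illustration, the set of candidate stimulus intensities described here can be generated as below. This is a sketch that assumes the Weibull form F_W = 1 − exp(−(x/α)^β) with *α* = 10 and *β* = 3; the helper names are hypothetical and are not Palamedes functions.

```python
import math

def weibull_inverse(p, alpha, beta):
    """Stimulus intensity at which the (assumed) Weibull F_W evaluates to p."""
    return alpha * (-math.log(1.0 - p)) ** (1.0 / beta)

def log_spaced(lo, hi, count):
    """`count` values from lo to hi with a constant ratio between neighbors."""
    step = (math.log(hi) - math.log(lo)) / (count - 1)
    return [math.exp(math.log(lo) + i * step) for i in range(count)]

lowest = weibull_inverse(0.1, 10.0, 3.0)     # about 4.72
highest = weibull_inverse(0.999, 10.0, 3.0)  # about 19.04
candidates = log_spaced(lowest, highest, 21)
print(candidates)
```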

_{68}/2’) are also shown in Figure 4. These are simply half the distance between the 16^{th} and 84^{th} percentiles of the distribution of parameter estimates. Insofar as these distributions are normal, these values are comparable to standard errors of estimate. Figure 5 shows scatterplots of parameter estimates obtained with a free lapse rate for the s1, s6, s7, and psi-controlled placement regimens using a generating lapse rate of 0.03 and N = 960. Full distributions of parameter estimates in the form of histograms and scatterplots will be produced by the code that accompanies this paper for any of the simulations performed in this paper as well as any of the conditions in Wichmann and Hill's (2001) Figure 3 (reproduced here as Figure 2).
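The spread measure described here, half the distance between the 16^{th} and 84^{th} percentiles, can be sketched in a few lines. The percentile interpolation rule (the `inclusive` method of Python's `statistics.quantiles`) is an assumption; the source does not state which convention was used, and the function name is illustrative.

```python
import random
import statistics

def half_width_68(estimates):
    """Half the distance between the 16th and 84th percentiles of the estimates."""
    cuts = statistics.quantiles(estimates, n=100, method="inclusive")
    p16, p84 = cuts[15], cuts[83]  # cuts[k - 1] is the k-th percentile
    return (p84 - p16) / 2.0

# For normally distributed estimates this value is close to the standard deviation.
rng = random.Random(0)
sample = [rng.gauss(8.85, 0.4) for _ in range(20000)]
print(half_width_68(sample), statistics.stdev(sample))
```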

*λ* at values other than zero. Systematic and significant biases are observed in $F^{-1}_{0.5}$ as well as $F'_{0.5}$. The magnitude of bias depends primarily on the difference between the value of the generating lapse rate and the value assumed during the fit. In line with observations made by Klein (2001), fixing the lapse rate at a small (but greater than zero) value avoids the excessive biases in slope found when the lapse rate is fixed at a value of zero.

*all* placement regimens (including those not shown here) lead to positively biased threshold estimates. Bias in slope estimates is small and consistent across placement regimens but varies systematically with generating lapse rate. From Figure 5 it is clear that the lapse rate estimate is correlated with both the threshold and slope parameters.

*λ*_{gen} = 0.03) but tend to overestimate the generating value when the lapse rate is high.

*α* = 10, *β* = 3 [corresponding to $F^{-1}_{0.5} = 8.85$ and $F'_{0.5} = 0.118$], *γ* = 0.5) with a lapse rate equal to 0 (black curve). A hypothetical data set is shown by the black symbols in the figure. When the lapse rate is allowed to vary within [0, 0.06], the best-fitting PF is the red curve in the figure. Its estimate of the threshold parameter *α* equals 9.29, its estimate of the slope parameter *β* equals 3.38, and its estimate of the lapse rate parameter *λ* equals 0.06 (i.e., the upper limit on the lapse rate's prior). These values correspond to $F^{-1}_{0.5} = 8.33$ and $F'_{0.5} = 0.140$. As is clear from Figure 3c, despite having dissimilar parameter values, the generating curve and the best-fitting (red) curve are virtually identical *within the range covered by the s1 regimen* (which spans $F^{-1}_{0.3} = 7.09$ through $F^{-1}_{0.7} = 10.64$). It is important to note, however, that these curves are similar only in terms of probability of a positive (e.g., ‘correct’) response (i.e., *ψ* in Equation 1). The functions describing the underlying perceptual process (in which we are interested; F in Equation 1) are quite different, as shown in the Figure inset.
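The near-equivalence of the two curves within the s1 range can be checked numerically. The sketch below assumes ψ = γ + (1 − γ − λ)F with a Weibull F and uses the two parameter sets quoted in the text; the variable names are illustrative.

```python
import math

def weibull(x, alpha, beta):
    return 1.0 - math.exp(-((x / alpha) ** beta))

def psi(x, alpha, beta, lam, gamma=0.5):
    return gamma + (1.0 - gamma - lam) * weibull(x, alpha, beta)

generating = {"alpha": 10.0, "beta": 3.0, "lam": 0.0}
red_curve = {"alpha": 9.29, "beta": 3.38, "lam": 0.06}

# 51 evenly spaced intensities across the range covered by the s1 regimen.
xs = [7.09 + i * (10.64 - 7.09) / 50 for i in range(51)]

# Largest difference in probability correct (psi) between the two curves.
max_psi_gap = max(abs(psi(x, **generating) - psi(x, **red_curve)) for x in xs)

# Largest difference in the underlying function F over the same range.
max_f_gap = max(abs(weibull(x, 10.0, 3.0) - weibull(x, 9.29, 3.38)) for x in xs)

print(max_psi_gap, max_f_gap)
```

Under these assumptions the two ψ curves stay within about half a percentage point of each other across the range, while the underlying F functions differ by nearly 0.1 at the upper end. The data can therefore barely distinguish the two fits even though they imply different perceptual functions.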

*λ* = 0, all PFs in the family bound by the two functions shown in Figure 3c are, within the tested range, virtually identical to the generating PF. Likewise, any dataset generated under the s1 placement regimen will have an entire family of PFs associated with it that will all have likelihoods very near the maximum in the likelihood function. Indeed, from Figure 3b, which shows the likelihood function across threshold and slope values for the hypothetical dataset shown in Figure 3c, it is clear that the likelihood function lacks a distinct peak but instead has a ridge corresponding to the family of PFs bound by the two functions shown in Figure 3c. Note that the ridge occurs because the lapse rate is free to vary (the value of the lapse rate of the PF with the highest likelihood is indicated by the color code). Note also that the extent of the ridge is constrained by the limits placed on the lapse rate: The ridge would extend farther in both directions if the prior on the lapse rate allowed it.

*λ* = 0 under placement regimen s1 with number of trials equal to N = 960. Figure 7a shows the distribution of lapse rate estimates for the 10,000 simulations. Nearly all lapse rate estimates (9,358 of the 10,000, or 93.6%) are at the limits of the prior, with about an equal number at each end of the prior (N = 4,885 at $\hat{\lambda}$ = 0, N = 4,473 at $\hat{\lambda}$ = 0.06). Figure 7b shows a histogram of all 10,000 threshold estimates. From Figure 7b we note that thresholds are clearly biased (the generating threshold value is indicated in the figure by the triangle). The bias in threshold estimates is closely linked to the observed distribution of lapse rate estimates. Figure 7c shows the threshold estimates for the 4,885 simulations in which the lapse rate estimate equaled zero. For these simulations the lapse rate estimate was accurate (albeit mostly accidentally so, as I will argue) and we find that for this subset of simulations, the threshold estimates are unbiased. Effectively, these 4,885 simulations were fitted by a PF at the ‘correct’ limit of the family of PFs that would all fit these simulations about equally well.

*λ* = 0.06 (cf. red curve in Figure 3c). The same pattern of results is observed in the median slope estimates. For the fits in which the lapse rate estimate was equal to zero, the median of slope estimates $\hat{F}'_{0.5}$ was equal to 0.118 (compare to the slope of the generating Weibull, $F'_{0.5} = 0.118$). However, for the fits in which the lapse rate estimate was 0.06, the median of slope estimates $\hat{F}'_{0.5}$ was equal to 0.139, which corresponds closely to the slope of the red curve shown in Figure 3c ($\hat{F}'_{0.5}$ = 0.140). Finally, Figure 7e shows the threshold estimates for the (relatively few) remaining simulations. The lapse rate estimates for these simulations are between those for the simulations in Figures 7c and 7d, and so is the median threshold estimate for this subset of simulations.

*λ* = 0.05. The leftmost scatterplot in Figure 5 shows the relationships among parameter estimates observed under placement regimen s1 in a different manner (the generating lapse rate in the figure equaled 0.03).

*only* with F having a value near unity *and* the lapse rate having a value near 0. As a result, when the generating lapse rate is zero and a placement regimen which includes a high stimulus intensity is used, lapse rate estimates will be at or near the (‘correct’) value of 0. Correspondingly, bias in threshold and slope will be minimal.

*either* to a high lapse rate *or* to a low value of F (or some combination of these two factors). Stated more precisely, a relatively high number of incorrect responses at the high stimulus intensity will be consistent with a relatively broad family of functions, members of which will display a wide range of lapse rate values. The manner in which the three parameters trade off when the placement regimen includes a high stimulus intensity is very apparent from Figure 5 (middle two panels: s6 and s7): High lapse rate estimates tend to go with threshold and slope estimates that combine to produce high perceptual performance (i.e., high F) at the high stimulus intensity (i.e., low threshold/high slope). Similarly, low lapse rate estimates tend to go with high threshold/shallow slope estimates. It might be noted in passing that, perhaps counterintuitively, increasing the number of trials at the highest stimulus intensity will not do anything to resolve this ambiguity. That is, a proportion of, say, 5% incorrect responses at a high intensity will be consistent with either a low value of F or a high lapse rate regardless of the number of observations it is based on. The ambiguity must instead be resolved by obtaining accurate estimates of threshold and slope parameters through observations made at the lower stimulus intensities. Returning to our argument, the bias observed when the generating lapse rate is high arises because of the asymmetry of the window of allowed lapse rate estimates relative to a high generating lapse rate. Whereas the window allows those functions which have lapse rate values that are much *lower* than the generating value (which are coupled with upward biased threshold estimates), it does not allow those with lapse rate estimates that are (much) *higher* than the generating value. Overall, then, threshold estimates are biased upward when the generating lapse rate is high.
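The ambiguity at a high stimulus intensity can be made concrete with a short worked example. It assumes the form ψ = γ + (1 − γ − λ)F with γ = 0.5 and takes the 5% error rate mentioned in the text as the observed proportion; the three lapse rate values are chosen only for illustration.

```latex
% Observed proportion correct at the high intensity: \psi = 0.95, with \gamma = 0.5
0.95 = 0.5 + (0.5 - \lambda)\,F
\quad\Longrightarrow\quad
F = \frac{0.45}{0.5 - \lambda}

% Three equally consistent readings of the same observation:
\lambda = 0     \;\Rightarrow\; F = 0.900
\lambda = 0.025 \;\Rightarrow\; F \approx 0.947
\lambda = 0.05  \;\Rightarrow\; F = 1.000
```

Every pair (λ, F) on this curve predicts exactly the same proportion correct at that intensity, so collecting more trials there sharpens the estimate of ψ but not the split between λ and F.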

*λ*_{gen} = 0 and 52.6% of estimates equal 0 when *λ*_{gen} = 0.05). This finding seems to extend an observation made by Kaernbach (2001). Kaernbach demonstrated (and meticulously argued) that a bias in slope estimates results when an adaptive method is used that selects stimulus intensities such as to optimize measurement of the threshold but not that of the slope parameter. He further demonstrated that the bias in slope estimates is remedied when an adaptive method is used that selects stimulus intensities to optimize measurement of the slope as well as the threshold (see also Kontsevich & Tyler, 1999). The high degree of bias in lapse rate estimates (and the closely linked bias in threshold and slope parameter estimates) obtained here may thus be a result of the fact that the adaptive method selects stimulus intensities such as to optimize threshold and slope estimates, but not the lapse rate estimate. The general rule appears to be that unless an adaptive procedure optimizes stimulus selection for the estimation of a specific parameter, caution should be exercised when that parameter is subsequently estimated from the resulting observations.

*a* is an API. In effect, errors made at *x* = *a* will be unambiguously attributed to lapses. Note that under jAPLE, observations made at intensities other than *x* = *a* also contribute to the estimation of the lapse rate.
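One way to read this is as a joint likelihood in which F is taken to equal 1 at the API intensity, so that the probability of a correct response there is simply 1 − λ, while all other intensities use the full psychometric function. The sketch below follows that reading; it is an assumption about how the likelihood could be written, not the accompanying code, and the dataset is hypothetical.

```python
import math

def weibull(x, alpha, beta):
    return 1.0 - math.exp(-((x / alpha) ** beta))

def japle_neg_log_likelihood(alpha, beta, lam, data, api_intensity, gamma=0.5):
    """Joint negative log-likelihood with F assumed to equal 1 at the API intensity.

    data: list of (intensity, n_correct, n_trials) tuples.
    """
    total = 0.0
    for x, k, n in data:
        if x == api_intensity:
            # Assumed asymptotic performance: every error here counts as a lapse.
            p = 1.0 - lam
        else:
            p = gamma + (1.0 - gamma - lam) * weibull(x, alpha, beta)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep the logarithms finite
        total -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

# Hypothetical dataset: four intensities near threshold plus one API intensity (x = 30).
data = [(7.0, 102, 160), (8.5, 115, 160), (10.0, 128, 160), (11.5, 139, 160),
        (30.0, 155, 160)]

nll_by_lapse = {lam: japle_neg_log_likelihood(10.0, 3.0, lam, data, 30.0)
                for lam in (0.0, 0.03, 0.06)}
print(nll_by_lapse)
```

In this example the five errors at the API intensity pull the lapse rate toward 5/160, and the lower intensities add their own evidence because λ also scales their predicted performance.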

*α* and *β* (see Equation 1). The metrics used by Wichmann and Hill have the advantage of allowing numerical comparison of parameter values across different forms of F (Weibull, Logistic, etc.). However, $F^{-1}_{0.5}$ and $F'_{0.5}$ are both non-linear functions of both *α* and *β*, and it is the values of *α* and *β* which are estimated in the maximum-likelihood estimation procedure. Thus, while maximum-likelihood estimators have the desirable property of being asymptotically unbiased (e.g., Edwards, 1972), this property would apply only to *α* and *β*, not to $F^{-1}_{0.5}$ and $F'_{0.5}$. Moreover, since $F^{-1}_{0.5}$ and $F'_{0.5}$ are both functions of both *α* and *β*, any bias in *either α or β* would result in bias for *both* of $F^{-1}_{0.5}$ and $F'_{0.5}$. Since my results directly challenge the integrity of Wichmann and Hill's results, I have chosen to report my results in terms of $F^{-1}_{0.5}$ and $F'_{0.5}$ also. However, my pattern of results would be the same whether expressed in terms of $F^{-1}_{0.5}$ and $F'_{0.5}$ or in terms of *α* and *β*. That is, like $F^{-1}_{0.5}$ and $F'_{0.5}$, *α* and *β* are both biased when estimated by the method proposed by Wichmann and Hill and, like $F^{-1}_{0.5}$ and $F'_{0.5}$, *α* and *β* are both not (noticeably) biased when the methods I propose are used and the number of observations is sufficient.
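The dependence of both metrics on both parameters can be written out explicitly. The expressions below assume the Weibull form F_W = 1 − exp(−(x/α)^β); they are derived here for illustration rather than quoted from the source.

```latex
% 50% point: solve 1 - \exp(-(x/\alpha)^{\beta}) = 0.5 for x
F^{-1}_{0.5} = \alpha \,(\ln 2)^{1/\beta}

% Gradient of F_W evaluated at that point
F'_{0.5}
  = \frac{\beta}{\alpha}
    \left(\frac{F^{-1}_{0.5}}{\alpha}\right)^{\beta - 1}
    \exp\!\left(-\ln 2\right)
  = \frac{\beta}{2\alpha}\,(\ln 2)^{(\beta - 1)/\beta}

% Check with \alpha = 10, \beta = 3:
F^{-1}_{0.5} = 10\,(\ln 2)^{1/3} \approx 8.85,
\qquad
F'_{0.5} = \tfrac{3}{20}\,(\ln 2)^{2/3} \approx 0.1175
```

Since *α* and *β* both appear in each expression, an error in either estimate propagates to the threshold metric and the slope metric alike.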

*and* slope. Fifth, when an API stimulus is included one should consider using the proposed jAPLE or iAPLE fitting method. It is important to realize, however, that these methods may not always be possible to implement since the maximum achievable stimulus intensity may not be at asymptotic performance. Great care should be taken to ensure that performance has indeed reached an asymptotic level at the stimulus intensity chosen as API.

*Likelihood*. Baltimore, MD: Johns Hopkins University Press.

*The Spanish Journal of Psychology*, 8(2), 256–289.

*Spatial Vision*, 20(1–2), 5–43.

*Perception & Psychophysics*, 63(8), 1389–1398.

*Psychophysics: A practical introduction*. London: Academic Press: An imprint of Elsevier.

*Perception & Psychophysics*, 63(8), 1421–1455.

*Vision Research*, 39(16), 2729–2737.

*Vision Research*, 25(9), 1245–1252.

*Vision Research*, 35(17), 2503–2522.

*Perception & Psychophysics*, 61(1), 87–106.

*Perception & Psychophysics*, 63(8), 1293–1313.

*Behavior Research Methods*, 38(1), 28–41.

^{1} Wichmann and Hill (2001) refer to the interval of allowed lapse rates as a ‘Bayesian prior.’ Constraining the lapse rate estimates to an interval that reflects the subjective belief concerning the likely values of the lapse rate does indeed embody a critical feature of Bayesian reasoning. However, the estimation of parameter values in Wichmann and Hill and here is performed by (constrained) maximum-likelihood estimation, not by Bayesian estimation. For that reason I will refer to the interval of allowed lapse rates simply as the ‘prior window’ or the ‘prior.’