Maximum likelihood estimation (MLE) has been used to produce perceptual scales from binary judgments of triads and quadruples. This method relies on Thurstone's theory of a stochastic perceptual process in which the perceived difference between two stimuli is the difference in their perceived strengths. It is possible that the perception of a suprathreshold difference is overestimated when smaller differences are added, a phenomenon referred to as diminishing returns. The current approach to constructing a perceptual scale using MLE does not account for this phenomenon. We present a way to model the perception of differences using MLE and Thurstone's theory, adapted to allow for the possibility of diminishing returns. The method is validated using Monte Carlo simulated responses to experimental triads and can correctly model diminishing returns, the absence of diminishing returns, and the opposite of diminishing returns, both when a perceptual scale is known and when the true perceived strengths of the stimuli are unknown. Additionally, the method was applied to empirical data sets to determine its feasibility in investigations of perception. Ultimately, this analysis allows for more accurate modeling of suprathreshold difference judgments, a more complete understanding of the perceptual processes underlying comparisons, and the evaluation of Thurstone's theory of difference judgments.

The assumption of *Thurstonian scaling* states that the perception of a stimulus is normally distributed about its true perceived strength. This extends to the perceived differences of stimuli, as the difference of normally distributed variables is also normally distributed. However, the universality of this assumption across different perceptual variables has not been sufficiently validated and cannot currently be verified using tools that themselves assume Thurstonian scaling.

A perceptual scale can be derived from the *z*-score of the proportion of responses choosing one test over the other. Consider a triad arrangement of three stimuli, a standard and two tests; this arrangement is referred to as the method of triads (MOT). The stimuli will be denoted \(S_{stand}, S_{test1}, S_{test2}\). The participant is asked to determine which test is *more different* from the standard. Under Thurstonian scaling, assuming Thurstone Case V, the perceptions of these stimuli, \(S\), are normally distributed variables centered on their true perceived strengths, \(\psi\), with a common variance, \(\sigma^2\), which Thurstone names the *discriminal dispersion*.
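As an illustrative sketch (the paper's analyses are in R, not Python), the Case V decision process for a single triad can be simulated by drawing each percept from a normal distribution and comparing the magnitudes of the two perceived differences. The function name, the particular \(\psi\) values, and the trial count below are hypothetical.

```python
import numpy as np

def simulate_triad(psi_stand, psi_test1, psi_test2, sigma=1.0,
                   n_trials=10000, seed=0):
    """Monte Carlo simulation of method-of-triads responses under
    Thurstone Case V: each percept is normally distributed about its
    true perceived strength with common variance sigma^2. Returns the
    proportion of trials in which test1 is judged *more different*
    from the standard."""
    rng = np.random.default_rng(seed)
    s_stand = rng.normal(psi_stand, sigma, n_trials)
    s_test1 = rng.normal(psi_test1, sigma, n_trials)
    s_test2 = rng.normal(psi_test2, sigma, n_trials)
    # The observer compares the magnitudes of the two perceived differences.
    return np.mean(np.abs(s_test1 - s_stand) > np.abs(s_test2 - s_stand))

# test1 lies much farther from the standard, so it should usually win.
p = simulate_triad(psi_stand=0.0, psi_test1=3.0, psi_test2=1.0)
```

With equal test offsets the proportion approaches 0.5, as the two perceived differences are identically distributed.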

The Thurstonian model also implies the *additivity of differences*. This additivity is inherent to the model even though it may or may not exist in reality. More importantly, relying on an analysis methodology that assumes additivity precludes evaluating whether it exists.

If the addition of small differences overestimates the perception of a large difference, this is a case of *diminishing returns*, similar to a second-order Weber–Fechner law. This behavior can be modeled with a concave function that maps a difference in \(\psi\) to the perceived difference. If the addition of small differences underestimates the perception of a large difference, this would be a case of *increasing returns*, indicating a convex scaling function. Illustrative scaling functions for the cases of diminishing returns, additivity of differences, and increasing returns are shown in Figure 1.
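The three regimes can be made concrete with a minimal numeric check: for a scaling function with \(f(0)=0\), concavity makes the sum of two small perceived differences exceed the perceived large difference (diminishing returns), convexity reverses the inequality (increasing returns), and linearity gives exact additivity. The particular functions below are illustrative, not the paper's fitted scales.

```python
import math

# Concave scaling (diminishing returns): two unit steps together are
# perceived as larger than one two-unit step.
f = math.sqrt
assert f(1) + f(1) > f(2)      # 2 > sqrt(2)

# Convex scaling (increasing returns): the reverse inequality holds.
g = lambda x: x ** 2
assert g(1) + g(1) < g(2)      # 2 < 4

# Linear scaling (additivity of differences): the parts sum exactly.
h = lambda x: x
assert h(1) + h(1) == h(2)
```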

The optimization is carried out with the general-purpose optimizer, *optim*, in \({\mathsf R}\) (R Core Team, 2021). Because this function minimizes its objective, the objective function takes the final form of the negative log-likelihood (NLL).
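The paper performs this minimization with R's *optim*; an equivalent sketch, not the authors' implementation, can be written in Python with `scipy.optimize.minimize`. The triad data, the anchoring of the first scale value at zero, and the choice to fix \(\sigma\) rather than the scale maximum are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_likelihood(params, triads, responses, sigma=1.0):
    """NLL of binary triad responses under a Thurstonian difference model.

    params    : perceived strengths of stimuli 1..n-1 (stimulus 0 anchored at 0)
    triads    : (m, 3) integer array of (standard, test1, test2) indices
    responses : length-m binary array, 1 if test1 was judged more different
    sigma     : fixed discriminal dispersion (this parameterization fixes
                the noise rather than the scale maximum)
    """
    psi = np.concatenate(([0.0], params))          # anchor the scale at zero
    d1 = np.abs(psi[triads[:, 1]] - psi[triads[:, 0]])
    d2 = np.abs(psi[triads[:, 2]] - psi[triads[:, 0]])
    p1 = norm.cdf((d1 - d2) / sigma)               # P(test1 chosen)
    p = np.where(responses == 1, p1, 1.0 - p1)
    return -np.sum(np.log(np.clip(p, 1e-12, None)))

# Hypothetical example: 4 stimuli and a few triad trials.
triads = np.array([[0, 2, 1], [1, 3, 0], [0, 3, 1], [1, 2, 3]])
responses = np.array([1, 1, 1, 0])
fit = minimize(neg_log_likelihood, x0=np.ones(3),
               args=(triads, responses), method="BFGS")
psi_hat = np.concatenate(([0.0], fit.x))           # estimated perceptual scale
```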

The significance of the regression coefficient can be assessed with a *t*-test, which is conveniently built into the regression in R. This hypothesis testing is meant as a preliminary analysis, as it makes the strong assumption that the perceptual scale is well known. This assumption is likely to fail, even if a perceptual scale has previously been estimated assuming additivity. In application, however, other statistical tests, such as a likelihood ratio test, can be used to determine whether the regression suggests nonadditivity.
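A likelihood ratio test of nonadditivity compares the NLL of a restricted (additive) model against a full model with extra curvature parameters; twice the NLL difference is asymptotically \(\chi^2\)-distributed with degrees of freedom equal to the number of added parameters. The sketch below is generic, and the NLL values in the example are made up for illustration.

```python
from scipy.stats import chi2

def likelihood_ratio_test(nll_restricted, nll_full, df):
    """Likelihood ratio test for nested models fit by minimizing NLL.
    Returns the test statistic and its chi-squared p-value."""
    stat = 2.0 * (nll_restricted - nll_full)
    return stat, chi2.sf(stat, df)

# Hypothetical NLLs: additive model vs. one extra curvature parameter.
stat, p = likelihood_ratio_test(nll_restricted=130.2, nll_full=124.7, df=1)
```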

A difference in perceptual scale value of \(d_{JND} = 1.25, 2.75, 4.25, 5.75\) will be considered a JND. Translating these values to the discriminal dispersion using the \(z\)-score (i.e., the inverse of the cumulative Gaussian, Equation 4) for the 75th percentile gives \(\sigma = d_{JND} / z_{0.75}\).
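Assuming a JND corresponds to 75% correct discrimination, the translation from a scale difference treated as one JND to the discriminal dispersion is a one-line computation (variable names are illustrative):

```python
from scipy.stats import norm

# z-score of the 75th percentile of the standard normal
z_75 = norm.ppf(0.75)                 # about 0.6745

# Scale differences treated as one JND in the simulations
d_jnd = [1.25, 2.75, 4.25, 5.75]

# Divide each JND-sized scale difference by the 75th-percentile z-score
# to obtain the corresponding discriminal dispersion.
sigmas = [d / z_75 for d in d_jnd]
```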

- Concave (circle): \(f(x) = \sqrt{1 - (x - 1)^2}\),
- Concave (root): \(f(x) = \sqrt{x}\),
- Concave (sine): \(f(x) = \sin (x)\),
- Convex: \(f(x) = x^2\),
- Linear: \(f(x) = x\).

The former is a hard boundary (all values *must* fall under the maximum), while the latter is a soft boundary (values that are too large become so improbable that the likelihood tends to zero for individual triads). Therefore, fixing one parameter rather than the other may produce an estimate closer to the ground truth. For that reason, MLE models that fix the maximum, as in the \({\mathtt{MLDS}}\) package (Knoblauch et al., 2008), as well as models that fix the standard deviation, were evaluated. If there were an analytic solution to the MLE, this choice would be irrelevant. However, because of the reliance on numerical optimization, particularly gradient descent, the gradient can differ depending on which parameter is fixed. As a result, there may be more local minima in which the optimizer gets "trapped," leading to suboptimal and less stable fits.

- P1: The significance testing can correctly identify whether the underlying function is concave, convex, or linear.
- P2: The MLE can model the underlying functions.
- P3: The NLL is related to the RMSE between the underlying function and the approximated function.

- Concave: \(g(x) = \sqrt{x}\),
- Convex: \(g(x) = x^2\),
- Linear: \(g(x) = x\).

- P4: The ability of the MLE to approximate \(\hat{f}(x)\) will decrease with a nonlinear \(g(x)\).
- P5: The RMSE will be related to the accuracy across all levels of \(g(x)\).

Prediction *P4: the ability of the MLE to approximate \(\hat{f}(x)\) will decrease with a nonlinear \(g(x)\)* holds. This is demonstrated by inspection of Figure 9 and by the lower RMSE of the purple points in Figure 10. The final prediction, *P5: the fit between \(f(x)\) and \(\hat{f}(x)\) will be related to the accuracy across all levels of \(g(x)\)*, is also supported.

The JND is translated into the discriminal dispersion via the *z*-score of the 75th percentile. The scale value that a JND corresponds to can be selected, but the results suggest that too large a value may produce suboptimal fits.

*Journal of Vision,* 17(1), 37, https://doi.org/10.1167/17.1.37.

*Journal of the Optical Society of America A,* 33(3), A30–A36.

*Proceedings of the National Academy of Sciences,* 119(18), e2119753119.

*Softstat,* 97, 67–74.

*Psychometrika,* 35(3), 283–319.

*Journal of the Optical Society of America A,* 24(11), 3418–3426.

*Journal of Vision,* 12(3), 19, https://doi.org/10.1167/12.3.19.

*SIAM Journal on Numerical Analysis,* 13(2), 261–268.

*SIAM Journal on Numerical Analysis,* 17(2), 238–246.

*The Journal of Psychology,* 44(2), 311–318.

*Journal of Statistical Software,* 25(2), 1–26.

*Foundations of measurement: Vol. I: Additive and polynomial representations*. San Diego, California; London, England: Academic Press, Inc.

*Journal of Mathematical Psychology,* 4(2), 226–245.

*Psychometrika,* 29(2), 115–129.

*Annual Review of Vision Science,* 6(1), 519–537.

*Journal of Vision,* 3(8), 5, https://doi.org/10.1167/3.8.5.

*The Journal of Abnormal and Social Psychology,* 52(1), 57.

*Journal of Mathematical Psychology,* 47(1), 90–100.

*Journal of Vision,* 16(9), 2, https://doi.org/10.1167/16.9.2.

*R: A language and environment for statistical computing*. Vienna, Austria: R Foundation for Statistical Computing.

*Psychological Review,* 34(4), 273.

*Psychometrika,* 17(4), 401–419.

*Journal of Vision,* 17(4), 1, https://doi.org/10.1167/17.4.1.

*Behavioral Science,* 12, 498.