A key question in perception research is how stimulus variations translate into perceptual magnitudes, that is, the perceptual *encoding* process. As experimenters, we cannot probe perceptual magnitudes directly, but infer the encoding process from responses obtained in a psychophysical experiment. The most prominent experimental technique to measure perceptual appearance is matching, where observers adjust a probe stimulus to match a target in its appearance along the dimension of interest. The resulting data quantify the perceived magnitude of the target in physical units of the probe, and are thus an indirect expression of the underlying encoding process. In this paper, we show analytically and in simulation that data from matching tasks do not sufficiently constrain perceptual encoding functions, because there exist an infinite number of pairs of encoding functions that generate the same matching data. We use simulation to demonstrate that maximum likelihood conjoint measurement (Ho, Landy, & Maloney, 2008; Knoblauch & Maloney, 2012) does an excellent job of recovering the shape of ground truth encoding functions from data that were generated with these very functions. Finally, we measure perceptual scales and matching data for White’s effect (White, 1979) and show that the matching data can be predicted from the estimated encoding functions, down to individual differences.

To characterize perception, psychophysicists systematically vary the physical input to the visual system, that is, the stimulus (*S*), and measure the corresponding output, that is, the behavioral response (*R*). If the chosen stimulus dimension is relevant to visual perception, there should be a lawful relationship between input and output, namely, between stimulus and response. These stimulus-response functions characterize the system in mathematical terms (*R* = *f*(*S*)), and they serve as empirical targets for theoretical and computational models of perception. This is the psychophysicist’s approach to “peer into” the black box.

The quantity of interest is the perceptual magnitude (Ψ(*S*) in Figure 1), which we infer from observable behavior (verbal reports or button presses). The psychophysical characterization of perception in terms of observable responses (*R* = *f*(*S*)) involves two putative processes (Figure 1; adapted from Gescheider, 1997, Figure 12.7). The perceptual process captures the translation of stimulus variations into perceptual magnitudes (Ψ = *f*_{1}(*S*)). It has been called the transducer function in the study of near-threshold vision (e.g., Kingdom & Prins, 2016), and the stimulus transformation function or psychophysical law in the study of supra-threshold vision (Gescheider, 1988; Gescheider, 1997). We refer to it as perceptual encoding. The second process involves the translation of a perceptual magnitude into a behavioral response (*R* = *f*_{2}(Ψ)). It has been called the response transformation function, sensory-response law (Gescheider, 1988; Gescheider, 1997), or readout. We refer to it as perceptual decoding. The overall stimulus-response function is thus a composition of perceptual encoding and decoding (*R* = *f*_{2} ○ *f*_{1}; see Figure 1).
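To make the composition concrete, here is a small numeric sketch (our illustration, not an analysis from the paper): two hypothetical encoding/decoding pairs (*f*_{1}, *f*_{2}) that compose to the identical stimulus-response function, so that *R* alone cannot identify the internal encoding.

```python
import numpy as np

# Two hypothetical encoding/decoding pairs (illustrative, not fitted to data).
# Pair A: linear encoding, identity decoding.
f1_a = lambda s: s           # psi = s
f2_a = lambda psi: psi       # R = psi
# Pair B: compressive encoding, expansive decoding.
f1_b = lambda s: np.sqrt(s)  # psi = s**0.5
f2_b = lambda psi: psi ** 2  # R = psi**2

s = np.linspace(0.01, 1.0, 50)
# Both pairs compose to the identical stimulus-response function R = s,
# so the observable R cannot separate encoding (f1) from decoding (f2).
assert np.allclose(f2_a(f1_a(s)), f2_b(f1_b(s)))
```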

In this paper, we use this framework^{1} to elucidate the implicit assumptions about encoding and decoding processes in matching tasks. We show analytically and in simulation that data from matching tasks do not sufficiently constrain perceptual encoding functions, because there exist an infinite number of pairs of encoding functions that generate the same matching data. We then use simulation to demonstrate that maximum likelihood conjoint measurement (Ho, Landy, & Maloney, 2008; Knoblauch & Maloney, 2012) does an excellent job of recovering the shape of ground truth encoding functions from data that were generated with these very functions. Finally, we measure perceptual scales and matching data for White’s effect (White, 1979), and show that the matching data can be predicted from the estimated encoding functions, down to individual differences.

The perceived lightness difference between two equiluminant targets corresponds to the *vertical* distance between the two encoding functions (vertical line in Figure 2c).

Consider a target of luminance (*t*) in the black phase of White’s stimulus. Using the target encoding function (black curve in Figure 3a), a lightness value (Ψ_{B}(*t*)) is assigned to the luminance of the target. To produce a match, the observer adjusts the probe luminance (*p*, red line in Figure 3a) such that the perceived lightness of the target (in black) and the probe (in white) are identical (Ψ_{B}(*t*) ≡ Ψ_{W}(*p*)). The probe luminance (*p*) is found by inversely reading out the encoding function in the white phase (Ψ_{W}).

A luminance match thus reflects the *horizontal* difference between target and probe encoding functions, in units of luminance, for the same ordinate value (Ψ(*t*) ≡ Ψ(*p*), green line in Figure 3a). It does not provide the *vertical* difference between the encoding functions, which captures the perceived lightness difference for equiluminant targets.

In a matching experiment, matches are collected for a number of target luminances *t*_{1}, *t*_{2}, .... Unfortunately, as illustrated in Figure 3c, different pairs of encoding functions can produce the same set of matching data. In fact, any pair of encoding functions for which the horizontal distance at each ordinate position is identical will produce identical matching data. For functions from the power family, this holds true for all pairs with the same ratio of their exponents (see Appendix A for an analytical derivation). Thus, matching data do not sufficiently constrain the putative encoding functions, and hence do not allow us to characterize the perceptual magnitudes (Ψ_{B}(*t*) and Ψ_{W}(*p*)). In realistic experimental settings this underdeterminacy is exacerbated by two factors: noise in sensory events and a selective sampling of matches around a point of maximum difference. We illustrate these two factors in simulation (see Appendix B).
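The exponent-ratio argument can be checked numerically. The following sketch (ours; luminances normalized to [0, 1]) computes matches from two different hypothetical pairs of power-function encodings that share the same exponent ratio:

```python
import numpy as np

def match(t, alpha, beta):
    """Probe luminance matching a target of luminance t, for power encodings
    Psi_target(s) = s**alpha and Psi_probe(s) = s**beta:
    solving t**alpha == p**beta gives p = t**(alpha/beta)."""
    return t ** (alpha / beta)

t = np.linspace(0.05, 1.0, 20)      # target luminances (normalized)
# Two different encoding pairs with the same exponent ratio (0.5):
m1 = match(t, alpha=0.5, beta=1.0)
m2 = match(t, alpha=2.0, beta=4.0)
assert np.allclose(m1, m2)          # identical matching data ...
assert not np.allclose(t ** 0.5, t ** 2.0)  # ... despite very different encodings
```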

Perceptual scales estimated with MLCM are thought to be unaffected by the decoding process (*f*_{2}). Therefore, the measured perceptual scales should be empirical estimates of the underlying encoding functions. We evaluate this claim in simulation and experiment.

We define two ground truth encoding functions, one for the target in the white phase (Ψ_{W}) and one for the target in the black phase (Ψ_{B}), as in Figure 2c. The functions are defined as power functions of the form Ψ_{B}(*s*) = *s*^{α} and Ψ_{W}(*s*) = *s*^{β}, where the exponents α, β > 0. In White's (1979) stimulus, targets in the black phase appear brighter than targets in the white phase; therefore, α < β. To test the capability of MLCM to recover the putative encoding functions, we use pairs of ground truth encoding functions with different exponents (different shapes) but the same exponent ratio. To cover a wide range of function shapes, we varied α between 0.25 and 2.0, and β between 0.5 and 4.0.

In each simulated trial, two targets are presented in particular contexts (*c*_{1} and *c*_{2}) with particular luminances (*s*_{1} and *s*_{2}). The contexts can be identical (*c*_{1} = *c*_{2}; both targets in the black or both in the white phase, as in Figure 4a) or different (*c*_{1} ≠ *c*_{2}; one target in the black and one in the white phase, as in Figure 4b). The simulated observer derives two perceptual magnitudes, \(\Psi _{c_1}(s_1)\) and \(\Psi _{c_2}(s_2)\), which correspond to the luminance values on the respective encoding functions (Ψ_{B}(*s*_{1}) for targets presented in the black phase and Ψ_{W}(*s*_{1}) otherwise). To decide which target is brighter, a decision variable δ is computed as the difference between the two perceptual magnitudes:

\(\delta = \Psi _{c_2}(s_2) - \Psi _{c_1}(s_1) + \epsilon \), where the noise term is normally distributed (ϵ ∼ *N*(0, σ^{2})). The simulated observer then performs a binary decision: if δ < 0, they choose the first stimulus; otherwise, the second. We simulated noise with σ values of 0.03, 0.06, and 0.15. These values correspond to the minimum, average, and maximum noise observed in a previous experiment (Aguilar & Maertens, 2020).
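The simulated decision rule can be sketched as follows (our reading of the procedure: δ is the second minus the first perceptual magnitude plus Gaussian noise, so δ < 0 means the first target appears brighter; the exponents and σ value here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(s, context, alpha=0.5, beta=2.0):
    # Power-function encodings: black phase uses s**alpha, white phase s**beta.
    return s ** (alpha if context == "black" else beta)

def simulate_trial(s1, c1, s2, c2, sigma=0.06):
    """One simulated pairwise comparison; returns True if the first
    target is chosen as brighter (i.e., delta < 0)."""
    delta = encode(s2, c2) - encode(s1, c1) + rng.normal(0.0, sigma)
    return delta < 0

# A very asymmetric pair: a bright target in black vs. a dark target in white.
choices = [simulate_trial(0.8, "black", 0.2, "white") for _ in range(1000)]
# The first target should be chosen as brighter on nearly every trial.
assert np.mean(choices) > 0.9
```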

MLCM rests on four assumptions: (1) judgments are based on *a single internal dimension* (lightness); (2) variability on the internal dimension (noise) is fixed across the scale; (3) the functions that map luminance to lightness differ between the two contexts; and (4) some comparisons must be difficult, so that in some trials δ is small (see Appendix C for an explanation). Whereas assumptions one to three are a priori assumptions, assumption four depends on the domain under study, that is, the shape of the encoding function, the amount of noise, and the chosen stimulus levels. We simulated the experiment with different parameters of the ground truth functions and different noise levels, and used ten stimulus levels across the possible contrast range.

We quantify the estimation error as the mean absolute difference between the estimated scale and the ground truth encoding function (Ψ(*s*)) with *N* = 10 luminance values. The range of Ψ(*s*) is from 0 to 1, and we interpret this value as average error in percent.

cd/m^{2}, and the maximum luminance, corresponding to the white phase, was 490 cd/m^{2}, producing a Michelson contrast of 0.98. The grating was centered on a neutral gray background of 95 cd/m^{2}.

cd/m^{2}, and targets in the white phase were 11, 22, 38, 66, 126, 188, 250, 311, 377, and 446 cd/m^{2}. These actual luminances were measured at the target positions with the full stimulus on the display. Thus, the reported values match what participants saw during the experiment.

The estimated scales have a characteristic *S*-shape: approximately linear at intermediate values and accelerating towards the ends of the luminance range. Consequently, the scales approach each other at the low and high ends of the range, and bulge away from each other for intermediate luminance values. This shape difference suggests that two isoluminant targets are perceived as maximally different at intermediate luminance values, and the magnitude of White's (1979) effect decreases towards the extremes of the luminance range. Appearance matches are often gathered for intermediate target luminance values, and not across the whole range of luminances spanned by the surround context. This makes it harder to compare the shapes of the scales reported here to previously reported measurements of White's (1979) effect. Vincent (2017) reported similar variation in effect magnitude as a function of target luminance in a matching paradigm. Rather than varying the target luminance, Lin, Chen, and Chien (2010) varied the contrast of the grating while keeping target luminance constant. They found that match contrast decreased with increasing surround contrast, which could be in line with the overall shapes of the brightness scales described here: lower contrast surrounds would compress the domain of the encoding functions, reducing the intermediate range of luminance values where effect magnitude is maximal. The maximal effect magnitude at intermediate values may make a matching task, and thus data collection, easier, which in turn could be a practical reason why previous measurements have focused on this range.

A similar *S*-shape has been reported before: a steepening of the relationship near both the pedestal luminance and the background luminance. Whittle (1992) described this as the “crispening” effect: an enhancement of brightness differences near background and pedestal luminances, which also appears in brightness discrimination (Whittle, 1986).

In this study, we measured perceptual scales and interpreted them as estimates of *perceptual encoding* functions. These encoding functions (or estimates thereof) can be thought to underlie both the pairwise comparison data collected in the MLCM experiment and the appearance matches in the matching task. Both our simulations and our empirical data show that brightness encoding functions estimated using MLCM can predict brightness matches from the same participants to within the variability of the matches. These predictions are reliable for each participant and capture the idiosyncratic variations observed in the matches. This close congruence between matches and scales is strong evidence that both tasks tap into the same perceptual encoding mechanisms.

*vs.* low luminance context regions, and so on. Differences in effect magnitude measured with matching may result from any of these “trivial” stimulus parameters. While perceptual encoding functions do not solve this problem, they provide more robustness, because we are not comparing single effect magnitudes but rather a whole relationship between stimulus values and perceptual magnitudes. We may test whether the shape of the functions fundamentally differs between effects, or whether differences are limited to the range, local slope, etc. Hence, comparing encoding functions provides more information about the potential relationship between different perceptual effects.

*The New Cognitive Neurosciences*(2nd ed., Vol. 3, pp. 339–351). MIT Press.

*Journal of Vision,*20(4), 19, doi:10/ggxrpf. [CrossRef] [PubMed]

*Journal of Vision,*22(2), 2, doi:10.1167/jov.22.2.2. [CrossRef] [PubMed]

*Philosophical foundations of neuroscience*(2nd ed.). Wiley-Blackwell.

*Journal of Vision,*15(11), 14, doi:10.1167/15.11.14. [CrossRef] [PubMed]

*Vision Research,*39, 4361–4377, doi:10/fwcgkk. [CrossRef] [PubMed]

*Vision Research,*127, 11–17, doi:10/f857w6. [CrossRef] [PubMed]

*Journal of the Optical Society of America A,*40(3), A99, doi:10.1364/JOSAA.475040. [CrossRef]

*Frontiers in Human Neuroscience,*9(February), 1–16, doi:10.3389/fnhum.2015.00093. [PubMed]

*Vision Research,*44(15), 1765–1786, doi:10.1016/j.visres.2004.02.009. [CrossRef] [PubMed]

*Elemente der Psychophysik*(2nd ed.). Breitkopf und Härtel.

*Tutorial essays in psychology*(Vol. 2). Erlbaum.

*Annual Reviews of Psychology,*39, 169–200, doi:10.1146/annurev.ps.39.020188.001125. [CrossRef]

*Psychophysics*(3rd ed.). Psychology Press.

*Vision Research,*51(13), 1397–1430, doi:10/c7c6xn. [CrossRef] [PubMed]

*Signal detection theory and psychophysics*. John Wiley.

*Psychological Science,*19(2), 9. [CrossRef]

*Vision Research,*51(7), 652–673, doi:10/djchb3. [CrossRef] [PubMed]

*Psychophysics: A practical introduction*(2nd ed.). Elsevier/Academic Press.

*Modeling psychophysical data in R*. Springer.

*MLCM: Maximum Likelihood Conjoint Measurement*, https://doi.org/10.1007/978-1-4614-4475-6.

*Journal of the Optical Society of America,*30(12), 617–645.

*R: A language and environment for statistical computing*. R Foundation for Statistical Computing.

*Journal of Open Source Software,*8(86), 5321, doi:10.21105/joss.05321.

*Journal of Mathematical Psychology,*24(1), 21–57, doi:10.1016/0022-2496(81)90034-1.

*American Journal of Psychology,*69(1), 1–25.

*Quarterly Journal of Experimental Psychology,*16(4), 387–391, doi:10.1080/17470216408416400.

*Partial independence of brightness induction and brown induction suggests a two-stage model for brightness induction*. Doctoral (PhD) Dissertation, University of Washington.

*Vision Research,*26(10), 1677–1691, doi:10.1016/0042-6989(86)90055-6. [PubMed]

*Vision Research,*32(8), 1493–1507, doi:10.1016/0042-6989(92)90205-W. [PubMed]

*Journal of Vision,*19, doi:10/f88dg7.

*Journal of Vision,*14(7), 3, doi:10.1167/14.7.3. [PubMed]

Let there be two encoding functions, one for the target “in” black, Ψ_{B} ≔ *s* → Ψ_{B}(*s*), and one for the target “in” white, Ψ_{W} ≔ *s* → Ψ_{W}(*s*).

A matching experiment measures the adjusted probe luminance (*p*) as a function of the manipulated target luminance (*t*). To do that, we further assume that Ψ_{B} is invertible (i.e., one-to-one and onto) and apply its inverse to both sides of the equation, obtaining \(p = (\Psi ^{-1}_B \circ \Psi _W) (t)\). This expresses the relationship between *t* and *p* as the *composition* of \(\Psi ^{-1}_B\) with Ψ_{W}. Analogously, when the target is “in” black and the probe “in” white, we obtain the complementary function \(p = (\Psi ^{-1}_W \circ \Psi _B) (t)\).
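When the encoding functions are known only at sampled points, as with estimated scales, the composition can still be evaluated numerically by interpolating the monotone scales. A sketch under assumed power-function scales (our illustration, not the paper's implementation):

```python
import numpy as np

# Hypothetical scales sampled at 10 luminance values; power functions stand
# in for estimated scales, and the exponents (2.0, 0.5) are illustrative.
s = np.linspace(0.05, 1.0, 10)
psi_target = s ** 2.0   # scale for the target's context
psi_probe = s ** 0.5    # scale for the probe's context

def predict_match(t):
    """Evaluate p = (Psi_probe^{-1} o Psi_target)(t): read off the target
    scale at t, then invert the (monotone) probe scale by interpolating
    luminance as a function of scale value."""
    psi_t = np.interp(t, s, psi_target)   # forward through the target scale
    return np.interp(psi_t, psi_probe, s) # inverse of the probe scale

t = np.array([0.6, 0.8, 0.95])
p_pred = predict_match(t)
p_exact = t ** (2.0 / 0.5)  # closed form for power scales: t**(ratio of exponents)
# The interpolated prediction tracks the analytical match closely.
assert np.allclose(p_pred, p_exact, atol=0.02)
```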

Now consider encoding functions from the power family, Ψ_{W} ≔ *x* → *x*^{α} and Ψ_{B} ≔ *x* → *x*^{β}, with α, β > 0. Let the target be “in” white and the probe “in” black. The inverse of the probe encoding function is \(\Psi ^{-1}_B := \psi \rightarrow \psi ^{\frac{1}{\beta }}\). Applying the same logic as above, the probe relates to the target such that \(p = \Psi ^{-1}_B(\Psi _W (t)) = (t^{\alpha })^{\frac{1}{\beta }} = t^{\frac{\alpha }{\beta }}\). Hence, the matching data depend only on the ratio of the exponents, α/β.

We define two ground truth encoding functions, one for the target in the white phase (Ψ_{W}) and one for the target in the black phase (Ψ_{B}). Encoding functions are defined as power functions of the form Ψ_{B}(*s*) = *s*^{α} and Ψ_{W}(*s*) = *s*^{β}, where the exponents α, β > 0. In White's (1979) stimulus, targets in the black phase appear brighter than targets in the white phase; therefore, α < β (e.g., Figure A1a).

For a target with luminance *t*, the perceptual variable (ψ_{t}) is calculated as a sample from a Gaussian distribution, \(\psi _t \sim N(\Psi _T(t), \sigma ^{2})\), where Ψ_{T} is the encoding function for the target (Ψ_{B} or Ψ_{W}, for targets in the black or the white phase, respectively) and the parameter σ = 0.05 denotes the amount of simulated noise.

MLDS is designed to estimate a *single* perceptual scale. It assumes that variations on a single stimulus dimension *s* map to variations on a *single* perceptual dimension Ψ(*s*). MLCM, in contrast, is designed to estimate *more than one* perceptual scale. It assumes that stimulus appearance along one perceptual dimension is determined by several physical dimensions. In our case, we assume that perceived target lightness is affected by both the luminance and the context of a target. MLCM estimates perceptual scales that map variations in luminance to variations in perceived lightness, and these scales differ between contexts. To take the internal stochasticity of human judgments into account, MLDS and MLCM include a noise parameter in the model. Noise is assumed to be constant across the perceptual dimension and is estimated along with the scale(s).

Noise can enter at the decision stage, such that the decision variable is, for example, δ = (Ψ(*s*_{3}) − Ψ(*s*_{2})) − (Ψ(*s*_{2}) − Ψ(*s*_{1})) + ϵ for an MLDS trial in which three stimuli need to be compared. Alternatively, the mapping to the internal dimension itself can be noisy, and hence the same stimulus magnitude would be associated with a distribution of perceptual magnitudes, as illustrated by the Gaussian curves on the y-axis in Figure A4. Because noise is assumed to be Gaussian and of equal variance along the internal dimension, both scenarios are equivalent.
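The equivalence can be verified in simulation. For a triad, the decision variable weights the three encoded magnitudes by (1, −2, 1), so three independent *N*(0, σ^{2}) perturbations combine to a single Gaussian with variance 6σ^{2}; decision-stage noise with that matched variance produces the same distribution. A sketch with hypothetical scale values:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n = 0.05, 200_000
psi1, psi2, psi3 = 0.2, 0.5, 0.7  # hypothetical scale values for one triad
mean_delta = (psi3 - psi2) - (psi2 - psi1)

# Scenario A: noise added at the decision stage, with matched variance.
# Weights (1, -2, 1) on three independent encodings give variance 6*sigma**2.
d_decision = mean_delta + rng.normal(0.0, np.sqrt(6) * sigma, n)

# Scenario B: independent Gaussian noise on each encoded magnitude.
d_encoding = ((psi3 + rng.normal(0, sigma, n))
              - 2 * (psi2 + rng.normal(0, sigma, n))
              + (psi1 + rng.normal(0, sigma, n)))

# Same mean and spread of the decision variable -> same choice probabilities.
assert abs(d_decision.mean() - d_encoding.mean()) < 0.002
assert abs(d_decision.std() - d_encoding.std()) < 0.002
```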

In an MLDS trial (Figure A4a), the observer judges whether Ψ(*s*_{9}) − Ψ(*s*_{4}) or Ψ(*s*_{4}) − Ψ(*s*_{2}) is perceived as bigger. Judgments are repeated several times to obtain a relative frequency of choosing one interval or the other. This frequency indicates the perceived magnitude of the intervals. A frequency of 0.5 indicates that both intervals are perceived to be approximately equal, whereas a frequency close to 1 indicates that the second interval is perceived as larger than the first. In the example in Figure A4a, the relative frequency is approximately 0.65, so the second interval is judged to be slightly greater than the first. By repeatedly presenting different triads and combining them all in the same statistical model, MLDS estimates the parameters of Ψ(*s*) that best explain the data (i.e., maximize the likelihood of the data).
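A minimal sketch of this estimation logic (ours, not the MLDS/MLCM package implementation; for simplicity the noise is assumed known and the scale endpoints are anchored at 0 and 1 for identifiability):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)

# Ground-truth scale at 7 stimulus levels (hypothetical power function).
s = np.linspace(0.1, 1.0, 7)
psi_true = s ** 2.0
sigma = 0.1  # decision noise, assumed known here

# All ordered triads (i < j < k), each presented 30 times.
triads = [(i, j, k) for i in range(7) for j in range(i + 1, 7)
          for k in range(j + 1, 7)] * 30
idx = np.array(triads)

# Simulated responses: "upper interval bigger" when the noisy delta > 0.
delta = (psi_true[idx[:, 2]] - psi_true[idx[:, 1]]) \
        - (psi_true[idx[:, 1]] - psi_true[idx[:, 0]])
resp = (delta + rng.normal(0.0, sigma, len(delta))) > 0

def neg_log_lik(free):
    # Scale anchored at 0 and 1; interior values are the free parameters.
    psi = np.concatenate(([0.0], free, [1.0]))
    d = (psi[idx[:, 2]] - psi[idx[:, 1]]) - (psi[idx[:, 1]] - psi[idx[:, 0]])
    p = norm.cdf(d / sigma).clip(1e-9, 1 - 1e-9)
    return -np.where(resp, np.log(p), np.log(1 - p)).sum()

fit = minimize(neg_log_lik, x0=np.linspace(0.1, 0.9, 5),
               method="Nelder-Mead", options={"maxiter": 5000})
psi_hat = np.concatenate(([0.0], fit.x, [1.0]))

# The recovered scale should approximate the normalized ground truth.
psi_true_norm = (psi_true - psi_true[0]) / (psi_true[-1] - psi_true[0])
assert np.max(np.abs(psi_hat - psi_true_norm)) < 0.1
```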

Crucially, in MLDS observers compare *intervals*. These judgments contain easy and difficult comparisons, and hence produce relative frequencies across the entire range from ceiling (*p* = 0 or *p* = 1) to guessing (*p* = 0.5).

In Figure A4b, the first comparison is between *s*_{4} and *s*_{5} (black context). The stimuli elicit perceptual responses Ψ_{B}(*s*_{5}) and Ψ_{B}(*s*_{4}). The comparison is repeated several times and produces a relative frequency of judging *s*_{5} lighter than *s*_{4} of *p* = 0.75. The second comparison is between *s*_{2} and *s*_{3} (white context).

Here, *s*_{3} is perceived lighter than *s*_{2} with a relative frequency of *p* = 0.9. These two examples illustrate how the response frequencies determine the shape of each scale: the local slope of Ψ_{B} needs to be shallower than that of Ψ_{W} to produce a smaller (less discriminable) perceptual interval between two stimuli. MLCM also includes comparisons across contexts. An example is shown in the right panel of Figure A4b. When *s*_{4} is shown in black and *s*_{7} in white, observers judge *s*_{7} as lighter with a relative frequency of *p* = 0.75. Because this frequency is the same as for the “within” pair (*s*_{4} and *s*_{5}) in the black context, the perceptual interval should be of the same size. Considering all within- and across-context comparisons in the same statistical model, MLCM estimates scale values for which these interval relationships are preserved.
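The interval logic can also be run forward: given scale values and a noise parameter, the predicted choice frequency for any comparison is the cumulative normal of the scaled interval. A sketch with hypothetical scale values and σ (assuming decision-stage noise):

```python
import numpy as np
from scipy.stats import norm

sigma = 0.1  # hypothetical decision noise

def choice_prob(psi_a, psi_b, sigma=sigma):
    """Predicted frequency of judging stimulus b lighter than a, assuming a
    Gaussian decision variable delta = psi_b - psi_a + N(0, sigma**2)."""
    return norm.cdf((psi_b - psi_a) / sigma)

# Equal perceptual intervals predict equal choice frequencies,
# whether the comparison is within or across contexts:
within = choice_prob(0.30, 0.3674)  # e.g., Psi_B(s4) vs. Psi_B(s5)
across = choice_prob(0.50, 0.5674)  # e.g., Psi_B(s4) vs. Psi_W(s7)
assert abs(within - across) < 1e-12
# An interval of about 0.674*sigma corresponds to p = 0.75, as in the example:
assert abs(choice_prob(0.0, 0.674 * sigma) - 0.75) < 0.001
```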