Ideally, hypotheses would be tested in random-effects models, which estimate not the individual effect sizes but their distribution in the population. Unfortunately, this approach, which involves numerical evaluation of integrals or Markov chain Monte Carlo simulation (e.g., Tuerlinckx, Rijmen, Verbeke & De Boeck, 2006), would be unreasonably computationally expensive for models with a large number of parameters. We will rely on other meta-analytical principles instead. First, consider the earlier hypothesis that a threshold parameter
$\mu$ in a psychometric function is 0. In the testing procedure, we would fit both the unconstrained and the constrained ($\mu = 0$) model per subject $j$, where $j = 1, \ldots, J$. The maximum likelihoods of both models would be transformed into $J \times 2$ deviances. Under the null hypothesis, the differences between these, $\Delta D_j = D_{M_0,j} - D_{M_1,j}$, approximately follow a $\chi^2_{df=1}$ distribution. Distribution theory states that the sum of independent $\chi^2$-distributed variables is also $\chi^2$ distributed, with degrees of freedom equal to the sum of the degrees of freedom of the constituent distributions. In short, $\Delta D_j \sim \chi^2_{df=k} \Rightarrow \sum_{j=1}^{J} \Delta D_j \sim \chi^2_{df=Jk}$. This is a very useful result for establishing a “global effect” criterion: if the
aggregated deviance difference exceeds the .95 quantile of a $\chi^2_{df=Jk}$ distribution, a “global effect” exists. Besides serving as a welcome summary of the global pattern in the observers' data, the combination of deviance differences provides a large increase in statistical power compared to the individual fits. On the other hand, this does not allow us to discard the individual deviance differences or
p-values from our discussion. It is quite possible that one of our volunteers is an outlier and is subject to a completely idiosyncratic effect. In that case, the global deviance difference will consist largely of the deviance difference of a single individual; in other words, while most of the deviance differences would lie in the “body” of the $\chi^2_{df=k}$ distribution, one deviance difference would lie far in the “tail”. A case like this is easily discovered by jackknifing: leaving out each participant in turn, recalculating the aggregated deviance difference, and re-evaluating global significance. Another useful tool for diagnosing this type of situation is the quantile plot, in which the ordered obtained deviance differences are plotted against their expected values in a sample of size $J$ from a $\chi^2_{df=k}$ distribution. One can also plot the p-values, which should be approximately uniformly distributed on the interval [0, 1] if the null hypothesis is true. If all deviance-difference points cluster strongly toward one side, there is a global effect. If the points are scattered and tend, in median, toward the middle, there is a global absence of effect. If all points are reasonably close to the middle of the theoretical distribution but one approaches the boundary or lies far in the tail, we have an outlier. If one group of points lies near the median and another near the extreme, the participants fall into two subgroups.
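
The aggregation and jackknife steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analyses: the function names (`chi2_sf`, `global_test`, `jackknife`) and the deviance differences are hypothetical, k is assumed to be a positive integer, and the deviance differences are assumed to be independent across subjects. Comparing the aggregated deviance difference with the .95 quantile is expressed here through the equivalent condition p < .05, and each jackknife sample is referred to a distribution with (J − 1)k degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Closed-form starting values of the regularized upper incomplete gamma Q(a, h)
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)               # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))    # Q(1/2, h)
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)

def global_test(delta_d, k=1, alpha=0.05):
    """Sum the per-subject deviance differences and refer the sum to chi-square(J * k)."""
    total = sum(delta_d)
    df = len(delta_d) * k
    p = chi2_sf(total, df)
    return total, df, p, p < alpha

def jackknife(delta_d, k=1, alpha=0.05):
    """Leave out each subject in turn and re-evaluate global significance."""
    return [global_test(delta_d[:j] + delta_d[j + 1:], k, alpha)
            for j in range(len(delta_d))]

# Hypothetical deviance differences for J = 6 subjects (k = 1); subject 5 is extreme.
delta_d = [0.4, 1.1, 0.2, 0.9, 14.0, 0.6]

total, df, p, significant = global_test(delta_d)
print(f"aggregated: {total:.2f} on {df} df, p = {p:.4f}, global effect: {significant}")

for j, (t, d, pj, sig) in enumerate(jackknife(delta_d), start=1):
    print(f"without subject {j}: {t:.2f} on {d} df, p = {pj:.4f}, global effect: {sig}")

# Individual p-values, e.g. for a uniform quantile plot
individual_p = [chi2_sf(d, 1) for d in delta_d]
```

With these made-up numbers, the global test is significant for the full sample but loses significance only when the fifth subject is left out, which is the pattern the text describes for a single outlier driving the aggregated deviance difference.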