Box 3. Analysis methods for serial dependence
The analysis of serial dependence involves quantifying systematic relationships between the decision on the current trial and the history of events on preceding trials. The exact pipeline depends on the type of task and data (e.g., adjustment responses or forced choices). Here we provide a brief description of standard approaches used in different contexts.
Adjustment tasks
For simplicity, we will use orientation adjustment tasks as an example, but the analyses described here equally apply to color, direction, or any other continuous stimulus.
Data preparation. Adjustment errors are computed as the angular (or linear) difference between the reported and the actual orientation in each trial. Typically, adjustment errors are widely scattered. To correct for outlier errors and unlikely reports, trials are often removed according to an arbitrary cutoff (e.g., 30°; Cicchini et al., 2018), standard-deviation-based thresholds (e.g., 3 SD; Fritsche et al., 2017), or other parametric approaches (e.g., Grubbs' test; Pascucci et al., 2019). Adjustment times are also used to clean the data, for instance, by removing trials with responses faster than 500 ms or slower than 10 s (Ceylan et al., 2021). In some cases, the trial following an outlier (e.g., a guess) is also removed, since no meaningful serial effects can be expected after a trial in which the observer had a lapse of attention or guessed the response. An additional step is to demean the cleaned errors, removing any systematic tendency to reproduce the stimulus in one direction or another (e.g., CW or CCW). Because adjustment errors are prone to other sources of bias, including, for instance, anisotropies and category-boundary effects in orientation estimation, some authors also correct for nonlinearities in the stimulus space (Manassi et al., 2018; Pascucci et al., 2019; van Bergen & Jehee, 2019). This has been done with different methods, from fitting sinusoidal or polynomial functions to the errors over stimulus space, to removing known biases, such as the cardinal (orientation) bias or category biases in color space. Standard, validated approaches for this step have yet to be established.
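As an illustration, the cleaning steps above can be sketched in Python. This is a minimal sketch with hypothetical helper names; the 30° cutoff, the 500 ms and 10 s reaction-time limits, and the 180° orientation period are assumptions taken from the examples cited above.

```python
import numpy as np

def circ_diff(a, b, period=180.0):
    """Signed angular difference a - b, wrapped to (-period/2, period/2]."""
    d = (np.asarray(a) - b) % period
    return np.where(d > period / 2, d - period, d)

def prepare_errors(reported, actual, rt, cutoff=30.0, rt_min=0.5, rt_max=10.0):
    """Compute adjustment errors and a boolean mask of trials to keep."""
    err = circ_diff(reported, actual)
    keep = (np.abs(err) < cutoff) & (rt > rt_min) & (rt < rt_max)
    # drop the trial following an outlier, where a lapse or guess
    # makes serial effects uninterpretable
    keep &= ~np.concatenate(([False], ~keep[:-1]))
    # demean the cleaned errors to remove a systematic CW/CCW tendency
    return err - err[keep].mean(), keep
```

The circular difference keeps errors in a symmetric range around zero, so that demeaning and cutoff-based exclusion behave sensibly for orientation data.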
A largely unaddressed issue concerns the influence of swap errors on serial dependence. Swap errors occur when an observer reports the wrong item in a working memory task. In serial dependence, an observer might report the previous instead of the current target simply because of a "swap," rather than because of a combination of the prior and current percept. Almeida, Barbosa, and Compte (2015) reported that, in a typical working memory task, swaps and attractive biases co-occur. Because a swap manifests as a complete (100%) bias toward the previous stimulus, even a few swaps could create the appearance of spurious serial dependence.
Model fitting. The magnitude of serial dependence in adjustment errors is computed by fitting the first derivative of a Gaussian function (abbreviated as DoG) to the errors as a function of Δ (the difference between the previous and the current stimulus feature). The typical form of this function is
\begin{eqnarray*}y = \Delta \alpha w c\, e^{ - {{\left( {w\Delta } \right)}^2}}\end{eqnarray*}
where y is the adjustment error (single trial, or averaged over trials and smoothed over Δ); α is the amplitude of the DoG curve, multiplied by the constant \(c = \sqrt 2 /{e^{ - 0.5}}\), which scales the amplitude to the curve peak in y units (e.g., degrees); and w is the inverse of the curve width. Note that, besides approximating the main pattern, the form of this function also reflects two important aspects of serial dependence: it can be decomposed into a linear component Δα, which implies a systematic relationship between errors and previous stimuli, and a Gaussian weighting component, centered on 0, which accounts for the fading of the relationship as Δ increases.
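A minimal sketch of this function and its fit, assuming Python with NumPy/SciPy; the synthetic data, parameter values, and use of least-squares `curve_fit` are illustrative choices, not taken from any of the cited studies.

```python
import numpy as np
from scipy.optimize import curve_fit

C = np.sqrt(2) / np.exp(-0.5)  # scales alpha to the curve's peak height

def dog(delta, alpha, w):
    """First derivative of a Gaussian: peak height alpha, inverse width w."""
    return delta * alpha * w * C * np.exp(-(w * delta) ** 2)

# illustrative fit on synthetic errors generated with alpha = 2 deg, w = 0.02
rng = np.random.default_rng(0)
delta = rng.uniform(-90, 90, 2000)
errors = dog(delta, 2.0, 0.02) + rng.normal(0, 3, delta.size)
(alpha_hat, w_hat), _ = curve_fit(dog, delta, errors, p0=[1.0, 0.05])
```

The constant C ensures that the fitted α can be read directly as the peak bias in degrees, which is why amplitudes from different studies are comparable in those units.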
The parameter most commonly used to quantify the magnitude of serial dependence is the amplitude (or half-amplitude) of the DoG function (α). While α is usually estimated on data aggregated across many subjects, the pitfalls of aggregation can be avoided by expanding the DoG fitting procedure to allow for variability between individual participants in a mixed-effects model (Pascucci et al., 2019). Another modeling approach is to fit a hierarchical Bayesian model to the data (Sadil et al., 2021), which additionally accounts for individual differences and rotational biases.
As mentioned throughout the article, serial dependence patterns also contain a combination of attractive and repulsive components, an aspect that simple DoG curves fail to capture. Alternative functions have been proposed to overcome this limitation (Bliss et al., 2017). Given the putatively distinct nature of these opposite biases, a plausible approach would be to use the difference between two Gaussian-derivative functions, one accounting for the positive and one for the negative component (see Figures 3A,B), that is, the classic "Mexican-hat" profile. This, however, comes at the expense of model complexity.
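Such a two-component model could be sketched as the difference of a narrow and a broad DoG curve; the parameterization and component names below are our own illustration, not a published model.

```python
import numpy as np

C = np.sqrt(2) / np.exp(-0.5)

def dog(delta, alpha, w):
    """Single DoG component (peak height alpha, inverse width w)."""
    return delta * alpha * w * C * np.exp(-(w * delta) ** 2)

def double_dog(delta, a_att, w_att, a_rep, w_rep):
    """Narrow attractive DoG minus a broader one, so the bias is positive
    (attraction) for small delta and negative (repulsion) at large delta.
    Hypothetical parameterization, for illustration only."""
    return dog(delta, a_att, w_att) - dog(delta, a_rep, w_rep)
```

With four free parameters instead of two, such a model needs more data per observer to fit stably, which is the complexity cost mentioned above.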
Model-free approaches. An alternative to model fitting is to compute the average error in a range of Δ values, typically close to zero. For example, subtracting the average error for Δ in the corresponding negative range from the average error for Δ in the 1° to 25° range (Samaha et al., 2019) quantifies the systematic deviation of errors from zero; this deviation or "bias" is positive for attractive serial dependence and negative for repulsion. This approach is a straightforward way of quantifying serial dependence with few assumptions, and is particularly useful when limited data points are available, as is often the case in analyses of a single observer. Restricting the analysis to values of Δ close to zero is reasonable when serial dependence effects are expected in this range. A disadvantage of this approach, which also applies to the DoG fitting procedure described above, is that it cannot capture more complex patterns across the entire Δ range: while attractive biases are typically present for small Δ, repulsive biases appear at larger Δ. One way to address this problem is to use more than a single bin, discretizing Δ into several distances between the previous and present stimulus (e.g., three to four bins from "close" to "far") and then analyzing the data with standard repeated-measures approaches.
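The single-bin version of this measure reduces to a few lines; this is a sketch with a hypothetical function name, using the 1° to 25° range from the example above as its default.

```python
import numpy as np

def bias_near_zero(delta, errors, lo=1.0, hi=25.0):
    """Mean error for delta in (lo, hi] minus mean error in [-hi, -lo);
    positive values indicate attraction, negative values repulsion."""
    pos = errors[(delta > lo) & (delta <= hi)].mean()
    neg = errors[(delta < -lo) & (delta >= -hi)].mean()
    return pos - neg
```

A multi-bin variant would simply call this (or a plain mean of errors) per distance bin and feed the per-subject bin values into a repeated-measures analysis.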
Statistical analysis
Statistical testing is commonly performed using a permutation approach on the parameter of interest, typically the half-amplitude α of the DoG function (Fischer & Whitney, 2014; Fritsche et al., 2017). A null distribution of α is obtained by fitting the DoG function to a large number of randomized data sets, generated by randomly flipping the sign of the error in each trial (or by shuffling the correspondence between single-trial Δ and errors). The proportion of such data sets yielding a parameter more extreme than that obtained from the actual data gives p, which can be interpreted in the same way as the p-value in standard frequentist statistics. Comparisons between conditions can be performed with a similar permutation approach, shuffling the condition labels for each randomized data set. This approach has often been used for data aggregated from multiple subjects. In the framework of nonlinear mixed-model analysis, the final parameter estimates and their uncertainty can be used to quantify statistical significance (Pascucci et al., 2019).
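The sign-flipping scheme can be sketched generically; `fit_stat` below stands in for any fitting routine returning the parameter of interest (e.g., the DoG amplitude), and the function name and two-sided convention are our own illustrative choices.

```python
import numpy as np

def permutation_p(delta, errors, fit_stat, n_perm=1000, seed=0):
    """Two-sided permutation p-value for a statistic fit_stat(delta, errors).
    The null distribution is built by randomly flipping the sign of the
    error in each trial, which destroys any systematic error-delta relation."""
    rng = np.random.default_rng(seed)
    observed = fit_stat(delta, errors)
    null = np.empty(n_perm)
    for i in range(n_perm):
        flips = rng.choice([-1.0, 1.0], size=errors.size)
        null[i] = fit_stat(delta, errors * flips)
    return np.mean(np.abs(null) >= np.abs(observed))
```

Replacing the sign flips with a shuffle of the error vector relative to Δ implements the alternative randomization mentioned above.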
Forced choice
When the task involves a choice between two alternatives, results are typically analyzed with a psychometric function, in which the stimulus value predicts the response. Serial dependence can then be quantified as a shift in the threshold or point of subjective equality (PSE) of the psychometric function. For example, when judging which of two stimuli is tilted more clockwise, the PSE corresponds to the point of "no difference." A shift in the PSE can indicate whether the perception of the current stimulus has been affected by a preceding inducer, with attractive or repulsive biases depending on the direction of the shift (Fritsche et al., 2017). Beyond the PSE, more sophisticated statistical models have also been proposed to quantify or account for serial biases in psychophysical forced-choice tasks (Gekas et al., 2019).
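A basic PSE estimate might be sketched as a cumulative-Gaussian fit to binary responses; the synthetic observer, the +3° shift, and the lapse-free two-parameter function are illustrative assumptions, not a published analysis.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

def psychometric(x, pse, slope):
    """Cumulative Gaussian: P("test more clockwise") as a function of the
    test-minus-reference orientation x; pse is the 50% point (sketch,
    no lapse-rate parameter)."""
    return 0.5 * (1.0 + erf((x - pse) / (slope * np.sqrt(2.0))))

# synthetic observer whose PSE is shifted by +3 deg after an inducer
rng = np.random.default_rng(1)
x = np.repeat(np.arange(-12.0, 13.0, 3.0), 60)
resp = (rng.random(x.size) < psychometric(x, 3.0, 4.0)).astype(float)
(pse_hat, slope_hat), _ = curve_fit(psychometric, x, resp, p0=[0.0, 5.0])
```

Comparing PSEs fitted separately for trials following CW versus CCW inducers then yields the attractive or repulsive shift described above.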