We performed four hierarchical Bayesian data analyses: Model-1 for the fixed-physical-size session of Experiment 1, Model-2 for the fixed-projected-size session of Experiment 1 (Figure 6), Model-3 for the closest-distance conditions in both sessions of Experiment 1 (Figure 7), and Model-4 for Experiment 2 (Figure 10). Except for a minor modification involving Model-3 that will be clarified below, all four models used the same structure and the same priors, as specified diagrammatically in Figure A1.
It is useful to conceptualize this graphical structure as a Bayesian analog of one-way ANOVA. The experimental design is counterbalanced across participants \(i\), experimental conditions \(k\), and replications \(j\), as indicated by the nested plates in Figure A1. Each of the 10 polyhedral objects was presented three times for a total of 30 replications per condition. The four models were individuated by their experimental conditions: In Model-1 and Model-2 these were the three viewing distances for the reference object. Model-3 had only two conditions: fixed physical size (\(k = 1\)) versus fixed projected size (\(k = 2\)). Finally, the conditions in Model-4 corresponded to the three sizes of the reference object in Experiment 2. One behavioral observation was collected per trial: the log relative aspect ratio (cf. Equation 1 in the main text). This quantity is referred to as “adjustment” throughout this Appendix. It enters the Bayesian model via the random variable \({y_{ijk}}\) in the innermost plate. It is assumed that all observations for a given participant \(i\) and a given condition \(k\) are drawn independently from a Gaussian distribution with mean \({\mu _{ik}}\) and standard deviation \({\sigma _{ik}}\) (intermediate plate in Figure A1). To accommodate individual differences, the model includes idiosyncratic means and standard deviations in each condition.
The hierarchical structure of the model is designed to accommodate both individual differences and commonalities within a single condition at the same time. The individual-level parameters that govern the distribution of the observable data reflect individual differences. They are sampled in turn from group-level distributions governed by group-level parameters, which reflect commonalities across participants within a given condition. For example, consider the individual differences in the variability of adjustments across the 30 replications in a given condition, which are manifested in the unequal widths of the error bars in Figures 5 and 9. These individual differences are modeled by the random variable \({\sigma _{ik}}\), whose natural logarithm \({\lambda _{ik}}\) is sampled from a group-level Gaussian distribution with group-level parameters \({\mu ^{{\lambda _k}}}\) and \({\sigma ^\lambda }\). More importantly, there are individual differences in overall adjustment level: the individual profiles in Figures 5 and 9 “float” up and down relative to each other. The model accounts for them by partitioning each individual-level mean (\({\mu _{ik}}\)) into two parts:
\begin{equation}{\mu _{ik}} = {\beta _i} + {\theta _{ik}}\end{equation}
where \({\beta _i}\) is the grand mean of Participant \(i\)’s adjustments across all experimental conditions, and \({\theta _{ik}}\) is Participant \(i\)’s deflection in the \(k\)th condition from his/her own grand mean. The random variables \({\beta _i}\) for participants \(i = 1, 2, \ldots, 12\) are sampled independently from a common group-level Gaussian distribution with group-level parameters \({\mu ^\beta }\) and \({\sigma ^\beta }\). The individual deflections \({\theta _{ik}}\) are sampled from Gaussian distributions with common standard deviation \({\sigma ^\theta }\). Importantly, the latter distributions have different means \({\mu ^{{\theta _k}}}\) that characterize the respective condition \(k\).
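Continuing the sketch, the individual-level parameters just described could be declared in JAGS roughly as follows, with muBeta, sigmaBeta, muTheta, sigmaTheta, muLambda, and sigmaLambda standing for \({\mu ^\beta }\), \({\sigma ^\beta }\), \({\mu ^{{\theta _k}}}\), \({\sigma ^\theta }\), \({\mu ^{{\lambda _k}}}\), and \({\sigma ^\lambda }\). Again, this is an illustrative fragment rather than the code actually used.
\begin{verbatim}
for (i in 1:Nsubj) {
  beta[i] ~ dnorm(muBeta, pow(sigmaBeta, -2))         # participant's grand mean
  for (k in 1:Ncond) {
    theta[i,k]  ~ dnorm(muTheta[k], pow(sigmaTheta, -2))  # condition deflection
    mu[i,k]     <- beta[i] + theta[i,k]                   # partition of the mean
    lambda[i,k] ~ dnorm(muLambda[k], pow(sigmaLambda, -2))
    sigma[i,k]  <- exp(lambda[i,k])                       # SD is exp of its log
  }
}
\end{verbatim}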
The group-level parameters \({\mu ^{{\theta _k}}}\) are of main interest in the current analyses. They are analogous to the main effect of the condition factor in traditional ANOVA. Specifically, \({\mu ^{{\theta _k}}}\) is the group-averaged deflection from the grand mean in experimental condition \(k\), and thus estimates the effect of the experimental manipulation after controlling for individual differences. One technical challenge that arises at this point is to enforce the sum-to-zero constraint: ideally, the sum of the deflections \({\theta _{ik}}\) across all conditions should equal zero for any given participant \(i\). To simplify the computation, this sum-to-zero constraint was approximated at the group level by enforcing
\begin{equation}{\mu ^{{\theta _2}}} = - \left( {{\mu ^{{\theta _1}}} + {\mu ^{{\theta _3}}}} \right)\end{equation}
where \({\mu ^{{\theta _1}}}\), \({\mu ^{{\theta _2}}}\), and \({\mu ^{{\theta _3}}}\) are the group means of \({\theta _{ik}}\) for \(k = 1\), 2, and 3, respectively. In Model-3, which has only two conditions, this reduced to \({\mu ^{{\theta _2}}} = - {\mu ^{{\theta _1}}}\).
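In JAGS, such a group-level approximation can be encoded by declaring one condition mean as a deterministic node rather than a free parameter, for example (an illustrative fragment, with muTheta again standing for \({\mu ^{{\theta _k}}}\)):
\begin{verbatim}
# Group-level sum-to-zero approximation (three-condition models);
# in Model-3 the analogous line would be  muTheta[2] <- -muTheta[1]
muTheta[2] <- -(muTheta[1] + muTheta[3])
\end{verbatim}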
Instead of placing priors on \({\mu ^{{\theta _k}}}\) directly, we placed priors on their effect sizes, as suggested by Lee and Wagenmakers (2014). The effect sizes were denoted \({d^{{\theta _k}}}\) and defined as \({d^{{\theta _k}}} = {\mu ^{{\theta _k}}}/{\sigma ^\theta }\). Following common practice (e.g., Rouder, Speckman, Sun, Morey, & Iverson, 2009), we used the standard Gaussian as the prior on the effect sizes in our model. Note that because the standard deviation \({\sigma ^\theta }\) is common to all conditions, the sum-to-zero constraint applies to the effect sizes \({d^{{\theta _k}}}\) as well as to the group means \({\mu ^{{\theta _k}}}\). For the other group-level parameters, which are not the focus of the current analysis, we used priors that carry very little information so that the results of the analysis would be driven largely by the data. We placed weakly informative Gaussian priors on the group means other than \({\mu ^{{\theta _k}}}\), and noninformative priors with a large range, Uniform(0, 10), on the group-level standard deviations, as suggested by Gelman (2006).
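A sketch of how these priors might be written in JAGS is given below. The standard Gaussian corresponds to dnorm(0, 1) because JAGS specifies the normal distribution by mean and precision. The scale of the weakly informative Gaussian priors on the remaining group-level means is not reported in this Appendix, so the value used below (SD = 10, i.e., precision 0.01) is only a placeholder assumption.
\begin{verbatim}
# Standard-Gaussian priors on the effect sizes of the free condition means,
# which induce the priors on muTheta via the common SD sigmaTheta.
dTheta[1] ~ dnorm(0, 1)
dTheta[3] ~ dnorm(0, 1)
muTheta[1] <- dTheta[1] * sigmaTheta
muTheta[3] <- dTheta[3] * sigmaTheta

# Weakly informative Gaussian priors on the other group-level means
# (the precision 0.01, i.e., SD = 10, is our assumption, not the paper's value).
muBeta ~ dnorm(0, 0.01)
for (k in 1:Ncond) {
  muLambda[k] ~ dnorm(0, 0.01)
}

# Noninformative priors with a large range on the group-level standard
# deviations (Gelman, 2006).
sigmaBeta   ~ dunif(0, 10)
sigmaTheta  ~ dunif(0, 10)
sigmaLambda ~ dunif(0, 10)
\end{verbatim}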
The models were implemented in JAGS (Plummer, 2003). The results for each model were based on two Markov chain Monte Carlo (MCMC) chains, each consisting of 9,000 samples collected after a burn-in period of 1,000 samples. Convergence of the chains was confirmed by visually examining the trace plots of all group-level parameters. The Bayesian posterior predictive distributions fit the corresponding distributions of the observations well, indicating good model fit.
In Bayesian statistics, the reliability of an effect can be evaluated by the degree of separation among the posterior distributions of the estimates under different levels of the manipulation. Specifically, we used the 95% highest density interval (HDI) of a distribution to characterize the range of estimates that is credible given the data and the model assumptions (Kruschke, 2015). Figures 6, 7, and 10 plot the HDIs of the group-level effect sizes \({d^{{\theta _k}}}\) estimated from the corresponding data sets.