All eight pairs of trainees completed the first three rounds of training and transfer measurement (Figure 2, top), whereas six out of these eight pairs also completed the additional fourth round (Figure 2, bottom). In the Appendix, data from every individual trainee are shown.
Our focus in the hypothesis testing was whether the transfer performance was dependent on training. Specifically, we asked whether there was any interaction effect in the following 2 × 2 analysis of variance (ANOVA): Training (4° vs. 8°) × Transfer (4° vs. 8°). To anticipate, using a variety of measures, we found that the transfer performance was consistently dependent on whether training was 4° or 8° discrimination. Specifically, for 8° transfer discrimination, the 8° trainees discriminated better than their 4° counterparts. The opposite was true for 4° transfer discrimination. The detailed analyses are as follows.
Although the trainees were paired, the pairing served primarily to ensure that the two training groups were balanced in training schedule, training motion directions, and trainee gender. Such pairing, however, did little to reduce individual differences, which are typically large in motion perceptual learning. We therefore conducted an ANOVA with 16 individual subjects rather than 8 pairs. (With pairs as the unit of analysis, the ANOVA gave slightly larger p values because the degrees of freedom were reduced from 14 to 7.) A two-way ANOVA was performed with training (4° vs. 8°) and transfer (4° vs. 8°) as the main factors. The dependent variable was the amount of d′ improvement along the transfer direction from the first to the last measurement. The main effect of training was not significant, F(1, 14) < 1. The main effect of transfer was highly significant, F(1, 14) = 73.29, p ≪ 0.001, which is not surprising, since 8° discrimination was easier than 4° discrimination. Importantly, the interaction was significant, F(1, 14) = 9.00, p = 0.01. This means that transfer was dependent on the training difficulty (Figure 3). Similar results were obtained if the last d′ measurement in transfer, rather than d′ improvement, was used. The interaction was significant, F(1, 14) = 7.19, p < 0.02.
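The interaction test above can be illustrated with a minimal sketch. It assumes a mixed design in which training is a between-subjects factor and transfer is a within-subjects factor, which is consistent with each trainee contributing both transfer measures and with the 14 error degrees of freedom. Under that assumption, the interaction F in a 2 × 2 design equals the square of an independent-samples t statistic computed on each trainee's difference score (8° minus 4°). The function name `interaction_f` and all data values are hypothetical and are not taken from the study.

```python
import math

def interaction_f(group_a, group_b):
    """Interaction F for a 2 x 2 mixed design (assumed design, see lead-in).

    Each group is a list of (score_4deg, score_8deg) tuples, one per trainee.
    Returns (F, error degrees of freedom).
    """
    def diffs(group):
        # Within-trainee difference between the two transfer conditions.
        return [s8 - s4 for s4, s8 in group]

    def mean(xs):
        return sum(xs) / len(xs)

    def sum_sq(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs)

    da, db = diffs(group_a), diffs(group_b)
    df = len(da) + len(db) - 2
    pooled_var = (sum_sq(da) + sum_sq(db)) / df
    se = math.sqrt(pooled_var * (1 / len(da) + 1 / len(db)))
    t = (mean(da) - mean(db)) / se
    return t * t, df  # F(1, df) equals t squared

# Hypothetical d' improvements: (4 deg transfer, 8 deg transfer) per trainee.
train_4 = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
train_8 = [(0.0, 2.0), (0.0, 3.0), (0.0, 4.0)]
f_value, df = interaction_f(train_4, train_8)
```

With eight trainees per group, the same function would return 14 error degrees of freedom, matching the F(1, 14) reported above.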
We then examined another measure of transfer, normalized improvement, defined as (final d′ − pretraining d′)/(pretraining d′) in the transfer task. The interaction effect was not significant, F(1, 14) < 1. Upon a closer look, however, the large variance in the data was mainly due to a single trainee, YNN. YNN's pretraining 4° discrimination d′ was only 0.26, giving rise to a normalized improvement larger than that of any other trainee (Figure 4). After removing this data point, we found that the interaction became significant, F(1, 13) = 7.39, p < 0.02.
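The sensitivity of normalized improvement to a small pretraining d′ can be seen in a short sketch. The pretraining value of 0.26 is the one reported above; the gain of 1.0 and the comparison baseline of 1.0 are hypothetical values chosen only for illustration, and rejecting non-positive baselines is a design choice of this sketch rather than something stated in the text.

```python
def normalized_improvement(pre_d, final_d):
    """(final d' - pretraining d') / pretraining d', as defined in the text."""
    if pre_d <= 0:
        # Assumption of this sketch: the ratio is not meaningful for
        # a zero or negative baseline.
        raise ValueError("pretraining d' must be positive")
    return (final_d - pre_d) / pre_d

gain = 1.0  # hypothetical absolute d' improvement, identical for both cases

low_baseline = normalized_improvement(0.26, 0.26 + gain)     # baseline from the text
typical_baseline = normalized_improvement(1.0, 1.0 + gain)   # hypothetical baseline
```

The same absolute gain yields a much larger normalized value when the baseline is 0.26, which is why a single trainee with a low starting point can dominate the variance.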
Next, we looked at all d′ scores throughout the experiment. Each trainee had four or five d′ scores for the 4° transfer discrimination, from which a linear slope was obtained. A linear slope for the 8° transfer discrimination was similarly obtained for each trainee. A similar 2 × 2 ANOVA was performed using these slope data. The interaction was again significant, F(1, 14) = 9.12, p < 0.01. The slope of the 8° transfer performance for the 8° trainees was numerically greater than for the 4° trainees (0.034 vs. 0.028). The slope of the 4° transfer performance was numerically greater for the 4° trainees than for the 8° trainees (0.021 vs. 0.017) (Figure 5).
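A per-trainee slope of this kind can be obtained by ordinary least squares. The sketch below is one way to do it. The text does not state what the horizontal axis was, so using the measurement index (1, 2, 3, ...) as the default is an assumption, and the function name `ols_slope` and the scores are hypothetical.

```python
def ols_slope(ys, xs=None):
    """Least-squares slope of ys against xs.

    If xs is omitted, the measurement index 1..n is used (an assumption;
    the source does not specify the x-axis unit).
    """
    if xs is None:
        xs = list(range(1, len(ys) + 1))
    n = len(ys)
    if n < 2 or len(xs) != n:
        raise ValueError("need at least two points and matching lengths")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical d' scores for one trainee across four measurements.
scores = [1.0, 1.5, 2.0, 2.5]
slope = ols_slope(scores)
```

Passing explicit `xs` (for example, session numbers or days) changes the unit of the slope but not the procedure.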
We then correlated the transfer slopes with the training slopes. Each trainee contributed two transfer slopes (on 4° and 8° discriminations) and one training slope (on either 4° or 8° discrimination). The data are shown in Figure 6. Each of the four correlations was statistically significant (p < 0.05). These results indicate that performance in the transfer direction depended on training performance. In other words, it appears incorrect to characterize the transfer performance as independent of the training. To verify this further, we randomly scrambled the pairing between the transfer and training slopes, such that each new pair of data came from two different trainees rather than from a single trainee. After each scrambling, we computed a new correlation for each panel in Figure 6. We repeated this procedure 10,000 times and obtained four distributions of correlation coefficients. We asked whether the mean of each distribution was reliably different from zero. In all four cases, the mean correlation coefficient was not significantly different from zero (t < 1). Because the correlations vanished once the within-trainee pairing was broken, this result indicated that a trainee's transfer performance depended on that trainee's own training performance.
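The scrambling procedure can be sketched as follows for a single panel. The sketch reads "each new pair of data were from two trainees" as requiring that no trainee be paired with their own data, and enforces this by redrawing shuffles until none keeps a trainee in place; that reading, the function names, the seed, and the slope values are all assumptions made for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def random_derangement(n, rng):
    """Random permutation of range(n) in which no index stays in place."""
    idx = list(range(n))
    while True:
        rng.shuffle(idx)
        if all(i != j for i, j in enumerate(idx)):
            return list(idx)

def scrambled_correlations(training, transfer, n_iter=10000, seed=0):
    """Correlations after re-pairing training and transfer slopes across trainees."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        perm = random_derangement(len(training), rng)
        results.append(pearson_r(training, [transfer[j] for j in perm]))
    return results

# Hypothetical slopes for the eight trainees in one panel.
training = [0.010, 0.020, 0.025, 0.030, 0.040, 0.045, 0.050, 0.060]
transfer = [0.012, 0.018, 0.030, 0.028, 0.035, 0.050, 0.047, 0.055]

observed_r = pearson_r(training, transfer)
scrambled_rs = scrambled_correlations(training, transfer, n_iter=2000)
scrambled_mean = sum(scrambled_rs) / len(scrambled_rs)
```

Comparing `observed_r` with the distribution in `scrambled_rs` shows how much of the correlation relies on keeping each trainee's own training and transfer slopes together.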
Also of interest, the correlation coefficient between the 8° transfer discrimination and 8° training was higher (0.78, the top-left panel) than that between the 4° transfer discrimination and 8° training (0.72, the top-right panel). The correlation coefficient between the 4° transfer discrimination and 4° training was also higher (0.91, the bottom-right panel) than that between the 8° transfer discrimination and 4° training (0.84, the bottom-left panel). To assess the reliability of these two differences, we performed a bootstrapping analysis for each of the two training groups (10,000 samples with replacement) (Efron & Tibshirani, 1993). In both cases, the difference was significant (p < 0.001, t > 25). This was supportive evidence that transfer was dependent on training.
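A bootstrap of the difference between two correlations within one training group might look like the sketch below. The text does not describe the resampling unit or the summary statistic in detail, so resampling whole trainees (keeping each trainee's training slope and both transfer slopes together) is an assumption, and the sketch summarizes the result with the proportion of positive differences rather than the t statistic reported above. All names and data values are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation, or None if either sequence has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def bootstrap_r_difference(training, transfer_a, transfer_b, n_boot=10000, seed=0):
    """Bootstrap distribution of r(training, transfer_a) - r(training, transfer_b)."""
    rng = random.Random(seed)
    n = len(training)
    diffs = []
    while len(diffs) < n_boot:
        # Resample trainees with replacement, keeping each trainee's data together.
        idx = [rng.randrange(n) for _ in range(n)]
        x = [training[i] for i in idx]
        r_a = pearson_r(x, [transfer_a[i] for i in idx])
        r_b = pearson_r(x, [transfer_b[i] for i in idx])
        if r_a is None or r_b is None:
            continue  # skip degenerate resamples with no variance
        diffs.append(r_a - r_b)
    return diffs

# Hypothetical slopes for the eight trainees of one training group.
training = [0.010, 0.020, 0.025, 0.030, 0.040, 0.045, 0.050, 0.060]
same_task = [0.012, 0.018, 0.030, 0.028, 0.035, 0.050, 0.047, 0.055]
other_task = [0.020, 0.012, 0.030, 0.018, 0.040, 0.030, 0.055, 0.042]

diffs = bootstrap_r_difference(training, same_task, other_task, n_boot=2000)
share_positive = sum(d > 0 for d in diffs) / len(diffs)
```

A percentile interval taken from the sorted `diffs` would be another common way to summarize the same distribution.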
Finally, we tested whether the transfer performance depended on task difficulty of training from the following perspective. If the transfer performance only depended on the similarity between the transfer and training tasks, regardless of task difficulty, then the 4° and 8° discrimination performance should be symmetric with each other. Namely, the absolute difference in performance between 4° and 8° discrimination for the 4° trainees should be the same as for the 8° trainees. We conducted such tests, which were different from the interaction effects above because the differences were all in absolute values. In Figure 3 left, t(14) = 3.04, p < 0.01. In Figure 3 right, t(14) = 2.68, p = 0.018. In Figure 4 bottom, t(13) = 1.84, p = 0.08. In Figure 5, t(14) = 3.03, p < 0.01. These results rejected the symmetry hypothesis and suggested that the pattern of the results was not completely due to the similarity between training and transfer tasks but that transfer performance depended on the difficulty of the training task.
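One possible reading of the symmetry test is sketched below: take, for each trainee, the absolute difference between the 4° and 8° transfer measures, and compare the two training groups with a pooled-variance t-test. Whether the absolute value was taken per trainee or on group means is not stated in the text, so the per-trainee version here is an assumption, as are the function names and the data.

```python
import math

def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    se = math.sqrt(ss / df * (1 / na + 1 / nb))
    return (mean_a - mean_b) / se, df

def symmetry_t(group_4, group_8):
    """Compare |8 deg - 4 deg| transfer differences between training groups.

    Each group is a list of (score_4deg, score_8deg) tuples, one per trainee.
    Under the symmetry hypothesis the two group means should be equal.
    """
    abs_4 = [abs(s8 - s4) for s4, s8 in group_4]
    abs_8 = [abs(s8 - s4) for s4, s8 in group_8]
    return pooled_t(abs_4, abs_8)

# Hypothetical transfer measures: (4 deg, 8 deg) per trainee.
group_4 = [(1.0, 2.0), (1.0, 3.0), (1.0, 4.0)]
group_8 = [(1.0, 4.0), (1.0, 5.0), (1.0, 6.0)]
t_value, df = symmetry_t(group_4, group_8)
```

With eight trainees per group, `df` would be 14, in line with the t(14) values reported above.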