These results indicate that familiar size affected both perceived size and distance, in line with the empiricists’ theoretical position (Ittelson & Ames, 1950; Kilpatrick & Ittelson, 1953). That is, Rubik's cubes were perceived as larger and farther than dice when their actual physical sizes and distances were matched; we term these effects FSEs on size perception and distance perception, respectively. Most notably, familiar size affected perception even when potent oculomotor cues were available at near distances (<1 meter).
Generally, familiar size had a greater effect on object perception under conditions of high uncertainty. For perceived size, the FSE was stronger during monocular pinhole viewing than during binocular viewing (see Figure 3c). In addition, for perceived size, the FSE was stronger for the two conditions with intermediate retinal angles (4.7 degrees) than for those with extreme retinal angles (1.3 degrees and 15.9 degrees; see Figure 3d). Thus, for our limited range of stimuli, extreme retinal angles enabled quite accurate perceptual estimates, whereas intermediate retinal angles evoked less veridical perception, even when combined with oculomotor cues. Interestingly, perceived size for the intermediate (4.7 degrees) stimuli was largely based on familiar size during monocular pinhole viewing, when depth cues were minimal, but largely based on physical size during binocular viewing, when rich depth cues were available. Notably, however, familiar size still affected size perception even in the binocular condition. Familiar size also affected perceived distance, although the magnitude of the FSE did not differ statistically between binocular and monocular pinhole viewing (see Figure 4c).
When size and distance are examined together, it becomes evident that perceived size and distance are yoked, albeit imperfectly, resulting in partial size constancy (e.g. when perceived size decreased, perceived distance also decreased, such that the combination of perceived size and perceived distance fell along a diagonal consistent with a given retinal angle, as shown in Figure 2).
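The geometric constraint behind such a diagonal can be written out explicitly. The following is only a sketch based on standard visual-angle geometry; the symbols $\theta$ (retinal angle), $S$ (size), $D$ (distance), and the hatted versions for perceived values are notation assumed here for illustration and are not taken from the analysis itself.

```latex
% Requires amsmath. Notation (theta, S, D, hats) is assumed for illustration.
\begin{align*}
  % Visual angle subtended by an object of size S at distance D
  \theta &= 2\arctan\!\left(\frac{S}{2D}\right)
  &&\Longrightarrow&
  S &= 2D\tan\!\left(\frac{\theta}{2}\right) \\
  % Percepts that remain consistent with the same retinal angle
  \hat{S} &= 2\hat{D}\tan\!\left(\frac{\theta}{2}\right)
  &&\Longrightarrow&
  \log\hat{S} &= \log\hat{D} + \log\!\left(2\tan\frac{\theta}{2}\right)
\end{align*}
```

Under this sketch, every pair of perceived size and perceived distance that is consistent with a fixed retinal angle lies on a straight line through the origin in linear coordinates, or on a line of unit slope in log-log coordinates. For $\theta = 4.7$ degrees, $2\tan(\theta/2) \approx 0.082$, so a consistent percept would have a size of roughly 8% of its distance. Imperfect yoking would appear as scatter around this line.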
The current results not only help to reconcile the longstanding debate about the FSE between the empiricists and the nativists, but also extend the conclusions that can be drawn from behavioral studies. A growing number of recent behavioral studies have found advantages in processing speed (namely, reaction times) for objects with familiar sizes that are congruent (versus incongruent) with their relative (Konkle & Oliva, 2012) or physical sizes (Fisher & Sperandio, 2018) in the real world. The current results show that familiar size also affects perceptual accuracy and that these effects are surprisingly potent, being present even when oculomotor cues to distance are available (and could be combined with information about distance to infer size).
The current experimental design provides several additional advantages over past research. First, the multifactorial design used here allowed us to assess the impact of multiple visual features that naturally co-occur in the real world (familiar size, physical size, retinal size, and physical distance) and to examine their effects on both size and distance perception. As a result, we show that although these percepts are correlated, the FSE on size perception was more dependent upon the available depth cues than the FSE on distance perception. This finding is consistent with other research showing that perceived distance, size, and shape are not always interpreted to be consistent with one another (Brenner & van Damme, 1999). Second, unlike past experiments that examined perception of a single familiar object (e.g. playing cards or chairs) at typical and atypical sizes, the stimulus set here consisted of two objects with carefully chosen identities. These objects have (1) strict canonical sizes, each eliciting a specific familiar size representation (i.e. drawings of objects from memory differed by less than 5 mm between participants) and (2) matched shapes (cubes) viewed under the same conditions (perspective and lighting). As such, the two object identities acted as each other's controls and yielded robust FSEs. Third, we used real objects rather than images, which may be important because real objects are expected to have congruent familiar sizes, whereas images are not. That is, we are often surprised and amused to see miniature or jumbo versions of real objects, whereas under- or oversized photographs are commonplace and unremarkable (although photographic objects are recognized faster when their presented size matches their familiar size; Fisher & Sperandio, 2018). Moreover, real-world objects always produce consistent cues to depth, in that retinal angle and oculomotor cues directly correspond to the object's physical size and distance. Thus, the current findings of strong FSEs under binocular viewing are even more surprising. Fourth, our inclusion of two combinations of size and distance that yield images subtending an equivalent retinal angle (4.7 degrees) eliminates potential confounds of retinal angle in size estimation. Even for these ambiguous retinal angles, we found FSEs on size and distance, which provides a counterpoint to a recent claim that FSEs are absent when confounds of retinal size are removed (Mischenko, Negishi, Gorbunova, & Sawada, 2020). Finally, although the use of only two object identities may limit generalizability, the current design remains an improvement over past studies that considered only one object (e.g. Epstein et al., 1961; Franklin & Erickson, 1969).