Abstract
Decades of research in the cognitive and neural sciences have shown that shape perception is crucial for object recognition. However, it remains unknown how object shape is represented to accomplish recognition. Here we used behavioral and neural techniques to test whether human object representations are well described by a model of shape based on an object’s skeleton when compared with other computational descriptors of visual similarity. Skeletal representations may be an ideal model for object recognition because they (1) provide a compact description of a shape’s structure by describing the relations between contours and component parts, and (2) provide a metric by which to compare the visual similarity between shapes. In a first experiment, we tested whether a model of skeletal similarity was predictive of human behavioral similarity judgments for novel objects. We found that the skeletal model explained the greatest amount of unique variance in participants’ judgments (33.13%) when compared with other models of visual similarity (Gabor-jet, GIST, HMAX, AlexNet), suggesting that skeletal descriptions uniquely contribute to object recognition. In a second experiment, we used fMRI and representational similarity analyses to examine whether object-selective regions (LO, pFs), or even early visual regions, code for an object’s skeleton. We found that skeletal similarity explained the greatest amount of unique variance in LO (19.32%) and V3 (18.74%) in the right hemisphere (rLO; rV3), but not in other regions. That a skeletal description was most predictive of rLO is consistent with its role in specifying object shape via the relations between component parts. Moreover, our findings may shed new light on the functional role of V3 in using skeletons to integrate contours into complete shapes. Together, our results highlight the importance of skeletal descriptors for human object recognition and the computation of shape in the visual system.