Abstract
Accumulating evidence supports the idea that object space in visual cortex reflects behaviorally relevant dimensions beyond those necessary for object recognition; for instance, the representational overlap between hands and tools cannot be explained by either visual (e.g., shape) or semantic (e.g., animacy) accounts of object recognition. Instead, this overlap can be explained by action-related properties common to both categories. Convolutional neural networks (CNNs) have been proposed as models of visual cortex, but most of these networks are trained on object recognition and may therefore lack the ability to model representations that go beyond object identification. Here, we characterize the action-related representational space in occipitotemporal cortex (OTC) and test the degree to which current state-of-the-art models of human vision capture this organization. We analyzed an fMRI dataset in which participants were presented with images varying in the degree of action-related properties, depicting either body parts (hands, whole bodies, and faces) or objects (tools, manipulable objects, and non-manipulable objects). In left lateral OTC, we found a topographically organized action gradient: from dorsal-posterior to ventral-anterior, we observed selective and partly overlapping activations for bodies, hands, tools, and manipulable objects, with tools showing a higher degree of overlap with hands than the other stimulus categories. To further characterize this action gradient, we computed two multivariate indices, a grasping index and an action index, that capture specific action-related components of the OTC representational space. The activation gradient was reflected at the level of multivariate representations, and these effects could not be replicated by CNNs. Altogether, our results show that object space in visual cortex reflects behaviorally relevant dimensions and highlight the need for more ecologically valid tasks to improve the correspondence between the primate brain and artificial neural models.