Abstract
Object perception recruits a cortical network that encodes both visual and non-visual properties. Of particular interest is the degree to which perceptual representations of objects are tied to semantic knowledge. Prior studies examining the neural representation of conceptual knowledge of objects have not always dissociated perceptual and non-perceptual components, instead attributing "neurosemantics" to both visual and non-visual cortical areas (Just et al., 2010; Mitchell et al., 2008). Here we used fMRI to disentangle these two components by contrasting voxel population responses for 60 object pictures with voxel population responses for 60 written nouns naming the same objects. BOLD responses in the ventral temporal cortex were analyzed using multi-voxel pattern analysis (MVPA) and a searchlight procedure (e.g., Kriegeskorte et al., 2007). These analyses yield, for both the picture and the word conditions, a matrix of neural dissimilarities between stimulus pairs drawn from all 60 items. This comparison allowed us to directly assess the degree to which non-visual inputs specifying visual objects recruit visual representations. Critically, we were also interested in the featural dimensions underlying these neural codes for objects. To this end, we developed multiple models of visual object similarity, ranging from pixel-wise comparisons to distances in feature spaces used in computer vision, and used them to predict the structure of the neural responses in both the picture and the word conditions. Although we reliably identified distinct neural activation patterns for the two conditions, none of our implemented models of visual object similarity provided a close match to BOLD activation patterns beyond occipital regions. Instead, we obtained better data-driven insight into the features underlying the neural dissimilarity matrices through two approaches: learning linear filter models of individual voxels, and clustering the images based on the multi-voxel dissimilarity matrices themselves.
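The comparison described above follows the general logic of representational similarity analysis: build a neural dissimilarity matrix per condition, build a model dissimilarity matrix from the stimuli, and correlate the two. The sketch below illustrates that logic only; the array names, shapes, correlation-distance metric, and pixel-wise Euclidean model are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of the dissimilarity-matrix comparison described above.
# All inputs here are hypothetical placeholders (random data), not real fMRI data.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

n_items = 60
rng = np.random.default_rng(0)

# Hypothetical inputs:
#   picture_patterns, word_patterns: (60, n_voxels) multi-voxel response patterns
#     from a ventral temporal ROI or searchlight sphere, one row per object.
#   images: (60, n_pixels) vectorized grayscale stimulus images.
picture_patterns = rng.standard_normal((n_items, 200))
word_patterns = rng.standard_normal((n_items, 200))
images = rng.standard_normal((n_items, 64 * 64))

# Neural dissimilarity matrices: 1 - Pearson correlation between item patterns.
rdm_pictures = squareform(pdist(picture_patterns, metric="correlation"))
rdm_words = squareform(pdist(word_patterns, metric="correlation"))

# One simple model of visual object similarity: pixel-wise Euclidean distance.
rdm_pixel_model = squareform(pdist(images, metric="euclidean"))

# Compare model and neural RDMs over the upper-triangular entries (Spearman's rho),
# separately for the picture and word conditions.
triu = np.triu_indices(n_items, k=1)
rho_pic, _ = spearmanr(rdm_pixel_model[triu], rdm_pictures[triu])
rho_word, _ = spearmanr(rdm_pixel_model[triu], rdm_words[triu])
print(f"pixel model vs. picture RDM: rho = {rho_pic:.2f}")
print(f"pixel model vs. word RDM:    rho = {rho_word:.2f}")
```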
Funding provided by the Perceptual Expertise Network (#15573-S6), a collaborative award from the James S. McDonnell Foundation; by the Temporal Dynamics of Learning Center at UCSD (NSF Science of Learning Center SBE-0542013); by the PA Department of Health, Commonwealth Universal Research Enhancement (C.U.R.E.) program, Formula Award Number 4100050890, 2010; by an NIH EUREKA Award (#1R01MH084195-01) to MJT; by NSF IGERT; and by the R.K. Mellon Foundation.