Abstract
Deep learning has entered the domain of visual perception through artificial models trained on image classification tasks, which interestingly exhibit some degree of similarity with human mechanisms. Currently, encoding/decoding of fMRI activations to features of interest usually relies on individual linear regressions per voxel/vertex. Modelers mitigate the associated limitations with carefully hand-crafted linearizing features; however, the multidimensionality and intrinsic non-linearities of artificial neural networks could further improve domain adaptation, and even capture interactions between brain areas. One explanation for favoring simple models is the lack of interpretability of deep learning, i.e., the inability to compare informational content between different brain areas for one feature, and across different features. We overcome this issue by introducing a new method called Integrated Gradient Correlation (IGC), which extends the original Integrated Gradients (IG) attribution method. We also demonstrate the relevance of our approach by investigating the representation of image statistics using the NSD dataset: a public fMRI dataset consisting of 70k BOLD activations acquired during a long-term image recognition task. We particularly focus on surface-based data (fsaverage), restricted to visual cortex ROIs (e.g., V1-V4, bodies, places). The statistics under scrutiny encompass the first three moments of image luminance distributions, usually associated with human texture perception (i.e., mean luminance, contrast, and skewness), as well as a higher-level statistic related to spatial luminance distributions (i.e., the 1/f slope). We then evaluate several decoding models: traditional individual linear regressors, multidimensional linear models trained per ROI and on the whole visual cortex, and finally different deep architectures (sequences of fully connected and/or graph convolutional layers).
IGC results show that deep models provide significantly more accurate decoding predictions, as well as more informative and selective brain activation patterns, consistent with the literature. Consequently, our method could find applications beyond visual neuroscience and benefit any scientific inquiry using deep models.