Abstract
Neural representations in visually responsive brain regions are predicted well by features within deep hierarchical convolutional neural networks (HCNNs) trained for visual recognition (Yamins et al. 2014, Khaligh-Razavi & Kriegeskorte 2014, Cichy et al. 2016). Additionally, salience maps derived from HCNN features produce state-of-the-art predictions of human eye movements in natural images (Kümmerer et al. 2015, Kümmerer et al. 2016). We therefore explored whether HCNN models might support the representation of spatial attention in the human brain. We computed salience maps from HCNN features reconstructed from functional magnetic resonance imaging (fMRI) activity and then tested whether these fMRI-decoded salience maps predicted eye movements. We used fMRI to measure brain activity evoked by natural scenes while participants (N = 5) completed an old/new continuous recognition task; in a separate session, we measured eye movements for the same natural scenes. Partial least squares regression (PLSR) was then used to reconstruct, from BOLD activity, features derived from five layers of the VGG-19 network trained for scene recognition (Simonyan & Zisserman 2015, Zhou et al. 2016). Spatial activity in the reconstructed VGG features was then averaged across channels (filters) within each layer and across all layers to compute an fMRI-decoded salience map for each image. Group-average fMRI-decoded salience maps from regions in occipital, temporal, and parietal cortex predicted eye movements from an independent group of observers (O'Connell & Walther 2015; p < 0.001). Within-participant prediction of eye movements was significant for fMRI-decoded salience maps from V2 (p < 0.05). These results show that the representation of spatial attention priority in the brain may be supported by features similar to those found in HCNN models. Our findings also suggest a new method for evaluating the biological plausibility of computational salience models.
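For concreteness, a minimal sketch of how such a decoding pipeline could be implemented is given below (Python, using scikit-learn and SciPy). All function names, array shapes, the number of PLS components, the output resolution, and the evaluation metric are illustrative assumptions; the abstract does not specify these implementation details.

    import numpy as np
    from scipy.ndimage import zoom
    from sklearn.cross_decomposition import PLSRegression

    def resize_maps(maps, size):
        """Bilinearly resize a stack of (n, h, w) maps to (n, size, size)."""
        n, h, w = maps.shape
        return zoom(maps, (1, size / h, size / w), order=1)

    def fit_layer_decoder(bold, layer_feats, n_components=25):
        """Fit a PLSR model mapping BOLD patterns (n_images, n_voxels)
        to flattened VGG feature maps (n_images, channels * h * w).
        The component count is an assumed, not reported, value."""
        n, c, h, w = layer_feats.shape
        pls = PLSRegression(n_components=n_components)
        pls.fit(bold, layer_feats.reshape(n, -1))
        return pls, (c, h, w)

    def decoded_salience(bold, decoders, out_size=32):
        """Reconstruct features for each layer from held-out BOLD data,
        average over channels within each layer, then average the
        per-layer maps into one fMRI-decoded salience map per image."""
        layer_maps = []
        for pls, (c, h, w) in decoders:
            recon = pls.predict(bold).reshape(-1, c, h, w)
            chan_avg = recon.mean(axis=1)           # average across channels
            layer_maps.append(resize_maps(chan_avg, out_size))
        return np.mean(layer_maps, axis=0)          # average across layers

    def nss(sal_map, fix_xy):
        """Normalized scanpath saliency: mean z-scored salience at fixated
        pixels. One common way to score eye-movement prediction; the
        abstract does not name the metric actually used."""
        z = (sal_map - sal_map.mean()) / sal_map.std()
        return z[fix_xy[:, 1], fix_xy[:, 0]].mean()

In this sketch, one PLSR decoder is fit per VGG-19 layer on training images, and salience maps for held-out images are obtained by applying the decoders to new BOLD patterns before the channel- and layer-averaging steps described above.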
Meeting abstract presented at VSS 2017