Abstract
Encoding models based on neural networks make accurate predictions of brain activity. However, the relative opacity of these models has limited the capacity of this approach to reveal human-level experimental insights into cortical organization and function. Using a technique we call Voxel-Weighted Activation Maximization (VWAM), we show that neural networks can be used to visualize selectivity, and to modulate cortical responses, across human visual cortex. We extracted the feature values from all layers of an image-classification neural network for a set of diverse naturalistic stimuli. We then used regularized linear regression to map these features onto corresponding BOLD fMRI responses. We leveraged this mapping to iteratively update a gray starting image, producing stimuli predicted to modulate responses in single or multiple cortical regions. The resulting images qualitatively comported with known selectivity: for example, images generated to maximize OFA responses tended to be face-like, and images generated for PPA were scene-like. To test whether these images elicited differential responses as intended, we generated 100 images, each designed to maximize responses in one of five ROIs (V3, LO, EBA, FFA, or RSC), and measured the responses they elicited. Stimuli generated for a given region consistently drove responses in that region more than images targeting other regions. As a more stringent test of our model, we selected naturalistic images that produced strong responses in a pair of functionally similar regions and then modified these images to maximize the difference between the two regions' responses. In many (though not all) cases we did differentially modulate responses in functionally related regions (OFA versus FFA, OPA versus PPA), suggesting that our approach is suitable for characterizing fine-grained distinctions across visual cortex.
Overall, these findings not only support the strength of our approach as an exploratory tool for visualizing cortical selectivity, but also demonstrate its validity as an experimental technique for selectively modulating diverse visual areas.
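
The three steps summarized in this abstract (extract network features, fit a regularized linear map to responses, then iteratively update a gray image) can be illustrated with a toy sketch. This is not the authors' implementation: the one-layer `tanh` "network", the simulated responses, the ridge penalty `alpha`, the ROI weighting, and the step size are all hypothetical stand-ins chosen so the example runs with NumPy alone. A real pipeline would presumably use a pretrained image-classification network, measured BOLD data, and automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; real images, feature spaces, and voxel counts are far larger.
N_PIXELS, N_FEATURES, N_VOXELS, N_IMAGES = 64, 32, 10, 500

# Stand-in for a pretrained network: a single fixed random tanh layer (assumption).
W_net = rng.normal(scale=1.0 / np.sqrt(N_PIXELS), size=(N_PIXELS, N_FEATURES))


def features(images):
    """Map images of shape (n, N_PIXELS) to features of shape (n, N_FEATURES)."""
    return np.tanh(images @ W_net)


def fit_ridge(X, Y, alpha):
    """Closed-form ridge regression: B = (X^T X + alpha * I)^(-1) X^T Y."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


# Simulated stimuli and simulated "voxel responses" (not real fMRI data).
train_images = rng.uniform(0.0, 1.0, size=(N_IMAGES, N_PIXELS))
true_B = rng.normal(size=(N_FEATURES, N_VOXELS))
noise = 0.1 * rng.normal(size=(N_IMAGES, N_VOXELS))
responses = features(train_images) @ true_B + noise

# Encoding model: features -> voxel responses.
B = fit_ridge(features(train_images), responses, alpha=1.0)


def predicted_roi_response(image, voxel_weights):
    """Voxel-weighted predicted response of the encoding model to one image."""
    return float(features(image[None, :])[0] @ B @ voxel_weights)


def maximize_roi_response(voxel_weights, n_steps=200, step_size=0.05):
    """Gradient ascent on pixels, starting from a uniform gray image."""
    image = np.full(N_PIXELS, 0.5)
    w = B @ voxel_weights  # per-feature weight implied by the targeted voxels
    for _ in range(n_steps):
        f = np.tanh(image @ W_net)
        # Analytic gradient of sum_j w_j * tanh((image @ W_net)_j) w.r.t. image.
        grad = W_net @ ((1.0 - f ** 2) * w)
        image = np.clip(image + step_size * grad, 0.0, 1.0)  # keep pixels valid
    return image


# Hypothetical ROI: average over the first three voxels.
roi_weights = np.zeros(N_VOXELS)
roi_weights[:3] = 1.0 / 3.0

gray = np.full(N_PIXELS, 0.5)
optimized = maximize_roi_response(roi_weights)
print("gray:", predicted_roi_response(gray, roi_weights))
print("optimized:", predicted_roi_response(optimized, roi_weights))
```

Targeting the difference between two regions, as in the paired-region test described above, would correspond in this sketch to giving one region's voxels positive entries in `roi_weights` and the other's negative entries, and starting from a natural image rather than gray.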