Abstract
Mid-level visual features, such as texture and contour, provide a computational link between low- and high-level visual representations. While the detailed nature of mid-level representations in the brain is not yet fully understood, past work has shown that a texture statistics model (P-S model; Portilla and Simoncelli, 2000) captures key aspects of neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex regions to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to human fMRI data from the Natural Scenes Dataset (Allen et al., 2021). We demonstrate that our texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas as well as higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex voxels suggests that texture statistics features are represented throughout visual cortex and may play a role in higher-order visual processing. Furthermore, we use variance partitioning analyses to identify which features make the largest unique contribution to predicting brain responses, and show that the contribution of higher-order texture features increases from early areas to higher areas on the ventral and lateral surfaces of the brain. We also show that patterns of sensitivity to individual texture model features can be used to identify the main components of the overall representational space within visual cortex. These results are a step forward in characterizing how mid-level feature representations emerge across the visual system, and how they may contribute to higher-order processes like object and scene recognition.
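To make the variance partitioning idea concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes two hypothetical feature sets (`A` standing in for lower-level texture features, `B` for higher-order ones), a single simulated voxel response in place of real fMRI data, and closed-form ridge regression as the encoding model. All function names and parameter values are illustrative assumptions. The unique contribution of a feature set is taken as the drop in held-out R² when that set is removed from the full model.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge weights (no intercept; data assumed roughly zero-mean)."""
    n_feat = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ y)

def r2(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def heldout_r2(X_tr, y_tr, X_te, y_te, lam=1.0):
    """Fit on the training split, score on the held-out split."""
    w = ridge_fit(X_tr, y_tr, lam)
    return r2(y_te, X_te @ w)

def variance_partition(A_tr, B_tr, y_tr, A_te, B_te, y_te, lam=1.0):
    """Split held-out R^2 into parts unique to A, unique to B, and shared."""
    full_tr, full_te = np.hstack([A_tr, B_tr]), np.hstack([A_te, B_te])
    r2_full = heldout_r2(full_tr, y_tr, full_te, y_te, lam)
    r2_a = heldout_r2(A_tr, y_tr, A_te, y_te, lam)
    r2_b = heldout_r2(B_tr, y_tr, B_te, y_te, lam)
    return {
        "full": r2_full,
        "unique_a": r2_full - r2_b,   # lost when A is removed
        "unique_b": r2_full - r2_a,   # lost when B is removed
        "shared": r2_a + r2_b - r2_full,
    }

# Simulated stand-in data (assumption): one voxel driven by both feature sets.
rng = np.random.default_rng(0)
n_train, n_test, n_feat = 800, 200, 10
A = rng.standard_normal((n_train + n_test, n_feat))
B = rng.standard_normal((n_train + n_test, n_feat))
y = (A @ rng.standard_normal(n_feat)
     + B @ rng.standard_normal(n_feat)
     + rng.standard_normal(n_train + n_test))

parts = variance_partition(
    A[:n_train], B[:n_train], y[:n_train],
    A[n_train:], B[n_train:], y[n_train:],
)
print(parts)
```

In this sketch, `unique_b` plays the role of the unique contribution of higher-order texture features for one voxel. Repeating the computation per voxel and comparing values across regions would be one way to examine how that contribution changes across visual areas.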