Kendrick Kay, Thomas Naselaris, Jack Gallant; Estimation of voxel receptive fields in human visual cortex using natural images. Journal of Vision 2007;7(9):79. doi: https://doi.org/10.1167/7.9.79.
A central goal of sensory neuroscience is to discover what stimulus features are represented by the visual system. Previous fMRI studies of this issue have been limited by signal averaging across entire visual areas (lack of precision) and by evaluating only a few stimulus conditions (lack of generality). We circumvented these problems by adopting a system identification approach. In brief, system identification provides a functional model (the ‘receptive field’) that describes how each voxel transforms visual stimuli into BOLD signals.
We recorded BOLD signals from human visual cortex (4 T, gradient-echo EPI, 2 × 2 × 2.5 mm, 1 Hz) during passive viewing of full-field, grayscale natural photographs (∼2000 distinct images). We then estimated the receptive field (RF) of each voxel in terms of the Berkeley Wavelet Transform (BWT). The BWT expresses each RF as a combination of wavelets tuned along several dimensions: position, orientation, spatial frequency, and phase. Each RF consists of a collection of excitatory wavelets (representing features that increase the BOLD signal) and suppressive wavelets (representing features that decrease the BOLD signal).
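The estimation step described above can be sketched as a regularized linear regression from wavelet feature amplitudes to voxel responses. The sketch below uses simulated data and a plain ridge estimator; the dimensions, noise level, thresholds, and the use of ridge regression are illustrative assumptions, not the authors' actual fitting procedure (real features would come from filtering each photo with the BWT wavelet bank).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 2000 photos, 256 wavelet features per photo.
n_photos, n_wavelets = 2000, 256

# Stand-in for BWT feature amplitudes of each stimulus; in the real
# analysis these would be the responses of the wavelet bank to each photo.
X = rng.standard_normal((n_photos, n_wavelets))

# Simulate a voxel whose RF has a few excitatory (positive) and
# suppressive (negative) wavelet weights, plus measurement noise.
true_w = np.zeros(n_wavelets)
true_w[:10] = 1.0     # excitatory wavelets
true_w[10:15] = -0.5  # suppressive wavelets
y = X @ true_w + 0.1 * rng.standard_normal(n_photos)

# Ridge regression: w = (X'X + lam*I)^(-1) X'y
lam = 1.0
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_wavelets), X.T @ y)

# Label wavelets by the sign of their estimated weight (threshold is
# an arbitrary illustrative cutoff).
excitatory = np.where(w_hat > 0.25)[0]
suppressive = np.where(w_hat < -0.25)[0]
```

With enough stimuli relative to the number of wavelets, the estimated weights recover which features drive the simulated voxel's response up or down.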
The RF of a typical voxel from area V1 consists of many excitatory wavelets that are confined to a small region of the visual field, and that span a broad range of orientations, spatial frequencies, and phases. This diversity is expected since each voxel pools the activities of many different neurons. The quality of the RFs was assessed by quantifying how well each RF predicted responses to novel photos. Predictive power is remarkably high in area V1 (median correlation between observed and predicted responses ∼0.6). However, predictive power declines markedly in areas V2, V3, and V4. This decline likely reflects the fact that higher visual areas represent more complex features that are not described efficiently by the BWT.
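The evaluation described above, correlating observed and predicted responses to held-out photos, can be sketched as follows. All quantities are simulated; the split sizes, noise level, and ordinary-least-squares fit are illustrative assumptions chosen so the resulting correlation lands near the V1 figure quoted in the abstract.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_test, d = 1800, 200, 64

# Simulated wavelet features and a ground-truth RF weight vector.
X = rng.standard_normal((n_train + n_test, d))
w = rng.standard_normal(d)

# Simulated voxel responses with substantial measurement noise
# (noise scale chosen to yield a correlation near 0.6).
y = X @ w + 1.3 * np.linalg.norm(w) * rng.standard_normal(n_train + n_test)

X_tr, X_te = X[:n_train], X[n_train:]
y_tr, y_te = y[:n_train], y[n_train:]

# Fit RF weights on the estimation set (ordinary least squares here).
w_hat, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)

# Predictive power: correlation between observed and predicted
# responses on the held-out ("novel") photos.
y_pred = X_te @ w_hat
r = np.corrcoef(y_te, y_pred)[0, 1]
```

The held-out correlation is bounded by the noise in the measured responses, which is one reason even a good model tops out well below r = 1.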