Abstract
Current predictive models of natural scene encoding by retinal ganglion cells perform poorly, especially when tested on stimuli they were not trained on. This failure likely originates, at least in part, from a mismatch between the functional architecture of these models (e.g. the locations of linear and nonlinear components) and that of real retinal circuits. We take an empirical approach to simplifying the structure of natural scenes. We find that reducing natural movies to 16 linearly integrated regions captures ~80% of the structure of parasol RGC spike responses. We test our understanding by using these simplified stimuli to create high-dimensional metamers that recapitulate the spike responses to full-field naturalistic movies. Finally, we identify the retinal computations that convert natural images represented in this 16-dimensional space into 1-dimensional spike outputs. This work provides a description of spatial integration that generalizes well across images and corresponds closely to known anatomical and functional features of retinal circuits.
Funding: Supported by EY028542 to FR.