Abstract
Efficient coding theory is a normative principle that has been used to explain the structure of biological sensory systems. Particular instantiations of the theory have been shown to be consistent with various aspects of early visual processing, including receptive field (RF) selectivity and rectifying nonlinearities. Here, we examine the coding efficiency of natural images in a cascaded nonlinear-linear-nonlinear model. The luminance value of each pixel first passes through an instantaneous nonlinearity with additive input noise, and this transduced signal is then spatially integrated by a population of linear-nonlinear neurons with additive output noise. We examine a discrete set of noise levels and transduction nonlinearities. For each combination, the linear RFs and parametrized nonlinearities of the entire population are optimized to maximize the mutual information between the pixels of a set of natural images and the model responses, under a metabolic cost constraint. Building on previous work (Karklin & Simoncelli, 2011), we find that the choice of noise levels and transduction nonlinearity has profound effects on the qualitative properties of the optimal population. With small output noise, all optimal filters are oriented band-pass RFs, comparable to V1 simple cells and similar to those found using Independent Component Analysis (Bell & Sejnowski, 1997) or Sparse Coding (Olshausen & Field, 1996). But with increasing output noise, a growing proportion of neurons adopts non-oriented low-pass RFs subdivided into ON and OFF subpopulations, similar to retinal ganglion cells. This transition is striking and highly reliable under a saturating gain-control transduction nonlinearity (e.g., a Naka-Rushton function), but only partial under linear or logarithmic transduction nonlinearities. It remains to be seen whether photoreceptor transduction nonlinearities and noise levels, along with ganglion cell noise levels, fall within the regime that theoretically predicts the emergence of ON/OFF RFs. In addition, we are currently extending this framework to spatio-temporal visual inputs.
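For concreteness, the cascade described above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the implementation used in the study: the Naka-Rushton parameters, the Gaussian noise model, the softplus output nonlinearity, and all names (nln_responses, sigma_in, sigma_out) are assumptions made here for illustration; in the actual work, the RFs and parametrized output nonlinearities are learned by maximizing mutual information under the metabolic cost constraint, an optimization omitted from this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def naka_rushton(x, gain=1.0, x50=0.5, n=2.0):
    """Saturating gain-control transduction: r(x) = gain * x^n / (x^n + x50^n).
    Parameter values here are illustrative, not those of the study."""
    xn = np.maximum(x, 0.0) ** n
    return gain * xn / (xn + x50 ** n)

def nln_responses(pixels, filters, sigma_in=0.05, sigma_out=0.1):
    """Forward pass of the nonlinear-linear-nonlinear cascade.

    pixels  : (batch, n_pixels) luminance values in [0, 1]
    filters : (n_pixels, n_neurons) linear RFs, one column per neuron
    """
    # 1. Instantaneous transduction nonlinearity with additive input noise.
    transduced = naka_rushton(pixels + sigma_in * rng.standard_normal(pixels.shape))
    # 2. Spatial integration of the transduced signal by the linear RFs.
    drive = transduced @ filters
    # 3. Rectifying output nonlinearity (a numerically stable softplus stands
    #    in for the parametrized nonlinearity), plus additive output noise.
    rates = np.logaddexp(0.0, drive)
    return rates + sigma_out * rng.standard_normal(rates.shape)

# Illustrative usage on random 16x16 image patches with 100 model neurons.
patches = rng.random((32, 256))
filters = rng.standard_normal((256, 100)) / 16.0
responses = nln_responses(patches, filters)
```

In this sketch, raising sigma_out plays the role of the output noise manipulation described above; the abstract's result is that optimizing the filters under increasing output noise shifts the population from oriented band-pass RFs toward non-oriented low-pass ON and OFF RFs.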
Meeting abstract presented at VSS 2018