Abstract
Typical models of early spatial vision are based on a common, generic structure: First, the input image is processed by multiple spatial-frequency- and orientation-selective filters. Thereafter, the output of each filter is non-linearly transformed, either by a non-linear transducer function or, more recently, by a divisive contrast-gain control mechanism. In a third stage, noise is injected and, finally, the results are combined to form a decision. Ideally, this decision is consistent with experimental data (Legge and Foley, 1980, Journal of the Optical Society of America, 70(12), 1458-1471; Watson and Ahumada, 2005, Journal of Vision, 5, 717-740).
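As a rough, schematic sketch of this four-stage structure (not the implementation of any of the cited models; all names and parameter values below are placeholders), the pipeline can be written as:

```python
import numpy as np

def model_response(image, filter_bank, exponent=2.0, noise_sd=1.0, beta=4.0,
                   rng=None):
    """Schematic four-stage early-vision pipeline: filter, transduce, add noise, pool."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Stage 1: linear filtering with spatial-frequency- and orientation-selective kernels.
    channels = np.array([
        np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(k, s=image.shape)).real
        for k in filter_bank
    ])
    # Stage 2: pointwise non-linear transduction (a bare power law stands in here
    # for a transducer function or a divisive contrast-gain control mechanism).
    transduced = np.abs(channels) ** exponent
    # Stage 3: additive internal noise.
    noisy = transduced + rng.normal(0.0, noise_sd, size=transduced.shape)
    # Stage 4: decision variable, e.g. Minkowski pooling over channels and pixels.
    return (np.abs(noisy) ** beta).sum() ** (1.0 / beta)
```

A divisive gain-control variant would replace Stage 2 by dividing each channel's exponentiated response by a pooled sum over neighbouring channels plus a saturation constant.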
Often a Gabor filter bank with fixed frequency and orientation spacing forms the first processing stage. Gabor filters, like Gaussian-derivative filters with suitably chosen parameters, resemble the receptive fields of simple cells in visual cortex. However, model predictions obtained with the two filter banks can deviate substantially from one another (Hesse and Georgeson, 2005, Vision Research, 45, 507-525). Thus, the choice of filter bank potentially influences the fitted parameters of the non-linear transduction/gain-control stage as well as of the decision stage. This may be problematic: in the transduction stage, for example, the exponent of a Naka-Rushton-type transducer function is interpreted as reflecting different mechanisms, e.g. a mechanism based on stimulus energy if it is close to two.
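For reference, a Naka-Rushton-type transducer has the generic form (symbols chosen here for illustration, not taken from any particular fit)

$$R(c) = R_{\max}\,\frac{c^{n}}{c^{n} + c_{50}^{n}},$$

so that for contrasts well below the semi-saturation constant $c_{50}$ the response grows approximately as $c^{n}$; with $n \approx 2$ it is proportional to stimulus energy, which is why the fitted exponent is given a mechanistic reading.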
Here we systematically examine the influence of arbitrary choices regarding filter bank properties (filter form, number, and additional parameters) on the psychophysically interesting parameters at subsequent stages of early spatial vision models. We reimplemented different models within a Python modeling framework and report the modeling results using the ModelFest data (Carney et al., 1999, DOI:10.1117/12.348473).
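As a minimal illustration of what varying the first stage means in practice (assumed helper names, not the framework's actual API), the two candidate filter forms could be constructed as follows, with everything downstream held fixed:

```python
import numpy as np

def gabor_kernel(size, frequency, orientation, sigma):
    # Even-symmetric Gabor: isotropic Gaussian envelope times an oriented cosine
    # carrier; frequency in cycles per pixel, size assumed odd.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(orientation) + y * np.sin(orientation)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * frequency * xr)

def gaussian_derivative_kernel(size, order, orientation, sigma):
    # Oriented first- or second-order derivative of an isotropic Gaussian,
    # a common alternative front end with a similar receptive-field shape.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(orientation) + y * np.sin(orientation)
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    if order == 1:
        return -xr / sigma ** 2 * g
    return (xr ** 2 / sigma ** 4 - 1.0 / sigma ** 2) * g
```

Swapping between such banks, and varying their number and bandwidth parameters, while refitting the transduction, noise, and decision stages is, in spirit, the kind of comparison examined here.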