Research Article  |   April 2009
Optimum spatiotemporal receptive fields for vision in dim light
Journal of Vision April 2009, Vol.9, 18. doi:https://doi.org/10.1167/9.4.18
Andreas Klaus, Eric J. Warrant; Optimum spatiotemporal receptive fields for vision in dim light. Journal of Vision 2009;9(4):18. https://doi.org/10.1167/9.4.18.
Abstract

Many nocturnal insects depend on vision for daily life and have evolved different strategies to improve their visual capabilities in dim light. Neural summation of visual signals is one strategy to improve visual performance, and this is likely to be especially important for insects with apposition compound eyes. Here we develop a model to determine the optimum spatiotemporal sampling of natural scenes at gradually decreasing light levels. Image anisotropy has a strong influence on the receptive field properties predicted to be optimal at low light intensities. Spatial summation between visual channels is predicted to extend more strongly in the direction with higher correlations between the input signals. Increased spatiotemporal summation increases signal-to-noise ratio at low frequencies but sacrifices signal-to-noise ratio at higher frequencies. These results, while obtained from a model of the insect visual system, are likely to apply to visual systems in general.

Introduction
Vision is an important sense for most species within the animal kingdom and even in the dimmest habitats animals have functional eyes (Warrant, 2004). A nocturnal lifestyle has several advantages, such as reduced predation pressure and access to underexploited sources of food (Roubik, 1989; Warrant, Porombka, & Kirchner, 1996; Wcislo et al., 2004). However, nocturnal life also has a crucial disadvantage: the scarcity of light. At night, light intensities can be up to 9 orders of magnitude lower than during the day at the same location, and this scarcity of light can make vision unreliable: visual signals in dim light are contaminated by visual "noise". Part of this noise arises from the stochastic nature of photon arrival and absorption: each sample of absorbed photons (or signal) has a certain degree of uncertainty (or noise) associated with it. The relative magnitude of this uncertainty is greater at lower rates of photon absorption, and these quantum fluctuations set an upper limit to the visual signal-to-noise ratio (de Vries, 1943; Rose, 1942). The dimmer the light level, the lower the signal-to-noise ratio (Land, 1981; Snyder, 1979), and a visual system straining to see at night should therefore capture as much light as possible. This can be achieved optically by enlarging the receptive fields of the photoreceptors and/or by widening the pupil of the eye (Kirschfeld, 1974; Land, 1981; Nilsson, 1989; Warrant, 2004). A general strategy to increase light capture is to simply increase eye size (Jander & Jander, 2002; Land & Nilsson, 2002). This, however, is constrained by the metabolic costs of neural information processing (Laughlin, de Ruyter van Steveninck, & Anderson, 1998; Niven, Andersson, & Laughlin, 2007), and by the locomotory costs introduced by the additional weight of the eyes (especially for flying insects). 
In addition to optical improvements, visual sensitivity can also be increased by neural adaptations. Adjacent points in natural images are correlated, and this correlation falls exponentially as the distance between the points increases (Srinivasan, Laughlin, & Dubs, 1982), a fact that has been further studied by analyzing the power spectra of natural scenes (van der Schaaf & van Hateren, 1996). The spectral power in terrestrial images falls with increasing spatial frequency fs as 1/fs2 (Dong & Atick, 1995; Field, 1987; Simoncelli & Olshausen, 2001; van der Schaaf & van Hateren, 1996). Due to correlation within natural images, neighboring visual channels share some information, and thus a proportion of the signal generated in one channel can be predicted from the signals generated in neighboring channels (Srinivasan et al., 1982; Tsukamoto, Smith, & Sterling, 1990; van Hateren, 1992c). In bright light this “predictive coding” leads to visual sampling involving lateral inhibition between adjacent channels (band-pass filtering), whereas in dim light sampling involves spatial summation of signals from groups of neighboring channels (low-pass filtering). As shown theoretically, a strategy of neural summation of light in space and time can enable an eye to improve visual reliability in dim light (Tsukamoto et al., 1990; van Hateren, 1992c; Warrant, 1999). Neural summation will, however, cause a decrease in spatial and temporal resolution (Warrant, 1999). 
Despite the difficulties of vision in dim light, many nocturnal animals—including insects—rely on vision for the tasks of daily life (Warrant, 2007, 2008a). Insects have compound eyes, an eye type constructed from an assembly of repeated optical units, the "ommatidia" (Figure 1A). Nocturnal insects, including most moths and many beetles, typically have refracting superposition compound eyes, a design that allows single photoreceptors in the retina to receive focussed light from hundreds (and in some extreme cases, thousands) of corneal facet lenses (Figure 1C). This design represents a vast improvement in sensitivity over the apposition compound eye (Figure 1B), a design in which single photoreceptors receive light only from the single corneal facet lens residing in the same ommatidium. Not surprisingly, apposition eyes are typical of diurnal insects active in bright sunlight, such as bees, flies, butterflies, and dragonflies. Interestingly, despite their disadvantages for vision in dim light, some nocturnal insects have apposition compound eyes. In tropical rainforests, for instance, the competition for resources and the threat of predation are fierce, and many species of bees and wasps—all with apposition eyes—have evolved a nocturnal lifestyle, an evolutionary transition that is probably relatively recent (Wcislo et al., 2004). Animals with apposition eyes have evolved a number of adaptations that improve sensitivity in dim light (Barlow, Kaplan, Renninger, & Saito, 1987; Warrant, 2007, 2008b) and that allow visually guided behavior at the lowest light intensities (Barlow & Powers, 2003; Warrant et al., 2004). For example, the apposition eyes of the nocturnal sweat bee, Megalopta genalis, possess larger corneal facet lenses and wider rhabdoms than those of closely related diurnal bees, and thus enjoy an increased optical sensitivity (Greiner, Ribi, & Warrant, 2004). 
Further improvements in visual sensitivity are likely to come from enhanced visual gain in the photoreceptors (Barlow, Bolanowski, & Brachman, 1977; Barlow et al., 1987; Frederiksen, Wcislo, & Warrant, 2008) and from a strategy of visual summation in space and time (Theobald, Greiner, Wcislo, & Warrant, 2006; Warrant et al., 2004). Morphological, electrophysiological, and theoretical evidence (Warrant, 2008b) suggests that widely arborizing second-order cells in the lamina (the LMC cells) might mediate a spatial summation of visual signals generated in the retina (Greiner, Ribi, & Warrant, 2005; Greiner, Ribi, Wcislo, & Warrant, 2004). 
Figure 1
 
Compound eyes. (A) A schematic longitudinal section (and an inset of a transverse section) through a generalized Hymenopteran ommatidium (from an apposition eye, see B), showing the corneal lens (c), the crystalline cone (cc), the primary pigment cells (pc), the secondary pigment cells (sc), the rhabdom (rh), the retinula cells (rc), the basal pigment cells (bp), and the basement membrane (bm). The upper half of the ommatidium shows screening pigment granules in the dark-adapted state, while the lower half shows them in the light-adapted state. Redrawn from Stavenga and Kuiper (1977), with permission. (B) A focal apposition compound eye. Light reaches the photoreceptors exclusively from the small corneal lens (co) located directly above, within the same ommatidium. This eye design is typical of day-active insects. cc = crystalline cone. (C) A refracting superposition compound eye. A large number of corneal facets, and bullet-shaped crystalline cones of circular cross-section (inset), collect and focus light across the clear zone of the eye (c.z.) toward single photoreceptors in the retina. Several hundreds, or even thousands, of facets service a single photoreceptor. Not surprisingly, many nocturnal and deep-sea animals have refracting superposition eyes and benefit from the significant improvement in sensitivity. (B) and (C) Courtesy of Nilsson (1989).
Thus, when a diurnal insect with apposition eyes—like the ancestor of the nocturnal bee Megalopta—began to evolve a nocturnal lifestyle, the optical and neural structures of the eye altered according to natural selection (Frederiksen & Warrant, 2008). As the diurnal ancestor conquered dimmer and dimmer niches, its eyes evolved greater sensitivity: widening rhabdoms increased the visual fields of the photoreceptors, and the dendritic trees of the LMC cells became more extensive, possibly in order to mediate spatial summation. How should these optical and neural changes evolve in order to continuously provide optimal visual performance during the evolution of a nocturnal lifestyle? 
The aim of this investigation is to model how an array of visual channels samples natural grayscale scenes at different light intensities and image velocities. How, for instance, do the spatiotemporal properties of this sampling “evolve” in response to an increasingly nocturnal lifestyle? In addition, do the spatiotemporal properties of visual sampling depend on the image statistics of the natural scene being viewed? The average power spectrum derived from large numbers of individual natural scenes is largely isotropic (Balboa & Grzywacz, 2003; van der Schaaf & van Hateren, 1996), and most visual modeling studies thus consider the scene to be likewise isotropic (Srinivasan et al., 1982; van Hateren, 1992c). However, the power spectra of individual natural scenes can show significant anisotropy (Balboa & Grzywacz, 2003; see also Coppola, Purves, McCoy, & Purves, 1998; Hansen & Essock, 2004), such as the rainforest habitat of the nocturnal bee Megalopta, which is dominated by vertical tree trunks. How might such natural scenes affect the sampling strategy that “evolves”? 
Theory and methods
Rationale
Natural grayscale images were used to stimulate an array of visual channels whose spatial and temporal properties could be varied. Different light levels I were simulated by superimposing photon noise (Figure 2C). Different image velocities were simulated by shifting static grayscale images by known numbers of pixels per unit time (Figure 2B). Here we consider only image motion in a horizontal direction and denote the horizontal image velocity by α (measured in units of visual channel widths per 10 ms, see below). 
Figure 2
 
Modeling of image motion and light intensity. (A) Original image data of size 1536 × 1024 pixels (width × height) were taken from the natural image collection of van Hateren and van der Schaaf (1998). For stimulus creation and data analysis, square subimages of size M × M are considered. (B) A moving stimulus is created by horizontally shifting the array of visual channels (white square in A) over the image. The result is a sequence of image frames (t = 0, 1, 2,…). The time between two subsequent image frames was chosen to be 10 ms. (C) Superimposing Poisson noise results in a moving stimulus that resembles the scene gathered by the visual channels at a given light intensity.
We then generate an array of visual channels with a spatial receptive field of vertical and horizontal half-widths Δρv and Δρh, respectively, and with a temporal integration time of Δt ms. The array of visual channels corresponds to the array of image pixels, and thus Δρv and Δρh are measured in units of pixel widths (or units of visual channel widths). 
For each light intensity I (at any given image velocity α) the parameters Δρv, Δρh, and Δt are allowed to "evolve" in an iterative fashion to provide an optimal sampling of the input image. By "optimal" we mean the sampling that results in a filtered output image that is most like the unfiltered initial (noiseless) image. Thus, using this rationale we will determine how the spatiotemporal properties of visual sampling "evolve" to optimize vision at dimmer and dimmer light levels. 
Image data and data generation
Natural image data are taken from the database of van Hateren and van der Schaaf (1998). The images are provided in binary format and can be easily converted into the FITS format or imported into Matlab (MathWorks, MA, USA). For all simulations, square images of size M × M, with M usually set to 128 or 256, are used. An image can be considered as a two-dimensional array of pixel intensities gk,l with spatial coordinates k and l (k, l = 0, ±1, …), which correspond to the vertical and horizontal directions in the image, respectively. In Figure 3 two images with different isotropy, together with their normalized power spectra, are shown (for the calculation of the one-dimensional power spectrum, see 3). The power for the isotropic scene is equally distributed in all directions (Figure 3A). In the case of the anisotropic scene there is more power in the horizontal than in the vertical direction for most spatial frequencies fs (Figure 3B). 
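A minimal sketch of such a directional comparison in Python with numpy (an assumption here: instead of the paper's one-dimensional power spectrum, this simply slices the 2-D power spectrum along the two frequency axes; `directional_power` is a hypothetical helper name):

```python
import numpy as np

def directional_power(image):
    """Power along the horizontal and vertical spatial-frequency axes.
    A simplified stand-in for a full one-dimensional power spectrum."""
    F = np.fft.fftshift(np.fft.fft2(image - image.mean()))  # remove DC first
    P = np.abs(F) ** 2
    cy, cx = P.shape[0] // 2, P.shape[1] // 2
    horiz = P[cy, cx + 1:]   # power at horizontal spatial frequencies
    vert = P[cy + 1:, cx]    # power at vertical spatial frequencies
    return horiz, vert
```

A scene dominated by vertical structures (such as tree trunks) varies mostly along the horizontal axis, so its power concentrates at horizontal spatial frequencies.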
Figure 3
 
Image data of size 128 × 128 pixels. (A) An isotropic image and its one-dimensional and normalized power spectra. (For the calculation of the one-dimensional power spectrum, see 3.) Note that there is no distinct difference between the power in horizontal and vertical directions. (B) An anisotropic image and its one-dimensional and normalized power spectra with a considerable difference between horizontal and vertical power over a wide range of spatial frequencies.
Horizontal image motion is simulated by shifting an array of visual channels over an image gk,l, resulting in a sequence of images gk,l,t(α), t = 0, 1, 2,… (Figure 2B; see 1). The region of interest is given by k, l = 0, …, M − 1. The time step between two subsequent image frames gk,l,t(α) and gk,l,t+1(α) was chosen to be 10 ms. The parameter α describes the velocity of horizontal motion and is measured in units of receptor widths per 10 ms. For α > 0 the images gk,l,t(α) contain motion blur. These images represent the noiseless images at the level of the photoreceptors in the retina at time steps t = 0, 1, 2, …. 
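The windowing step can be sketched as follows (a simplification: only integer pixel shifts per frame, so no sub-pixel motion blur; `moving_sequence` is a hypothetical helper name):

```python
import numpy as np

def moving_sequence(image, alpha, M, n_frames):
    """Slide an M x M window horizontally over a larger image,
    alpha pixels per 10-ms frame (integer shifts only in this sketch)."""
    frames = []
    for t in range(n_frames):
        x0 = int(round(alpha * t))            # horizontal offset at frame t
        frames.append(image[:M, x0:x0 + M])   # M x M region of interest
    return np.stack(frames)
```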
We can now superimpose photon noise onto the image sequence gk,l,t(α) (Figure 2B), and the result is a sequence of noisy images gk,l,t(α,I), t = 0, 1, 2,… (Figure 2C) with a mean intensity of I photons per visual channel per second (cf. Supplementary Figure 4). Photon arrival is a random event that follows a Poisson distribution. Assuming a mean intensity of I photons per visual channel per second, we can generate the noisy image sequence gk,l,t(α,I) by scaling the noiseless image so that its mean value is unity, multiplying it by I/100 (one image frame corresponds to 10 ms), and drawing a Poisson distributed random number at each pixel location according to its intensity value:

gk,l,t(α,I) = Poiss( (I/100) ĝk,l,t(α) ),
(1)

where ĝk,l,t(α) represents the normalized image sequence gk,l,t(α) with mean intensity equal to 1 (see 1), and Poiss(λ) returns a Poisson distributed random number with mean value λ.
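Equation 1 can be sketched directly with numpy's Poisson sampler (the seeded generator and the helper name `add_photon_noise` are our choices, not from the paper):

```python
import numpy as np

def add_photon_noise(frames, I, rng=None):
    """Equation 1: normalize the sequence to unit mean, scale by I/100
    (one 10-ms frame at I photons per channel per second), then draw
    one Poisson sample per pixel."""
    rng = np.random.default_rng(0) if rng is None else rng
    g_hat = frames / frames.mean()          # mean intensity scaled to 1
    return rng.poisson(I / 100.0 * g_hat)   # photon counts per pixel
```

At I = 500 photons per channel per second, each 10-ms frame carries on average only 5 photons per pixel, which makes the quantum fluctuations described in the Introduction very visible.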
Modeling the spatiotemporal receptive field
A receptive field can be described by its spatial and temporal response functions. For high signal-to-noise ratios (bright light conditions) receptive fields of lamina neurons are predicted to show center–surround antagonism (Srinivasan et al., 1982; van Hateren, 1992a, 1992c). This means that it is primarily changes in the incoming signal that are encoded and transferred to later stages of processing in the visual system. Visual channels in the surround predict the output of visual channels in the center, and only the difference between the signals generated by the center and surround is sent to later stages of processing. The mechanism of lateral inhibition was first described between neighboring ommatidia in the horseshoe crab Limulus polyphemus (Hartline, Wagner, & Ratliff, 1956). Center–surround antagonism (or lateral inhibition) is generally associated with redundancy reduction in the early visual system (van Hateren, 1992a; but see also Barlow, 2001). Natural images contain "redundant" information due to their intrinsic autocorrelation, and the limited dynamic ranges of photoreceptors and neurons underscore the importance of eliminating predictable signal components. Lateral inhibition leads to band-pass filtering that is characterized by the enhancement of spatial edges and temporal transients (van Hateren, 1992b). However, for low signal-to-noise ratios (dim light conditions) the band-pass characteristic of receptive fields is predicted to change to a low-pass characteristic. With decreasing signal-to-noise ratio the size of the excitatory center widens, while the contribution of the inhibitory surround diminishes and eventually disappears altogether (Barlow, Fitzhugh, & Kuffler, 1957; Batra & Barlow, 1982; Renninger & Barlow, 1979; van Hateren, 1992c). 
Since we are interested in the transition from bright to dim light, we decided to account only for the more dominant changes in the excitatory center and model the spatial receptive field by a Gaussian function with parameters σv and σh (see also Hemilä, Lerber, & Donner, 1998):

G(u, v) = exp[ −(1/2)( (u/σv)² + (v/σh)² ) ],
(2)

which has its maximum value at u = v = 0. The spatial half-widths Δρv and Δρh can be calculated by simply multiplying σv and σh in Equation 2 by a factor of 2.35 (the full width at half maximum of a Gaussian is 2√(2 ln 2) σ ≈ 2.35 σ). 
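A minimal discretization of Equation 2 (numpy assumed; `gaussian_rf` and `half_width` are hypothetical helper names, and the normalization to unit sum follows the weighting convention described for Figure 4B):

```python
import numpy as np

def gaussian_rf(sigma_v, sigma_h, size):
    """Discretized, normalized version of Equation 2; u indexes the
    vertical and v the horizontal direction."""
    r = np.arange(size) - size // 2
    U, V = np.meshgrid(r, r, indexing="ij")
    G = np.exp(-0.5 * ((U / sigma_v) ** 2 + (V / sigma_h) ** 2))
    return G / G.sum()   # weighting coefficients sum to unity

def half_width(sigma):
    """Delta-rho = 2.35 sigma (Gaussian full width at half maximum)."""
    return 2.35 * sigma
```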
The temporal response of photoreceptors can be modeled by a log-normal function with parameters Δt and τp (Howard, 1981; Howard, Dubs, & Payne, 1984; Payne & Howard, 1981). Δt and τp are measured in milliseconds and denote the half-width and the time-to-peak value of the impulse response function, respectively:

V(t) = exp[ −ln²(t/τp) / (2σ²) ],  t > 0.
(3)

The integration time Δt can be calculated by

Δt = 2 τp sinh(1.177 σ).
(4)
In our model the values Δt and τp are not modeled independently; instead, the parameter σ is set to a constant value. This means an increase of τp comes with an increase in Δt, and vice versa (cf. Equation 4). Howard (1981) reported a σ of 0.31 in the dark- and light-adapted photoreceptors of the locust. The dark-adapted photoreceptor in the worker honeybee has a σ of 0.28, whereas in M. genalis σ = 0.32 (Warrant et al., 2004). In all simulations we use a value of σ = 0.32. 
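Equations 3 and 4 can be checked numerically against each other: the half-width of the discretized impulse response should match the closed-form integration time (helper names and grid spacing are our choices):

```python
import numpy as np

def lognormal_response(tau_p, sigma=0.32, dt=0.05, t_max=200.0):
    """Discretized Equation 3, normalized to unit sum; t in ms."""
    t = np.arange(dt, t_max, dt)
    V = np.exp(-np.log(t / tau_p) ** 2 / (2.0 * sigma ** 2))
    return t, V / V.sum()

def integration_time(tau_p, sigma=0.32):
    """Equation 4: half-width (FWHM) of the impulse response, in ms."""
    return 2.0 * tau_p * np.sinh(1.177 * sigma)
```

The peak sits at t = τp, and with σ = 0.32 the integration time is roughly 0.77 τp.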
Filtering an image sequence with the spatiotemporal receptive field yields a number N of filtered images

hk,l,n = Σt=0…T−1 Vt (G * g)k,l,n+t,  n = 0, …, N − 1,
(5)

where Vt and G = Gk,l are discretized and normalized versions of Equations 3 and 2, respectively (see 2). Vt is shifted such that its peak value is at t0 = ⌊T/3⌋ (⌊·⌋ denotes the floor function). This means that the images hk,l,n, n = 0, …, N − 1, in Equation 5 represent the spatiotemporally filtered versions of the frames ĝk,l,t0(α,I), …, ĝk,l,t0+N−1(α,I). (G * g) denotes convolution of an image g = gk,l,t with a Gaussian kernel G = Gk,l and yields a spatially filtered image sequence (see 2). For all calculations, the spatial extent of the receptive field Gk,l is truncated where its weighting coefficients fall below 1% of the on-axis amplitude. 
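A sketch of the filtering in Equation 5 (assumptions: zero padding at the image border, a plain numpy convolution loop for clarity rather than speed, and the helper names `conv2_same` and `filter_sequence`):

```python
import numpy as np

def conv2_same(img, K):
    """Naive 'same' 2-D correlation with zero padding; for the symmetric
    Gaussian kernel of Equation 2 this equals convolution."""
    kh, kw = K.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img.astype(float), ((ph, ph), (pw, pw)))
    H, W = img.shape
    out = np.zeros((H, W))
    for i in range(kh):
        for j in range(kw):
            out += K[i, j] * padded[i:i + H, j:j + W]
    return out

def filter_sequence(g_noisy, G, V):
    """Equation 5: spatial convolution with G per frame, then a
    V-weighted sum over T consecutive frames, giving N filtered images."""
    spatial = np.stack([conv2_same(f, G) for f in g_noisy])
    T = len(V)
    N = len(g_noisy) - T + 1
    return np.stack([sum(V[t] * spatial[n + t] for t in range(T))
                     for n in range(N)])
```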
Modeling the “evolution” of visual sampling at decreasing light levels
The benefits of seeing well in dim light are doubtless manifold for nocturnal animals (Roubik, 1989; Warrant, 2008a; Warrant et al., 1996). For this reason we do not model feature-based filtering but instead we aim to determine the sampling strategy that gives the “best reconstruction” of a noisy stimulus. The difference between the original, noiseless image and a filtered version of the noisy image can be formulated by a mean square error criterion: 
MSE = (1/Np) Σk,l=m…M−m−1 Σn=0…N−1 ( hk,l,n − ĝk,l,t0+n(α) )²,
(6)

with t0 = ⌊T/3⌋ (cf. Equation 5), and Np = (M − 2m)² N. For the calculation of the MSE value (Equation 6) we use N = 10 subsequent image frames. To exclude artifacts due to padding, m has to be chosen according to the size of the receptive field (1% threshold for summation coefficients). 
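Equation 6 reduces to a mean over the central region of the frame stack (a minimal sketch; the caller is assumed to pass filtered frames already aligned with the noiseless frames ĝ at t0, …, t0 + N − 1, and m ≥ 1):

```python
import numpy as np

def mse(h, g_ref, m):
    """Equation 6: mean square error between filtered frames h[n] and the
    aligned noiseless frames g_ref[n], excluding an m-pixel border to
    avoid padding artifacts. Requires m >= 1."""
    diff = h[:, m:-m, m:-m] - g_ref[:, m:-m, m:-m]
    return float(np.mean(diff ** 2))
```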
For a given light intensity I and image velocity α, smaller values of the MSE indicate a better "quality" of the filtered image, that is, a better match to the original noiseless image. The goal is to find the "optimal" spatiotemporal parameters Δρv, Δρh, and Δt that minimize the difference between the filtered and noiseless image sequences for a given combination of I and α:

⟨MSE⟩ → min,
(7)

where ⟨MSE⟩ is the average over the MSE values of 5 noisy instances of the same image sequence filtered with the same receptive field. To find a local minimum of Equation 7 we apply the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method, which is widely used for nonlinear optimization problems and belongs to the class of quasi-Newton methods (see, e.g., Press, Teukolsky, Vetterling, & Flannery, 2007). The optimal parameter values as obtained by the BFGS method depend, in general, on the initial parameter values. 
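The optimization step can be sketched with SciPy's BFGS implementation (scipy.optimize.minimize). Optimizing in log-space is our assumption, not a detail stated in the paper; it keeps the half-widths and integration time positive without explicit bounds:

```python
import numpy as np
from scipy.optimize import minimize

def optimal_parameters(mean_mse, p0):
    """Minimize an averaged-MSE objective with BFGS (Equation 7 sketch).
    mean_mse maps (delta_rho_v, delta_rho_h, delta_t, ...) to <MSE>;
    the result depends on the starting point p0, as noted in the text."""
    res = minimize(lambda q: mean_mse(np.exp(q)),   # log-parameters
                   np.log(np.asarray(p0)), method="BFGS")
    return np.exp(res.x)
```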
For a given image velocity α, all simulations begin at bright light intensities with a sampling that includes only a single visual channel integrating signals over a single image frame (i.e., Δρv = Δρh = 0.75 receptor widths, Δt = 4.6 ms). Dimmer and dimmer light levels are subsequently simulated, and the spatiotemporal properties found optimal at a certain light intensity are then used as initial values for the next dimmer light intensity. Thus, the spatiotemporal receptive fields "evolve" from brighter to lower light levels. The word "evolution" is used only for convenience and is not intended to suggest "biological evolution", since the model cannot account for all biological constraints. 
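The warm-started descent through light levels amounts to a simple loop (a sketch under the assumption of a list of intensities ordered bright to dim; `optimize_at` stands for one run of the per-intensity optimization):

```python
def evolve_receptive_fields(optimize_at, intensities, p0):
    """The optimum found at each intensity seeds the search at the next,
    dimmer intensity, so the receptive field 'evolves' gradually."""
    history, p = {}, p0
    for I in intensities:        # bright -> dim
        p = optimize_at(I, p)    # warm start from the previous optimum
        history[I] = p
    return history
```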
Results
Our model shows that optimal spatial and temporal receptive field properties depend on light intensity, image velocity, and on the anisotropy of the visual scene. An analysis of the optimal spatiotemporal receptive fields also reveals that summation improves the signal-to-noise ratio at lower spatiotemporal frequencies but sacrifices it at higher frequencies. 
Figures 4A and 4D show single frames (10 ms) from a moving image sequence at various light intensities for an isotropic and an anisotropic scene, respectively (α = 1.5 visual channel widths per 10 ms). The spatial "evolution" of the optimal receptive field is shown in Figures 4B and 4E for gradually decreasing light intensities (from left to right). In bright light conditions spatial sampling includes only a single visual channel, and for the isotropic scene spatial summation involves more and more neighboring visual channels as light intensity falls. In the case of an anisotropic scene the spatial receptive field "evolves" an anisotropy, with summation more pronounced in the vertical direction. This anisotropy can only be observed at lower light intensities, when the sampling begins to sum over multiple visual channels. At log I = 1.5 the isotropic receptive field includes 27 channels with at least 1% contribution (compared to the visual channel at the center of the receptive field). At a comparable light level (log I = 0.8) the anisotropic receptive field includes only 23 channels. Interestingly, decreasing the light level by one further log unit inverts the situation: 61 channels contribute to the spatial summation in the isotropic case (log I = 0.5; at least 1% contribution), while 87 visual channels contribute in the anisotropic case (log I = −0.2; see also Supplementary Figure 3). Single image frames of the optimally spatiotemporally filtered image sequences are shown in Figures 4C and 4F, respectively. For any given image velocity a decrease in light intensity gradually increases spatial summation and temporal integration (Figure 5, α = 1.5; see also Supplementary Figures 1A and 2A). 
Figure 4
 
Optimal spatial receptive fields at various light intensities (A–C) for an isotropic scene and (D–F) for an anisotropic scene ( α = 1.5 receptor widths per 10 ms). (A) Single image frames at various light intensities for the isotropic case (each image frame corresponds to a 10-ms sample, size 128 × 128 pixels). (B) An array of 51 × 51 visual channels and their spatial contribution during summation. The sum of all weighting coefficients of the receptive field is normalized to unity. (C) Optimally spatiotemporally filtered image frames from (A). The images are derived by filtering the whole noisy image sequence g k,l,t ( α,I) with the optimal spatial receptive field G k,l, and the optimal temporal response function V t (cf. Equation 5). (D–F) The same as (A)–(C) but for an anisotropic scene. The resulting spatial receptive fields show a distinct anisotropy at low light intensities.
Figure 5
 
Optimum spatiotemporal receptive field parameters for α = 1.5. (A) Isotropic image sequence (as used in Figure 4A). (B) Anisotropic image sequence (as used in Figure 4D). The dashed lines correspond to no spatial/temporal pooling, which includes only a single visual channel and/or a single image frame (10 ms).
For the isotropic scene our model predicts an almost isotropic spatial receptive field with similar amounts of summation in the vertical and horizontal directions (Figure 5A). However, for the image sequence with α = 1.5 there is slightly more summation in the horizontal direction (i.e., in the direction of image motion). At lower light levels the ratio between vertical and horizontal pooling is less than 1 (Δρv/Δρh = 0.8, log I = −0.5). This is due to motion blur, which is introduced at the level of the visual channels, and horizontal summation increases monotonically as image velocity α increases. For α = 1.5, temporal integration increases with decreasing light intensity in a similar fashion to spatial pooling. In this particular case this indicates a balanced summation in both space and time. 
For the anisotropic scene, which is characterized by an increased power in the horizontal direction (Figure 3B), our model predicts significantly increased vertical summation (Figure 5B). The scene contains trunks of trees in a forest, and these vertical structures cause a high degree of correlation between the intensities of vertically aligned image pixels. And despite motion blur (in the horizontal direction), the ratio between the optimal vertical and horizontal receptive field half-widths at lower light levels is greater than 1 (Δρv/Δρh = 7.8, log I = −1.2). The onset of temporal integration occurs for α ≥ 1.5 only at light intensities lower than log I = 0.8 (Figure 5B; Supplementary Figure 2A). 
The influence of the light intensity I, however, strongly depends on the image velocity α. At very low image velocities the receptive field almost exclusively sums temporally ( Figure 6, anisotropic image). Although this behavior can be observed for both the isotropic and anisotropic scenes, temporal integration is less extensive for the latter (cf. Supplementary Figures 1 and 2). For the anisotropic scene an increase in image velocity α leads to an asymptotic increase of summation in the vertical direction ( Figure 6A), whereas summation in the horizontal direction continuously increases ( Figure 6B). This is due to motion blur, which increases as α increases. Furthermore, increasing the image velocity at low light intensities leads to a strong reduction of temporal integration ( Figure 6C). 
Figure 6
 
Optimum spatiotemporal receptive field parameters for the anisotropic scene at various light intensities I and image velocities α ( α = 0.1, 0.2, …, 4.0 visual channel widths per 10 ms). (A) Vertical half-width Δ ρ v. (B) Horizontal half-width Δ ρ h. (C) Temporal integration time Δ t.
The criterion for the “optimal” receptive field was given as the mean square error (Equations 6 and 7), whose minimum value gives the “best” prediction of the original scene. But what does this criterion mean for the performance of the visual system? To address this question, spatiotemporally filtered images were analyzed in the frequency domain. Not surprisingly, the signal-to-noise ratio SNRopt = SNRopt(fs) of the optimally filtered images drops over the entire range of spatial frequencies fs as light intensity falls (Figure 7A; see Appendix D for the calculation of signal and noise). The model thus indicates that vision inevitably deteriorates at lower and lower light levels, even with the optimal pooling strategy as we define it here. So what is the advantage of spatial and temporal summation? To answer this question we compared the optimally filtered scenes with the corresponding unfiltered scene at each light level. As can be seen in Figure 7B, the ratio between SNRopt and SNRunfilt increases at lower spatial frequencies but drops at higher frequencies. Spatiotemporal summation increases signal power at lower frequencies but blurs finer image details, thus reducing signal power at higher frequencies. Noise power, on the other hand, is decreased over the entire frequency range (not shown). At higher frequencies, however, signal power is suppressed more strongly than noise power. As a result, signal-to-noise ratio is improved at low frequencies but reduced at higher frequencies: the effect of spatiotemporal summation at dimmer light levels is to increase the reliability of coarser spatial details while sacrificing finer ones (Figure 7B). The same behavior can be observed in the temporal frequency domain (not shown): at dimmer light levels the reliability of slower details is increased, while faster details are sacrificed. 
Figure 7
 
Optimal spatiotemporal filtering and visual performance for an isotropic scene (α = 1.5). (A) Signal-to-noise ratio for the isotropic scene (Figure 3A) filtered by optimal spatiotemporal receptive fields. (B) Ratio between the signal-to-noise ratios of the optimally filtered and the unfiltered image sequences. A decreasing cut-off frequency at lower light intensities indicates that SNRopt(fs) is improved at lower spatial frequencies fs at the cost of SNRopt(fs) at higher frequencies. (Note that at log I = −0.5 the increase in SNRopt(fs) and SNRopt(fs)/SNRunfilt(fs) at high frequencies is an artifact originating from the Hann window.)
The cut-off frequency at which SNRopt/SNRunfilt drops below 1 marks the frequency above which signal-to-noise ratio is sacrificed in order to improve it at lower frequencies. At log I = 3.5 the cut-off frequency is fs = 56 cycles/image, but it decreases to 18 cycles/image as light intensity falls to log I = −0.5. 
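Locating this cut-off from a pair of SNR curves is straightforward numerically; below is a minimal sketch (the function name and the toy SNR curves are our own illustration, not the study's code or data):

```python
import numpy as np

def cutoff_frequency(snr_opt, snr_unfilt):
    """Lowest spatial frequency (cycles/image) at which SNR_opt/SNR_unfilt
    first drops below 1, or None if summation never sacrifices SNR.
    Both inputs are 1-D arrays indexed by f_s = 0, 1, 2, ..."""
    ratio = np.asarray(snr_opt, float) / np.asarray(snr_unfilt, float)
    below = np.nonzero(ratio < 1.0)[0]
    return int(below[0]) if below.size else None

# Toy curves: summation boosts low frequencies, suppresses high ones.
f = np.arange(64)
snr_unfilt = np.full(64, 2.0)
snr_opt = 4.0 * np.exp(-f / 20.0)
print(cutoff_frequency(snr_opt, snr_unfilt))   # -> 14
```

For monotonically falling ratios, as in Figure 7B, the first index below 1 is the cut-off; if the ratio never falls below 1, no resolution is sacrificed.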
Discussion
Nocturnally active animals possessing apposition compound eyes show several optical adaptations to improve their visual sensitivity in dim light (Barlow et al., 1987; Barlow & Powers, 2003; Greiner, 2006; Land, 1981; Nilsson, 1989). In addition to optical adaptations, monopolar cells in the first optic ganglion (the lamina) are thought to mediate spatial summation of visual signals from several neighboring ommatidia to further improve visual reliability (Greiner, 2006; Warrant, 2008b). A strategy of neural summation can increase visual reliability because adjacent points in natural images—and thus the input signals of adjacent visual channels—are correlated and share some amount of information (Smith, 1995; Srinivasan et al., 1982; Tsukamoto et al., 1990; van der Schaaf & van Hateren, 1996; van Hateren, 1992c). 
Here, we modeled an array of visual channels to derive the optimum spatiotemporal sampling of natural images for gradually decreasing levels of light intensity. Consistent with previous theoretical studies (Theobald et al., 2006; van Hateren, 1992a, 1992c; Warrant, 1999), our model consistently predicts an increase in spatial and temporal summation as light intensity falls (Figure 5, Supplementary Figures 1A and 2A). The optimum combination of spatial and temporal summation also depends on the amount of motion that is present in the image sequences (Figure 6, Supplementary Figures 1B and 2B). Most importantly, the optimum spatial and temporal properties account for differences in the image anisotropy in order to give the best reconstruction of the initial, noiseless image sequences (Figure 4). 
Image anisotropy and spatial summation
Studies of the power spectral density of natural images show that correlation in natural scenes is, on average, almost equal in the vertical and horizontal directions (Balboa & Grzywacz, 2003; van der Schaaf & van Hateren, 1996). Individual scenes, or scenes derived from specific habitats, can, however, exhibit a distinct anisotropy (Balboa & Grzywacz, 2003; van der Schaaf & van Hateren, 1996). In that case the input correlation between a visual channel and its neighboring channels varies with orientation, and an optimal sampling of anisotropic images should take this into account (Hosoya, Baccus, & Meister, 2005). Indeed, at low light levels the spatial component of our receptive field model “evolves” a pronounced anisotropy, with up to 8-fold greater summation in the vertical than in the horizontal direction (Figure 4E, Supplementary Figure 2A): at low light intensities the ratio Δρv/Δρh is always greater than 1. At moderate light intensities and for high image velocities, however, horizontal summation dominates vertical summation due to motion blur. Increased summation in the horizontal direction, that is, in the direction of image motion, is also present for the isotropic scene. Not surprisingly, for the isotropic image sequence the ratio Δρv/Δρh of the optimum spatiotemporal receptive fields is less than or equal to 1 for all light levels I and image velocities α (cf. Supplementary Figure 1A). 
In the retinas of some amphibians and mammals it has recently been reported that certain ganglion cells adapt to oriented stimuli over a time scale of several seconds (Hosoya et al., 2005). A similar mechanism might exist in insects, since adaptation in retinal processing on a short time scale has been reported not only for mammalian and amphibian species (Smirnakis, Berry, Warland, Bialek, & Meister, 1997) but also for insects (de Ruyter van Steveninck, Bialek, Potters, & Carlson, 1994). Adaptation to oriented stimuli on a short time scale might be mediated by synaptic changes. Since such a mechanism would exploit the correlation between neighboring input signals (Hosoya et al., 2005), photon noise constitutes an inherent limitation for this strategy. Another possible substrate for adaptation to oriented stimuli is pattern-selective retinal interneurons (Bloomfield, 1994; Hosoya et al., 2005). Such neurons—and possibly also the LMC cells in Megalopta—might represent the neural substrate mediating orientation selectivity or anisotropic spatial summation. 
The advantage of independent summation in different orientations becomes evident when we consider the optimum values of a receptive field constrained by the additional condition Δρv = Δρh. For example, for the anisotropic image sequence with α = 1.5 (at log I = −1.2), the optimization scheme in Equations 6 and 7 with the constraint Δρv = Δρh yields a spatial half-width Δρ of 6.2 channel widths (encompassing 193 visual channels); this value lies between the independently optimized half-widths Δρv = 22.5 and Δρh = 2.9 channel widths (337 visual channels in total). The isotropically constrained receptive field (Δρv = Δρh) sums too much in the horizontal direction and too little in the vertical direction, which together degrades the summed output signal. Furthermore, in the anisotropic case the integration time Δt increases from 19 to 25 ms to counterbalance the reduced spatial summation (193 instead of 337 channels). The longer Δt, however, cannot fully compensate for the predictive loss, indicated by an increase of the MSE value for the constrained model (data not shown). Together this shows that the shape of the receptive field should match the anisotropy of the moving stimulus. Within the framework of the model presented here, and the optimization scheme we used in this study (see Theory and methods section), this is indeed possible: the receptive field model allows the optimum combination of spatial and temporal parameters to “evolve” for each combination of I and α. 
Spatial and temporal summation and the influence of image motion
The optimum spatiotemporal parameters are strongly influenced by the light intensity I and the image velocity α. A decrease in light intensity increases spatial and temporal summation (Figure 5, Supplementary Figures 1A and 2A). Image motion, however, affects the spatial and temporal parameters differently (Figure 6). Increasing image motion decreases Δt, and to counteract the loss in temporal summation the eye has to sum more extensively in space. Therefore, a shortening of the temporal integration time comes with an increase in spatial summation, and vice versa: lengthening Δt decreases Δρv,h (Supplementary Figures 1B and 2B). Thus, the relative balance between spatial and temporal summation is likely to be strongly influenced by ecological and behavioral constraints: slow animals are likely to invest more heavily in temporal summation than faster ones (Warrant, 1999). 
An excellent example is the fast-flying nocturnal bee Megalopta genalis, for which visual navigation plays an important role in homing (Warrant et al., 2004). The integration time and acceptance angle of dark-adapted photoreceptors are both large in Megalopta, 32 ms and 5.6°, respectively (compared to 18 ms and 2.6° in the diurnal honeybee Apis mellifera; Warrant et al., 2004). This coarser spatial and temporal resolution, however, cannot on its own explain Megalopta's behavior at night. Since the dark-adapted integration time is not exceptionally slow, spatial summation in the lamina is likely to further improve Megalopta's visual sensitivity. This hypothesis is supported by morphological evidence from the lamina of M. genalis (Greiner et al., 2005; Greiner, Ribi, Wcislo, & Warrant, 2004; Warrant et al., 2004) and by theoretical predictions (Theobald et al., 2006; Warrant, 1999). Our model confirms that spatial summation should be more extensive for greater image motion, as would be experienced by Megalopta during rapid forward flight (cf. Figure 6). To be “optimal”, such an extension of spatial summation should account for differences in the image anisotropy (see above). Vertical structures such as tree trunks can change the image statistics sufficiently to depart markedly from isotropy (cf. Figure 3; but see also van der Schaaf & van Hateren, 1996). Although the visual environment of Megalopta's habitat has not been investigated systematically, the average image statistics of rainforest scenes are not likely to be isotropic. A strategy of anisotropic neural summation might therefore benefit Megalopta during its nightly navigation flights in the rainforests of Central and South America. Indeed, the dendritic fields of its L4 lamina monopolar cells are anisotropic in shape, being strongly elongated in the vertical direction (Greiner et al., 2005). However, whether the L4 cells are involved in anisotropic spatial summation remains to be determined. 
Measurements of the ambient light intensity have shown that Megalopta can find its nest entrance when as few as 4.7 photons per second (log I = 0.67) are absorbed on average by a single green photoreceptor (Warrant et al., 2004). At this and somewhat higher light levels, Megalopta also flies through the rainforest, navigating between the nest and foraging sites. The dark-adapted integration time of Megalopta is 32 ms (see above). For the isotropic image (log I = 0.5) our simulations predict this integration time to be optimal for an image velocity α ≈ 0.7 receptor widths per 10 ms (cf. Supplementary Figure 1B). In the anisotropic case (log I = 0.8) we found this integration time to be optimal for α ≈ 0.2–0.3 receptor widths per 10 ms, that is, for much slower image velocities than for the isotropic image (cf. Supplementary Figure 2B). In the latter case, an increase of α above 0.3 receptor widths per 10 ms would result in additional motion blur for the given integration time of 32 ms. In other words, the dynamics of the temporal receptive field would be too slow for α > 0.3, resulting in an impairment of spatial and temporal resolution (Batra & Barlow, 1990; Juusola & French, 1997; Srinivasan & Bernard, 1975). 
Extent of spatiotemporal summation and visual performance
The optimization scheme that we follow in Equations 6 and 7, that is, predictions based on comparisons with the original, noiseless image sequences, favors spatiotemporal summation up to the point where the MSE reaches its minimum. Compared to a receptive field that does not sum in space and/or time, the optimum receptive field improves the signal-to-noise ratio of its output signals at low frequencies. This improvement, however, comes only at the cost of the signal-to-noise ratio at higher frequencies (Figure 7B), indicating a trade-off between visual reliability and spatiotemporal resolution (Batra & Barlow, 1990; Land, 1997). The trade-off can be shifted by moving the cut-off frequency (see Results section) toward higher or lower frequencies, that is, by reducing or increasing the amount of spatiotemporal summation, respectively (Figure 8). At this point it is interesting to consider the influence of the anisotropic receptive field model on the vertical and horizontal signal-to-noise ratios. For the optimum spatiotemporal receptive field, the cut-off frequency in the vertical direction is lower than in the horizontal direction. Inspection of the power spectral density of the anisotropic image (Figure 3B) shows that there is more horizontal than vertical power in the image. The optimum receptive field preserves as much power as possible in both directions; since image details are richer in the horizontal direction, a higher cut-off frequency results for the horizontal than for the vertical direction. For example, at log I = −0.2 and for α = 1.5 channel widths per 10 ms the vertical cut-off is at 13 cycles/image, in contrast to the horizontal cut-off at 34 cycles/image (Figure 8B). This clearly shows the advantage of optimally adapted summation in the respective directions. 
Figure 8
 
Different degrees of filtering at log I = −0.2, α = 1.5. (A) Image frames of 10 ms, from left to right: unfiltered, too little filtering (½Δρv,h, ½Δt), optimal filtering (Δρv,h, Δt), and too much filtering (2Δρv,h, 2Δt). Here, Δρv,h and Δt denote the optimum spatiotemporal parameters at log I = −0.2 and for α = 1.5 channel widths per 10 ms (cf. Equations 6 and 7). (B) The corresponding signal-to-noise ratio curves in the vertical and horizontal directions (SNRv and SNRh, respectively). The optimum spatiotemporal receptive field preserves finer image details in the horizontal direction, as indicated by a higher cut-off frequency. The cut-off frequencies are 19 (too little), 13 (optimal), and 9 cycles/image (too much filtering) for SNRv, and 44 (too little), 34 (optimal), and 22 cycles/image (too much filtering) for SNRh.
Arrangement of visual channels and receptive field centers
In this study we investigated the optimum spatiotemporal parameters of identical receptive fields. In our model, the spacing between the centers of the receptive fields was equal to the spacing of the visual channels and was not subject to variation (see Theory and methods section). However, the actual arrangement of the photoreceptors and/or of the receptive field centers in the retina is an important determinant of the information that is sent to later stages of the visual system (Borghuis, Ratliff, Smith, Sterling, & Balasubramanian, 2008; Tsukamoto et al., 1990). Changes in the densities and optical properties of photoreceptors influence visual acuity and information capacity both in compound eyes (Land, 1997; Land & Eckert, 1985) and in camera-type eyes (Packer, Hendrickson, & Curcio, 1989; Sterling & Demb, 2004). For instance, a higher cone density improves the signal-to-noise ratio of retinal ganglion cells (Tsukamoto et al., 1990), and the mosaic arrangement of overlapping ganglion cell receptive fields in mammals probably prevents aliasing artifacts and helps to provide uniform spatial sensitivity (de Vries & Baylor, 1997). Moreover, an optimal overlap between ganglion cell receptive fields reduces redundancy in the visual signal, thereby maximizing the amount of information per receptive field (Borghuis et al., 2008; Laughlin, 1994). Taken together, the results of Borghuis et al. (2008) and Tsukamoto et al. (1990), and those of our study, indicate that the spatial arrangement of the visual channels (and of their receptive fields), in addition to their spatiotemporal properties, can be optimized to maximize the amount of (relevant) information sent to later stages of visual processing. For the visual system as a whole, however, all components must be correspondingly tuned in order to optimally support visually guided behavior. 
Conclusion
Light intensity and image motion are two important parameters that partly determine the spatial and temporal sampling properties of visual receptive fields. In addition, our model predicts that the average image anisotropy of the scene being viewed by an array of visual channels has a strong influence on the spatial—and therefore also temporal—sampling properties. Spatiotemporal summation improves the signal-to-noise ratio at low frequencies in order to reliably distinguish coarse and/or slow-moving structures. With a strategy of neural summation this comes, however, only at the cost of a deterioration in signal-to-noise ratio at higher frequencies. This means that fine and/or fast-moving image structures disappear either due to noise, or—in the case of spatiotemporal summation—by the summation of visual signals. Our results indicate that the optimum trade-off strongly depends on the image statistics. In terms of optimum summation it is therefore predicted that an anisotropic power spectral density requires anisotropic spatial summation. Thus, animals that live in environments that are highly anisotropic are predicted to spatially sum visual signals in dim light in an anisotropic fashion. Finally, even though these conclusions have been derived from a model of an insect visual system, the results obtained in this study should apply to any visual system that operates under similar intensity constraints. 
Supplementary Materials
Supplementary Figure 1. Optimum spatiotemporal receptive field parameters for the isotropic scene at various light intensities I and for different image velocities α. Optimal parameters were obtained as described in Theory and methods, that is, for each simulation α was constant and I was gradually decreased. A. Spatial and temporal parameters plotted as a function of (logarithmic) light intensity I for image velocities α = 0.5, 1.5, 2.5, and 3.5 visual channel widths per 10 ms. B. Spatial and temporal parameters plotted as a function of image velocity α (log I = 4.5, 3.5, …, −0.5). 
Supplementary Figure 2. Optimum spatiotemporal receptive field parameters for the anisotropic scene at various light intensities I and for different image velocities α. Optimal parameters were obtained as described in Theory and methods, that is, for each simulation α was constant and I was gradually decreased. A. Spatial and temporal parameters plotted as a function of (logarithmic) light intensity I for image velocities α = 0.5, 1.5, 2.5, and 3.5 visual channel widths per 10 ms. B. Spatial and temporal parameters plotted as a function of image velocity α (log I = 3.8, 2.8, …, −1.2). 
Supplementary Figure 3. The number of visual channels that contribute to the spatial part of the optimum receptive field for the isotropic and the anisotropic case, respectively (only visual channels with at least 1% contribution were counted). In the anisotropic case the total number of visual channels increases rapidly when image motion α is high. This strong increase at very low light intensities is mostly due to vertical summation and counteracts the effect of shortening the integration time Δt at high values of α (see Results). 
Supplementary Figure 4. The correlation between the original image pixel values and the noisy pixel values at various light intensities for the isotropic image (Figure 3A) and the anisotropic image (Figure 3B), respectively. For the two images in Figure 3 the correlation for the same mean number of photons per visual channel per second is higher in the anisotropic case. For this reason the lowest light intensity we considered was log I = −1.2 for the anisotropic image, in contrast to log I = −0.5 for the isotropic image (as indicated by the intersections with the dashed line). 
Appendix A
Data generation
Given that the natural image data can be specified by an array of intensity values g_{k,l}, we first define a function \tilde{g}(u, v) with real-valued variables u and v in the following way:

\tilde{g}(u, v) := g_{\lfloor u \rfloor, \lfloor v \rfloor},   (A1)

where ⌊·⌋ denotes the floor function. The sequence of images with motion in the horizontal direction is then calculated as

g_{k,l,t}(\alpha) = \int_a^b w(\xi)\, \tilde{g}(k,\, l + \xi)\, d\xi,   (A2)

where a = αt defines the left “edge” of a visual channel at time t = 0, …, T − 1 and b = αt + α + 1 defines its right “edge” after moving to time (t + 1). 
For motion with constant velocity the function w = w(u) is defined as

w(u) = \begin{cases} \dfrac{w_{\max}}{u_1 - a}\,(u - a), & a \le u < u_1 \\ w_{\max}, & u_1 \le u < u_2 \\ \dfrac{w_{\max}}{u_2 - b}\,(u - b), & u_2 \le u < b \\ 0, & \text{otherwise,} \end{cases}   (A3)

where u_1 = min{αt + 1, αt + α}, u_2 = max{αt + 1, αt + α}, and w_max = 1/max{1, α}. It follows that

\int_{-\infty}^{\infty} w(u)\, du = \int_a^b w(u)\, du = 1.   (A4)
For all calculations, image sequences are normalized so that their mean intensity equals unity:

\hat{g}_{k,l,t} = g_{k,l,t} \Big/ \left( \frac{1}{M^2 T} \sum_{i,j=0}^{M-1} \sum_{n=0}^{T-1} g_{i,j,n} \right),   (A5)

for k, l = 0, …, M − 1 and t = 0, …, T − 1, where M is the size of each image in the sequence \hat{g}_{k,l,t}. If not otherwise mentioned we used M = 128 and T = 32 + N = 42 (cf. Equation 6) in our simulations. 
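These definitions translate directly into code. The NumPy sketch below (the function name `w` mirrors Equation A3; the numerical check is our own) verifies the unit integral of Equation A4 for several image velocities:

```python
import numpy as np

def w(u, t, alpha):
    """Trapezoidal motion-blur weights of Equation A3: a linear ramp up on
    [a, u1), a plateau at w_max on [u1, u2), and a ramp down on [u2, b)."""
    a, b = alpha * t, alpha * t + alpha + 1.0
    u1 = min(alpha * t + 1.0, alpha * t + alpha)
    u2 = max(alpha * t + 1.0, alpha * t + alpha)
    wmax = 1.0 / max(1.0, alpha)
    u = np.asarray(u, float)
    out = np.zeros_like(u)
    up = (u >= a) & (u < u1)
    flat = (u >= u1) & (u < u2)
    down = (u >= u2) & (u < b)
    out[up] = wmax / (u1 - a) * (u[up] - a)
    out[flat] = wmax
    out[down] = wmax / (u2 - b) * (u[down] - b)
    return out

# Equation A4: the weights integrate to 1 for any alpha (and any t).
for alpha in (0.3, 1.0, 2.5):
    u = np.linspace(alpha * 2 - 1.0, alpha * 2 + alpha + 2.0, 20001)
    vals = w(u, t=2, alpha=alpha)
    integral = float(((vals[:-1] + vals[1:]) / 2 * np.diff(u)).sum())
    print(round(integral, 3))   # -> 1.0 in each case
```

For α = 1 the plateau vanishes and w(u) degenerates to a triangle, which the same formula handles without a special case.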
Appendix B
Spatial and temporal filtering
The discrete version of the Gaussian kernel G(u, v) (Equation 2) is given by

G_{k,l} = \frac{G(k, l)}{\sum_{i,j = -M/2+1}^{M/2} G(i, j)},   (B1)

for k, l = −M/2 + 1, …, M/2, so that the sum of all entries in G_{k,l} is normalized to unity. For the temporal response function (i.e., Equation 3) we proceed in a similar way:

V_t = \frac{V\!\left( \frac{t + t_0}{\tau_p} \right)}{\sum_{i=0}^{T-1} V\!\left( \frac{i + t_0}{\tau_p} \right)},   (B2)

for t = 0, …, T − 1. In this case we shift the peak value of V_t to the t_0th time step (t_0 = ⌊T/3⌋). In addition to Equation 3 we assume V(t) = 0 for t ≤ 0. 
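A unit-sum discrete kernel of this kind can be sketched as follows. As an illustration we assume the Gaussian of Equation 2 has independent vertical and horizontal half-widths Δρv and Δρh; the FWHM-to-σ conversion is our own convention, not taken from the paper:

```python
import numpy as np

def gaussian_kernel(M, rho_v, rho_h):
    """Discrete anisotropic Gaussian on k, l = -M/2+1, ..., M/2 (cf. Equation
    B1), normalized to unit sum.  rho_v, rho_h: half-widths (FWHM) in channel
    widths, converted via sigma = FWHM / (2 * sqrt(2 * ln 2))."""
    k = np.arange(-M // 2 + 1, M // 2 + 1)
    c = 2.0 * np.sqrt(2.0 * np.log(2.0))
    sv, sh = rho_v / c, rho_h / c
    V, H = np.meshgrid(k, k, indexing="ij")     # V: vertical, H: horizontal
    G = np.exp(-V**2 / (2 * sv**2) - H**2 / (2 * sh**2))
    return G / G.sum()

# Optimum half-widths for the anisotropic scene at log I = -1.2 (see Discussion).
G = gaussian_kernel(128, rho_v=22.5, rho_h=2.9)
print(G.shape, round(float(G.sum()), 6))   # -> (128, 128) 1.0
```

With Δρv > Δρh the kernel is elongated vertically, which is exactly the anisotropic summation discussed in the main text.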
The convolution of an image sequence g = g_{k,l,t} with a Gaussian kernel G = G_{k,l} in Equation 5 is calculated as

(G * g)_{k,l,t} = \sum_{i,j = -M/2+1}^{M/2} G_{i,j}\, g_{k+i,\, l+j,\, t},   (B3)

and yields a spatially filtered image sequence of the same size as the input sequence. For the calculation of Equation B3 we used zero-padded images g_{k,l,t}, t = 0, 1, 2, … (i.e., g_{k,l,t} := 0 for k, l < 0 and k, l ≥ M) and chose M to be a power of 2. This allows a computationally efficient evaluation of Equation B3 by means of the fast Fourier transform (see, e.g., Press et al., 2007). 
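The zero-padded evaluation of Equation B3 via the FFT can be sketched like this (our own arrangement on a doubled 2M grid, not the authors' code; a cross-check against the direct sum is included):

```python
import numpy as np

def fft_spatial_filter(G, g):
    """Evaluate (G * g)_{k,l} = sum_{i,j} G_{i,j} g_{k+i, l+j} (Equation B3)
    with zero padding outside 0 <= k, l < M, using FFTs on a 2M x 2M grid so
    that circular wrap-around never touches real image data."""
    M = g.shape[0]
    P = 2 * M
    gp = np.zeros((P, P)); gp[:M, :M] = g
    offs = np.arange(-M // 2 + 1, M // 2 + 1)     # kernel index range
    hp = np.zeros((P, P))
    hp[np.ix_((-offs) % P, (-offs) % P)] = G      # correlation = conv. with flip
    out = np.fft.ifft2(np.fft.fft2(hp) * np.fft.fft2(gp)).real
    return out[:M, :M]

# Cross-check against the direct sum on a small image.
rng = np.random.default_rng(0)
M = 8
g = rng.random((M, M)); G = rng.random((M, M)); G /= G.sum()
direct = np.zeros((M, M))
offs = np.arange(-M // 2 + 1, M // 2 + 1)
for a, i in enumerate(offs):
    for b, j in enumerate(offs):
        for k in range(M):
            for l in range(M):
                if 0 <= k + i < M and 0 <= l + j < M:
                    direct[k, l] += G[a, b] * g[k + i, l + j]
print(np.allclose(fft_spatial_filter(G, g), direct))   # -> True
```

Doubling the grid costs memory but guarantees that the circular convolution computed by the FFT agrees exactly with the zero-padded sum of Equation B3.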
Appendix C
Power spectrum calculation
For an image g_{k,l}, k, l = 0, …, M − 1, of size M, let \hat{g}_{m,n} be its Fourier transform. The Fourier transform \hat{g}_{m,n} is a periodic function of which we consider the range −M/2 ≤ m, n < M/2 (M is a power of 2), so that the zero-frequency component is at m = n = 0. The two-dimensional power spectrum of the image g_{k,l} is then given by |\hat{g}_{m,n}|^2 (Figure C1B). To avoid padding artifacts a Hann window of size M × M is applied before the calculation of the Fourier transform (Press et al., 2007). 
Figure C1
 
Calculation of the one-dimensional power spectrum. (A) A grayscale image with Hann window. (B) The corresponding two-dimensional, logarithmically scaled power spectrum with zero frequency at the center pixel. To calculate the power at a certain frequency fs the (unscaled) two-dimensional power is averaged around a circle with radius fs (360°, light blue). To calculate the power in horizontal and vertical directions the two-dimensional power is averaged only for 23° sectors of a circle (light green and red arcs, respectively). Note that the two-dimensional power spectrum is logarithmically scaled for visualization only and averages are calculated from the unscaled power spectrum. (C) The one-dimensional power spectrum with zero frequency (fs = 0) scaled to unity. Asterisks indicate the frequencies as used in (B) for illustration.
To reduce the two frequency components m and n to a single spatial frequency f_s, values of the power spectrum are averaged on circles of radius f_s = 0, …, M/2 − 1 around the zero frequency at m = n = 0 (cf. the light blue circle in Figure C1B; [·] returns the nearest integer):

\mathcal{P}(f_s) = \left\langle |\hat{g}_{m,n}|^2 \right\rangle_{f_s = \left[ \sqrt{m^2 + n^2} \right]}.   (C1)

To include values of the power spectrum only in the horizontal or vertical direction, the average is taken over an arc of the circle only (cf. the light green and red arcs in Figure C1B). The corresponding one-dimensional power spectrum averages are shown in Figure C1C. 
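The ring (and sector) averaging of Equation C1 might be implemented as follows; the function name, the sector encoding, and the ±11.5° half-angle for the 23° sectors are our reading of Figure C1, not the authors' code:

```python
import numpy as np

def radial_average(P2, sector=None, half_angle=11.5):
    """P(f_s) of Equation C1: average a center-zero 2-D power spectrum on
    rings of radius f_s = 0, ..., M/2 - 1 about the center pixel.  With
    sector='h' or 'v', only +/-11.5 degree arcs (23 degree sectors) about
    the horizontal or vertical frequency axis enter the average."""
    M = P2.shape[0]
    m, n = np.meshgrid(np.arange(-M // 2, M // 2), np.arange(-M // 2, M // 2),
                       indexing="ij")
    r = np.rint(np.hypot(m, n)).astype(int)          # [.] = nearest integer
    keep = np.ones(P2.shape, bool)
    if sector is not None:
        ang = np.degrees(np.arctan2(m, n))           # angle from horizontal axis
        ang = np.abs((ang + 90.0) % 180.0 - 90.0)    # fold to [0, 90] degrees
        keep = ang <= half_angle if sector == "h" else ang >= 90.0 - half_angle
    return np.array([P2[keep & (r == fs)].mean() if (keep & (r == fs)).any()
                     else 0.0 for fs in range(M // 2)])

# A flat spectrum averages to 1 on every ring.
flat = np.ones((64, 64))
print(radial_average(flat)[:4])   # -> [1. 1. 1. 1.]
```

Averaging rather than summing on each ring keeps the curve comparable across radii, since outer rings contain more pixels than inner ones.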
Appendix D
Image analysis
Here, we define the signal \mathcal{S}(f_s) as the power spectrum of the average of N filtered images:

\mathcal{S}(f_s) = \left\langle |\hat{S}_{m,n}|^2 \right\rangle_{f_s = \left[ \sqrt{m^2 + n^2} \right]},   (D1)

where [·] denotes the nearest-integer function and \hat{S}_{m,n} is the Fourier transform of

S_{k,l} = \frac{1}{N} \sum_{i=1}^{N} h^{(i)}_{k,l,0},   (D2)

with h^{(i)}_{k,l,n} denoting the ith instance of a filtered image sequence (cf. Equation 5). In the same way we calculate the noise \mathcal{N}(f_s), that is, as the power spectrum of the standard deviation

N_{k,l} = \sqrt{ \frac{1}{N-1} \sum_{i=1}^{N} \left( S_{k,l} - h^{(i)}_{k,l,0} \right)^2 }.   (D3)

In all examples we used N = 10 samples for the calculation of the signal-to-noise ratio SNR(f_s) = \mathcal{S}(f_s)/\mathcal{N}(f_s). 
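The pipeline of Equations D1–D3 can be sketched end to end; the helper names are ours, and the Hann window and ring average follow Appendix C:

```python
import numpy as np

def radial_power(img):
    """Radially averaged, Hann-windowed power spectrum of an M x M image
    (cf. Appendix C and Equation C1)."""
    M = img.shape[0]
    win = np.outer(np.hanning(M), np.hanning(M))
    P2 = np.abs(np.fft.fftshift(np.fft.fft2(img * win))) ** 2
    m, n = np.meshgrid(np.arange(-M // 2, M // 2), np.arange(-M // 2, M // 2),
                       indexing="ij")
    r = np.rint(np.hypot(m, n)).astype(int)
    return np.array([P2[r == fs].mean() for fs in range(M // 2)])

def snr_curve(instances):
    """SNR(f_s) = S(f_s)/N(f_s) from N noise-corrupted, filtered copies of one
    frame (shape (N, M, M)): signal = power spectrum of the pixelwise mean
    (Equations D1, D2), noise = power spectrum of the pixelwise standard
    deviation (Equation D3)."""
    S = instances.mean(axis=0)                  # Equation D2
    Nsd = instances.std(axis=0, ddof=1)         # Equation D3 (sample std)
    return radial_power(S) / radial_power(Nsd)

# Toy check with N = 10 samples: weaker noise gives a higher SNR.
rng = np.random.default_rng(1)
frame = rng.random((64, 64))
weak = frame + 0.05 * rng.standard_normal((10, 64, 64))
strong = frame + 0.5 * rng.standard_normal((10, 64, 64))
print(bool(snr_curve(weak)[1] > snr_curve(strong)[1]))   # -> True
```

Averaging N instances estimates the noiseless signal, while their pixelwise spread estimates the noise, so the ratio of the two power spectra directly yields SNR(f_s).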
Acknowledgments
We thank Dr. Jamie Carroll Theobald for valuable and inspiring discussions. A.K. is grateful for the financial support given by the German Academic Exchange Service (DAAD). E.J.W. is grateful for the ongoing support of the Swedish Research Council (Vetenskapsrådet), the Royal Physiographic Society of Lund, and the Air Force Office of Scientific Research (AFOSR). 
Commercial relationships: none. 
Corresponding author: Andreas Klaus. 
Email: Andreas.Klaus@ki.se. 
Address: Department of Neuroscience, Karolinska Institute, Retzius väg 8, 17177 Stockholm, Sweden. 
References
Balboa, R. M., & Grzywacz, N. M. (2003). Power spectra and distribution of contrasts of natural images from different habitats. Vision Research, 43, 2527–2537.
Barlow, H. (2001). Redundancy reduction revisited. Network, 12, 241–253.
Barlow, H. B., Fitzhugh, R., & Kuffler, S. W. (1957). Change of organization in the receptive fields of the cat's retina during dark adaptation. The Journal of Physiology, 137, 338–354.
Barlow, R. B., Jr., Bolanowski, S. J., Jr., & Brachman, M. L. (1977). Efferent optic nerve fibers mediate circadian rhythms in the Limulus eye.
Barlow, R. B., Jr., Kaplan, E., Renninger, G. H., & Saito, T. (1987). Circadian rhythms in Limulus photoreceptors.
Barlow, R. B., & Powers, M. K. (2003). Seeing at night and finding mates: The role of vision. In C. N. Shuster, R. B. Barlow, & H. J. Brockmann (Eds.), The American horseshoe crab (pp. 83–102). Cambridge, MA: Harvard University Press.
Batra, R., & Barlow, R. B., Jr. (1982). Efferent control of pattern vision in Limulus lateral eye.
Batra, R., & Barlow, R. B., Jr. (1990). Efferent control of temporal response properties of the Limulus lateral eye.
Bloomfield, S. A. (1994). Orientation-sensitive amacrine and ganglion cells in the rabbit retina. Journal of Neurophysiology, 71, 1672–1691.
Borghuis, B. G., Ratliff, C. P., Smith, R. G., Sterling, P., & Balasubramanian, V. (2008). Design of a neuronal array. Journal of Neuroscience, 28, 3178–3189.
Coppola, D. M., Purves, H. R., McCoy, A. N., & Purves, D. (1998). The distribution of oriented contours in the real world. Proceedings of the National Academy of Sciences of the United States of America, 95, 4002–4006.
de Ruyter van Steveninck, R. R., Bialek, W., Potters, M., & Carlson, R. H. (1994). Statistical adaptation and optimal estimation in movement computation by the blowfly visual system. In IEEE International Conference on Systems, Man, and Cybernetics: ‘Humans, information and technology’ (Vol. 1, pp. 302–307). San Antonio, TX: IEEE.
de Vries, H. L. (1943). The quantum character of light and its bearing upon threshold of vision, the differential sensitivity and visual acuity of the eye. Physica, 10, 553–564.
de Vries, S. H., & Baylor, D. A. (1997). Mosaic arrangement of ganglion cell receptive fields in rabbit retina. Journal of Neurophysiology, 78, 2048–2060.
Dong, D. W., & Atick, J. J. (1995). Statistics of natural time-varying images. Network: Computation in Neural Systems, 6, 345–358.
Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4, 2379–2394.
Frederiksen, R., & Warrant, E. J. (2008). Visual sensitivity in the crepuscular owl butterfly Caligo memnon and the diurnal blue morpho Morpho peleides: A clue to explain the evolution of nocturnal apposition eyes? Journal of Experimental Biology, 211, 844–851.
Frederiksen, R., Wcislo, W. T., & Warrant, E. J. (2008). Visual reliability and information rate in the retina of a nocturnal bee. Current Biology, 18, 349–353.
Greiner, B. (2006). Adaptations for nocturnal vision in insect apposition eyes. International Review of Cytology, 250, 1–46.
Greiner, B., Ribi, W. A., & Warrant, E. J. (2004). Retinal and optical adaptations for nocturnal vision in the halictid bee Megalopta genalis.
Greiner, B., Ribi, W. A., & Warrant, E. J. (2005). A neural network to improve dim-light vision? Dendritic fields of first-order interneurons in the nocturnal bee Megalopta genalis.
Greiner, B., Ribi, W. A., Wcislo, W. T., & Warrant, E. J. (2004). Megalopta genalis.
Hansen, B. C., & Essock, E. A. (2004). A horizontal bias in human visual processing of orientation and its correspondence to the structural components of natural scenes. Journal of Vision, 4(12):5, 1044–1060, doi:10.1167/4.12.5.
Hartline, H. K., Wagner, H. G., & Ratliff, F. (1956). Inhibition in the eye of Limulus. The Journal of General Physiology, 39, 651–673.
Hemilä, S. Lerber, T. Donner, K. (1998). Noise-equivalent and signal equivalent visual summation of quantal events in space and time. Visual Neuroscience, 15, 731–742. [PubMed] [CrossRef] [PubMed]
Hosoya, T. Baccus, S. A. Meister, M. (2005). Dynamic predictive coding by the retina. Nature, 436, 71–77. [PubMed] [CrossRef] [PubMed]
Howard, J. (1981).Temporal resolving power of the photoreceptors of Locusta migratoria..
Howard, J. Dubs, A. Payne, R. (1984). The dynamics of phototransduction in insects—A comparative study. Journal of Comparative Physiology A: Sensory, Neural, and Behavioral Physiology, 154, 707–718. [CrossRef]
Jander, U. Jander, R. (2002). Allometry and resolution of bee eyes (Apoidea. Arthropod Structure & Development, 30, 179–193. [PubMed] [CrossRef] [PubMed]
Juusola, M. French, A. S. (1997). Visual acuity for moving objects in first- and second-order neurons of the fly compound eye. Journal of Neurophysiology, 77, 1487–1495. [PubMed] [Article] [PubMed]
Kirschfeld, K. (1974). The absolute sensitivity of lens and compound eyes. Zeitschrift für Naturforschung C: Biosciences, 29, 592–596. [PubMed]
Land, M. F. Autrum, H. J. (1981). Optics and vision in invertebrates. Handbook of sensory physiology, VII-6B. (pp. 471–592). Berlin, Germany: Springer.
Land, M. F. (1997). Visual acuity in insects. Annual Review of Entomology, 42, 147–177. [PubMed] [CrossRef] [PubMed]
Land, M. F. Eckert, H. (1985). Maps of the acute zones of fly eyes. Journal of Comparative Physiology A: Sensory, Neural, and Behavioral Physiology, 156, 525–538. [CrossRef]
Land, M. F. Nilsson, D. (2002). Animal eyes. New York: Oxford University Press.
Laughlin, S. B. (1994). Matching coding, circuits, cells and molecules to signals: General principles of retinal design in the fly's eye. Progress in Retinal and Eye Research, 13, 165–196. [CrossRef]
Laughlin, S. B. de Ruyter van Steveninck, R. R. Anderson, J. C. (1998). The metabolic cost of neural information. Nature Neuroscience, 1, 36–41. [PubMed] [CrossRef] [PubMed]
Nilsson, D. Stavenga, D. G. Hardie, R. C. (1989). Optics and evolution of the compound eye. Facets of vision. (pp. 30–73). Berlin, Germany: Springer.
Niven, J. E. Andersson, J. C. Laughlin, S. B. (2007). Fly photoreceptors demonstrate energy-information trade-offs in neural coding. PLoS Biology, 5,
Packer, O. Hendrickson, A. E. & Curcio, C. A. (1989). Photoreceptor topography of the retina in the adult pigtail macaque (Macaca nemestrina). Journal of Comparative Neurology, 288,165–183. [PubMed]
Payne, R. Howard, J. (1981). Response of an insect photoreceptor: A simple log-normal model. Nature, 290, 415–416. [CrossRef]
Press, W. H. Teukolsky, S. A. Vetterling, W. T. Flannery, B. P. (2007). Numerical recipes: The art of scientific computing. New York: Cambridge University Press.
Renninger, G. H., Barlow, R. B. (1979).Lateral inhibition, excitation, and the circadian rhythm of the Limulus compound eye..
Rose, A. (1942). The relative sensitivities of television pickup tubes, photographic film, and the human eye. Proceedings of the IRE, 30, 293–300. [CrossRef]
Roubik, D. W. (1989). Ecology and natural history of tropical bees. Cambridge: Cambridge University Press.
Simoncelli, E. P. Olshausen, B. A. (2001). Natural image statistics and neural representation. Annual Review of Neuroscience, 24, 1193–1216. [PubMed] [CrossRef] [PubMed]
Smirnakis, S. M. Berry, M. J. Warland, D. K. Bialek, W. Meister, M. (1997). Adaptation of retinal processing to image contrast and spatial scale. Nature, 386, 69–73. [PubMed] [CrossRef] [PubMed]
Smith, R. G. (1995). Simulation of an anatomically defined local circuit: The cone-horizontal cell network in cat retina. Visual Neuroscience, 12, 545–561. [PubMed] [CrossRef] [PubMed]
Snyder, A. W. (1979). Physics of vision in compound eyes. Handbook of Sensory Physiology, 7, 225–313.
Srinivasan, M. V. Bernard, G. D. (1975). The effect of motion on visual acuity of compound eye: A theoretical analysis. Vision Research, 15, 515–525. [PubMed] [CrossRef] [PubMed]
Srinivasan, M. V. Laughlin, S. B. Dubs, A. (1982). Predictive coding: A fresh view of inhibition in the retina. Proceedings of the Royal Society of London B: Biological Sciences, 216, 427–459. [PubMed] [CrossRef]
Stavenga, D. G. Kuiper, J. W. (1977). Insect pupil mechanisms I On the pigment migration in the retinula cells of Hymenoptera (suborder Apocrita. Journal of Comparative Physiology, 113, 55–72. [CrossRef]
Sterling, P. Demb, J. B. Shepherd, G. M. (2004). Retina. The synaptic organization of the brain. (pp. 217–269). New York: Oxford University Press.
Theobald, J. C. Greiner, B. Wcislo, W. T. Warrant, E. J. (2006). Visual summation in night-flying sweat bees: A theoretical study. Vision Research, 46, 2298–2309. [PubMed] [CrossRef] [PubMed]
Tsukamoto, Y. Smith, R. G. Sterling, P. (1990). “Collective coding” of correlated cone signals in the retinal ganglion cell. Proceedings of the National Academy of Sciences of the United States of America, 87, 1860–1864. [PubMed] [Article] [CrossRef] [PubMed]
van der Schaaf, A. van Hateren, J. H. (1996). Modelling the power spectra of natural images: Statistics and information. Vision Research, 36, 2759–2770. [PubMed] [CrossRef] [PubMed]
van Hateren, J. H. (1992a). A theory of maximizing sensory information. Biological Cybernetics, 68, 23–29. [PubMed] [CrossRef]
van Hateren, J. H. (1992b). Real and optimal neural images in early vision. Nature, 360, 68–70. [PubMed] [CrossRef]
van Hateren, J. H. (1992c). Theoretical predictions of spatiotemporal receptive fields of fly LMCs, and experimental validation. Journal of Comparative Physiology A, 171, 157–170. [CrossRef]
van Hateren, J. H. van der Schaaf, A. (1998). Independent component filters of natural images compared with simple cells in primary visual cortex. Proceedings of the Royal Society of London B: Biological Sciences, 265, 359–366. [PubMed] [Article] [CrossRef]
Warrant, E. J. (1999). Seeing better at night: Life style, eye design and the optimum strategy of spatial and temporal summation. Vision Research, 39, 1611–1630. [PubMed] [CrossRef] [PubMed]
Warrant, E. J. (2004). Vision in the dimmest habitats on earth. Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology, 190, 765–789. [PubMed] [CrossRef]
Warrant, E. J. (2007). Nocturnal bees. Current Biology, 17, R991–R992. [PubMed] [Article] [CrossRef] [PubMed]
Warrant, E. J. Albright, T. Masland, R. H. (2008a). Nocturnal vision. The Senses: A comprehensive reference. (2, pp. 53–86). Oxford: Academic Press.
Warrant, E. J. (2008b). Seeing in the dark: Vision and visual behaviour in nocturnal bees and wasps. Journal of Experimental Biology, 211, 1737–1746. [PubMed] [Article] [CrossRef]
Warrant, E. J. Kelber, A. Gislén, A. Greiner, B. Ribi, W. Wcislo, W. T. (2004). Nocturnal vision and landmark orientation in a tropical halictid bee. Current Biology, 14, 1309–1318. [PubMed] [Article] [CrossRef] [PubMed]
Warrant, E. J. Porombka, T. Kirchner, W. H. (1996). Neural image enhancement allows honeybees to see at night. Proceedings of the Royal Society of London B: Biological Sciences, 263, 1521–1526. [CrossRef]
Wcislo, W. T. Arneson, L. Roesch, K. Gonzalez, V. Smith, A. Fernandez, H. (2004).The evolution of nocturnal behaviour in sweat bees, Megalopta genalis and M..
Figure 1
 
Compound eyes. (A) A schematic longitudinal section (and an inset of a transverse section) through a generalized Hymenopteran ommatidium (from an apposition eye, see B), showing the corneal lens (c), the crystalline cone (cc), the primary pigment cells (pc), the secondary pigment cells (sc), the rhabdom (rh), the retinula cells (rc), the basal pigment cells (bp), and the basement membrane (bm). The upper half of the ommatidium shows screening pigment granules in the dark-adapted state, while the lower half shows them in the light-adapted state. Redrawn from Stavenga and Kuiper (1977), with permission. (B) A focal apposition compound eye. Light reaches the photoreceptors exclusively from the small corneal lens (co) located directly above, within the same ommatidium. This eye design is typical of day-active insects. cc = crystalline cone. (C) A refracting superposition compound eye. A large number of corneal facets, and bullet-shaped crystalline cones of circular cross-section (inset), collect and focus light across the clear zone of the eye (c.z.) toward single photoreceptors in the retina. Several hundreds, or even thousands, of facets service a single photoreceptor. Not surprisingly, many nocturnal and deep-sea animals have refracting superposition eyes and benefit from the significant improvement in sensitivity. (B) and (C) Courtesy of Nilsson (1989).
Figure 2
 
Modeling of image motion and light intensity. (A) Original image data of size 1536 × 1024 pixels (width × height) were taken from the natural image collection of van Hateren and van der Schaaf (1998). For stimulus creation and data analysis, square subimages of size M × M are considered. (B) A moving stimulus is created by horizontally shifting the array of visual channels (white square in A) over the image. The result is a sequence of image frames (t = 0, 1, 2,…). The time between two subsequent image frames was chosen to be 10 ms. (C) Superimposing Poisson noise results in a moving stimulus that resembles the scene gathered by the visual channels at a given light intensity.
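The stimulus-generation procedure described in this caption (an array of visual channels shifted horizontally over a natural image in 10-ms steps, with Poisson photon noise superimposed) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the square-window geometry, and the `mean_photons` parameter (mean photon count per channel per frame) are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_noisy_sequence(image, M, alpha_px, n_frames, mean_photons):
    """Shift an M x M window horizontally across `image` (one step per
    10-ms frame, as in the caption) and superimpose Poisson photon noise."""
    frames = []
    for t in range(n_frames):
        x0 = int(round(t * alpha_px))                 # horizontal shift in pixels
        window = image[:M, x0:x0 + M].astype(float)
        # Scale the window so its mean equals the target photon rate, then
        # draw Poisson counts -- the quantum fluctuations that limit the
        # signal-to-noise ratio in dim light.
        rate = window * (mean_photons / window.mean())
        frames.append(rng.poisson(rate))
    return np.stack(frames)

scene = rng.random((128, 256)) + 0.1   # stand-in for a natural image
seq = make_noisy_sequence(scene, M=128, alpha_px=1.5, n_frames=20, mean_photons=5.0)
```

Lowering `mean_photons` reproduces the progressively noisier frames shown at lower light intensities.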
Figure 3
 
Image data of size 128 × 128 pixels. (A) An isotropic image and its one-dimensional and normalized power spectra. (For the calculation of the one-dimensional power spectrum, see 3.) Note that there is no distinct difference between the power in horizontal and vertical directions. (B) An anisotropic image and its one-dimensional and normalized power spectra with a considerable difference between horizontal and vertical power over a wide range of spatial frequencies.
Figure 4
 
Optimal spatial receptive fields at various light intensities (A–C) for an isotropic scene and (D–F) for an anisotropic scene (α = 1.5 receptor widths per 10 ms). (A) Single image frames at various light intensities for the isotropic case (each image frame corresponds to a 10-ms sample, size 128 × 128 pixels). (B) An array of 51 × 51 visual channels and their spatial contribution during summation. The sum of all weighting coefficients of the receptive field is normalized to unity. (C) Optimally spatiotemporally filtered image frames from (A). The images are derived by filtering the whole noisy image sequence g_{k,l,t}(α, I) with the optimal spatial receptive field G_{k,l} and the optimal temporal response function V_t (cf. Equation 5). (D–F) The same as (A)–(C) but for an anisotropic scene. The resulting spatial receptive fields show a distinct anisotropy at low light intensities.
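The filtering step described in this caption, applying a spatial receptive field and a temporal response function to the noisy sequence, can be sketched with separable stand-in kernels. The paper derives the optimal G and V numerically; here a Gaussian spatial kernel and a boxcar temporal kernel are assumed purely for illustration, and all names and parameterizations are hypothetical.

```python
import numpy as np

def gauss_kernel(half_width):
    """1-D Gaussian whose full width at half maximum is 2 * half_width
    (an illustrative reading of the receptive-field half-width)."""
    sigma = max(half_width, 1e-6) / np.sqrt(2.0 * np.log(2.0))
    n = max(int(6 * sigma) | 1, 3)          # odd length, covers about +/- 3 sigma
    x = np.arange(n) - n // 2
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def filter_sequence(seq, rho_v, rho_h, n_t):
    """Separable spatiotemporal filter for a (T, H, W) sequence: Gaussian
    spatial summation with half-widths (rho_v, rho_h) and boxcar temporal
    integration over n_t frames -- simple stand-ins for the optimal
    receptive field and temporal response function."""
    out = np.apply_along_axis(np.convolve, 1, seq.astype(float), gauss_kernel(rho_v), mode="same")
    out = np.apply_along_axis(np.convolve, 2, out, gauss_kernel(rho_h), mode="same")
    out = np.apply_along_axis(np.convolve, 0, out, np.ones(n_t) / n_t, mode="same")
    return out

noisy = np.random.default_rng(1).poisson(5.0, (20, 32, 32)).astype(float)
filtered = filter_sequence(noisy, rho_v=1.5, rho_h=2.5, n_t=3)
```

Making `rho_h` larger than `rho_v` mimics the anisotropic summation in panels D–F, where pooling extends further along the direction of higher image correlation.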
Figure 5
 
Optimum spatiotemporal receptive field parameters for α = 1.5. (A) Isotropic image sequence (as used in Figure 4A). (B) Anisotropic image sequence (as used in Figure 4D). The dashed lines correspond to no spatial or temporal pooling, i.e., summation over only a single visual channel and/or a single image frame (10 ms).
Figure 6
 
Optimum spatiotemporal receptive field parameters for the anisotropic scene at various light intensities I and image velocities α (α = 0.1, 0.2, …, 4.0 visual channel widths per 10 ms). (A) Vertical half-width Δρ_v. (B) Horizontal half-width Δρ_h. (C) Temporal integration time Δt.
Figure 7
 
Optimal spatiotemporal filtering and visual performance for an isotropic scene (α = 1.5). (A) Signal-to-noise ratio for the isotropic scene (Figure 3A) filtered by optimal spatiotemporal receptive fields. (B) Ratio between the signal-to-noise ratios of the optimally filtered and the unfiltered image sequences. A decreasing cut-off frequency at lower light intensities indicates that SNR_opt(f_s) is improved at lower spatial frequencies f_s at the cost of SNR_opt(f_s) at higher frequencies f_s. (Note that at log I = −0.5 the increase in SNR_opt(f_s) and SNR_opt(f_s)/SNR_unfilt(f_s) at high frequencies is an artifact that originates from the Hann window.)
Figure 8
 
Different degrees of filtering at log I = −0.2, α = 1.5. (A) Image frames of 10 ms, from left to right: unfiltered, too little filtering (½Δρ_v,h, ½Δt), optimal filtering (Δρ_v,h, Δt), and too much filtering (2Δρ_v,h, 2Δt). Here, Δρ_v,h and Δt denote the optimum spatiotemporal parameters at log I = −0.2 and for α = 1.5 channel widths per 10 ms (cf. Equations 6 and 7). (B) The corresponding signal-to-noise ratio curves in the vertical and horizontal directions (SNR_v and SNR_h, respectively). The optimum spatiotemporal receptive field preserves finer image details in the horizontal direction, as indicated by a higher cut-off frequency. The cut-off frequencies are 19 (too little), 13 (optimal), and 9 cycles/image (too much filtering) for SNR_v, and 44 (too little), 34 (optimal), and 22 cycles/image (too much filtering) for SNR_h.
Figure C1
 
Calculation of the one-dimensional power spectrum. (A) A grayscale image with Hann window. (B) The corresponding two-dimensional, logarithmically scaled power spectrum with zero frequency at the center pixel. To calculate the power at a certain frequency f_s, the (unscaled) two-dimensional power is averaged around a circle with radius f_s (360°, light blue). To calculate the power in horizontal and vertical directions, the two-dimensional power is averaged only over 23° sectors of a circle (light green and red arcs, respectively). Note that the two-dimensional power spectrum is logarithmically scaled for visualization only and averages are calculated from the unscaled power spectrum. (C) The one-dimensional power spectrum with zero frequency (f_s = 0) scaled to unity. Asterisks indicate the frequencies as used in (B) for illustration.
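The circular-averaging procedure described in this caption can be sketched as follows. The Hann window, the averaging around circles of radius f_s, and the scaling of the zero-frequency power to unity follow the caption; the function name and the choice of integer-radius bins are assumptions of the sketch.

```python
import numpy as np

def radial_power_spectrum(img):
    """One-dimensional power spectrum as in Figure C1: apply a 2-D Hann
    window, take the squared magnitude of the 2-D FFT, then average the
    (unscaled) power around circles of radius f_s about the zero-frequency
    pixel. The result is scaled so that the power at f_s = 0 is unity."""
    h, w = img.shape
    windowed = img * np.outer(np.hanning(h), np.hanning(w))
    power = np.abs(np.fft.fftshift(np.fft.fft2(windowed))) ** 2
    # integer radius of each frequency pixel from the center (zero frequency)
    yy, xx = np.indices(power.shape)
    r = np.rint(np.hypot(yy - h // 2, xx - w // 2)).astype(int)
    n = min(h, w) // 2
    spectrum = np.array([power[r == f].mean() for f in range(n)])
    return spectrum / spectrum[0]

img = np.random.default_rng(0).random((128, 128))
p = radial_power_spectrum(img)   # p[0] == 1 by construction
```

The directional (horizontal and vertical) spectra in Figure 3 would restrict the same average to sectors around the respective frequency axes, as the caption's 23° arcs do.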
Supplementary Figure 1
Supplementary Figure 2
Supplementary Figure 3
Supplementary Figure 4