**An influential theory of the function of early processing in the visual cortex is that it forms an efficient coding of ecologically valid stimuli. In particular, correlations and differences between visual signals from the two eyes are believed to be of great importance in solving both depth from disparity and binocular fusion. Techniques such as independent-component analysis have been developed to learn efficient codings from natural images; these codings have been found to resemble receptive fields of simple cells in V1. However, the extent to which this approach provides an explanation of the functionality of the visual cortex is still an open question. We compared binocular independent components with physiological measurements and found a broad range of similarities along with a number of key differences. In common with physiological measurements, we found components with a broad range of both phase- and position-disparity tuning. However, we also found a larger population of binocularly anticorrelated components than have been found physiologically. We found components focused narrowly on detecting disparities proportional to half-integer multiples of wavelength rather than the range of disparities found physiologically. We present the results as a detailed analysis of phase and position disparities in Gabor-like components generated by independent-component analysis trained on binocular natural images and compare these results to physiology. We find strong similarities between components learned from natural images, indicating that ecologically valid stimuli are important in understanding cortical function, but with significant differences that suggest that our current models are incomplete.**

^{1}These methods will have significant effects on the statistics of binocular images. For example, the convergence of the cameras determines how the disparity statistics will vary as a function of the image location. The details of the image-capture process are therefore repeated here. Images were captured using two Nikon Coolpix 4500 digital cameras, harnessed in a purpose-built mount that allowed the intercamera separation, and the orientation of each camera about a vertical axis, to be manipulated. This is a simplification of the situation for human binocular vision, in which there are potentially three degrees of freedom for each eye (rotations about horizontal and vertical axes, as well as the line of sight). The analyses presented here focus on situations in which vergence is approximately symmetrical and elevation is close to zero. In this case, the expected cyclovergence, which is not possible in the camera setup used, is negligible (e.g., Porrill, Ivins, & Frisby, 1999). In all cases, an intercamera separation of 65 mm (representative of the typical human interocular separation) was used. The cameras were oriented so that the same point in the scene projected to the center of each camera's image, so as to mimic the typical human fixation strategy.

1/*f*^{α} curve (where *f* is the frequency and *α* is a constant); whitening acts to flatten this curve, normalizing the responses at each frequency. As the signal strength is modulated by 1/*f*^{α} and noise strength is uniform in the frequency domain (assuming Gaussian white noise), noise dominates the signal at high frequencies (Atick & Redlich, 1992). We applied low-pass filtering to remove the higher, noise-dominated frequencies by truncating the PCA model. A similar role in noise reduction at high frequencies has been proposed for the retina (Atick & Redlich, 1992). We found that 200 eigenvectors generated by PCA on the image patches explained a mean of 86.7% of the variance in the image patches. ICA is performed in the ℝ^{200} space generated by the eigenvectors. Components were converted back to the image space by applying the inverse of the whitening matrix.
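The whitening and truncation steps can be written compactly. The following is a sketch in standard PCA notation, assuming zero-mean patches; the symbols $C$, $E_k$, $D_k$, $V$, and $k$ are labels introduced here for illustration rather than the paper's own notation, and because the model is truncated, the "inverse" of the whitening matrix is assumed to be its pseudo-inverse.

```latex
\begin{align}
C &= \mathbb{E}\!\left[\mathbf{x}\mathbf{x}^{\top}\right] = E D E^{\top}
  && \text{(eigendecomposition of the patch covariance)}\\
V &= D_k^{-1/2} E_k^{\top}
  && \text{(whitening matrix from the $k = 200$ leading eigenvectors)}\\
\mathbf{z} &= V\mathbf{x} \in \mathbb{R}^{200}
  && \text{(whitened, low-pass-filtered patch)}\\
V^{+} &= E_k D_k^{1/2}
  && \text{(maps learned components back to image space)}
\end{align}
```

Under these assumptions $V V^{+} = I$, so a component learned in the 200-dimensional space maps to a unique image-space pattern within the span of the retained eigenvectors.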

*L*^{2}-norm between the function and the component. The 2-D Gabor function is defined as the product of the wave-generating function *c*(*x*, *y*, *f*, *ϕ*, *θ*) and the windowing function *w*(*x*, *y*, *σ*_{w}, *σ*_{h}, *ψ*) that constrains it in window space. The wave-generating function describes a cosine pattern with frequency *f* and phase *ϕ*; this pattern is rotated about the origin by an angle *θ*. The windowing function constrains the image-space span of the wave-generating function to a Gaussian window of width *σ*_{w} and height *σ*_{h}; this window is rotated independently of the wave function by *ψ*. Previous authors have fixed *ψ* = *θ*, such that the windowing function rotates with the wave-generating function (Prince, Cumming, & Parker, 2002). However, we have removed this constraint to allow the Gabor fitting function to describe a greater range and variety of Gabor-like components. All the Gabor functions are centered at 0 and generated over a two-dimensional image whose extent in *x* and *y* is set by the size of the component patch.
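The Gabor function described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB implementation: the function names `gabor` and `gabor_patch`, the sign convention for the phase (added inside the cosine), the unit-amplitude Gaussian, and the example parameter values are all assumptions made here.

```python
import math

def gabor(x, y, f, phi, theta, sigma_w, sigma_h, psi):
    """Value at (x, y) of a Gabor whose window rotates independently of its wave."""
    # Wave-generating function c: cosine of frequency f and phase phi,
    # rotated about the origin by theta (assumed sign convention: + phi).
    u = x * math.cos(theta) + y * math.sin(theta)
    c = math.cos(2.0 * math.pi * f * u + phi)
    # Windowing function w: Gaussian of width sigma_w and height sigma_h,
    # rotated by psi, which is not tied to theta.
    xw = x * math.cos(psi) + y * math.sin(psi)
    yw = -x * math.sin(psi) + y * math.cos(psi)
    w = math.exp(-0.5 * ((xw / sigma_w) ** 2 + (yw / sigma_h) ** 2))
    return c * w

def gabor_patch(n, **params):
    """Sample the Gabor on an n-by-n grid centred on 0 (returned as a list of rows)."""
    half = (n - 1) / 2.0
    return [[gabor(col - half, row - half, **params) for col in range(n)]
            for row in range(n)]

# Hypothetical example: a 25 x 25 patch with the window rotated 45 degrees
# relative to the wave.
patch = gabor_patch(25, f=0.1, phi=0.0, theta=0.0,
                    sigma_w=4.0, sigma_h=6.0, psi=math.pi / 4)
```

Setting `psi` equal to `theta` recovers the constrained form in which the window rotates with the wave.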

*h* and *v*. The equation also gains a scaling parameter, *s*, that models the amplitude of the Gabor function. The parameters of the model were fitted to the data using the Nelder–Mead simplex method (Nelder & Mead, 1965) initialized with a genetic algorithm, using MATLAB's implementation.

*θ*) by *π* radians is equivalent to reflecting the phase *ϕ* about 0, i.e., *G*(…; *θ*, ., *ϕ*, ., ., .) = *G*(…; *θ* − *π*, ., −*ϕ*, ., ., .). The other parameters have been omitted here for clarity.
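The equivalence follows directly from the form of the wave-generating function. As a short worked check, assume the cosine is written with the phase added inside the argument (a convention chosen here for illustration; the identity holds equally if the phase is subtracted):

```latex
\begin{align}
c(x, y; f, \phi, \theta) &= \cos\!\big(2\pi f\,(x\cos\theta + y\sin\theta) + \phi\big),\\
c(x, y; f, -\phi, \theta - \pi)
  &= \cos\!\big(-2\pi f\,(x\cos\theta + y\sin\theta) - \phi\big)\\
  &= \cos\!\big(2\pi f\,(x\cos\theta + y\sin\theta) + \phi\big)
   = c(x, y; f, \phi, \theta).
\end{align}
```

The second line uses $\cos(\theta - \pi) = -\cos\theta$ and $\sin(\theta - \pi) = -\sin\theta$, and the third uses the evenness of the cosine. The windowing function depends on neither $\theta$ nor $\phi$, so it is unchanged.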

*ϕ*, the position shift by varying *v* and *h* in the direction parallel to the wave-generating function. Phase shifts can be converted to position shifts, and vice versa, using Equation 6, in which the magnitude of the shift vector is taken and the cos and sin terms rotate the shifts into the orientation of the wave-generating function. Equation 6 is derived from the well-known Fourier shift theorem. While Equation 6 maps phase and position in the wave-generating function *c*, phase- and position-shifted Gabor functions differ in terms of the windowing function *w*. For example, an even-phase Gabor phase-shifted by *π*/2 radians will become odd, but an even-phase Gabor function shifted in position by an amount equivalent to *π*/2 radians (by Equation 6) will still be even phase.
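The conversion between phase and position shifts can be illustrated with the Fourier shift theorem applied to the wave-generating function alone. The sketch below is not a reconstruction of Equation 6: the function names, the phase sign convention (phase added inside the cosine), and the wrapping of the phase to [−π, π] are assumptions made for illustration.

```python
import math

def wave(x, y, f, phi, theta):
    """Wave-generating function: cosine of frequency f, phase phi, orientation theta."""
    u = x * math.cos(theta) + y * math.sin(theta)
    return math.cos(2.0 * math.pi * f * u + phi)

def phase_to_position(dphi, f, theta):
    """Translation (dx, dy) of the wave that is equivalent to adding dphi to its phase."""
    dphi = math.atan2(math.sin(dphi), math.cos(dphi))  # wrap to [-pi, pi]
    # Translating by d along the wave direction changes the phase by -2*pi*f*d.
    d = -dphi / (2.0 * math.pi * f)
    return d * math.cos(theta), d * math.sin(theta)

def position_to_phase(dx, dy, f, theta):
    """Phase change equivalent to translating the wave by (dx, dy)."""
    along = dx * math.cos(theta) + dy * math.sin(theta)
    return -2.0 * math.pi * f * along

# Hypothetical example: a quarter-cycle phase shift at f = 0.1 cycles/pixel.
dx, dy = phase_to_position(math.pi / 2, f=0.1, theta=0.3)
```

The equivalence holds only for the unwindowed wave. Once the Gaussian window is applied, the translated Gabor carries its window with it, whereas the phase-shifted Gabor keeps its window in place, which is why the two kinds of shift produce different receptive-field profiles.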

**Figure 1**

**Table 1**

*f* is the radius (frequency) and *θ* is the angle of the polar coordinates. The symbols *f*_{0} and *f*_{σ} are, respectively, the principal frequency and the bandwidth of the frequency component (Fischer et al., 2007); *θ*_{0} is the principal orientation; and *θ*_{σ} is the orientation bandwidth. Log-Gabor functions have some significant advantages over standard Gabor functions. The responses of Gabor functions depend on the mean luminance of the stimulus, whereas the responses of log-Gabor functions do not. Log-Gabor functions also have a long tail in frequency space, which more closely matches observations in primates (Hawken & Parker, 1987). However, for this study log-Gabor functions have two significant disadvantages. Firstly, most studies that have carried out physiological measurements have fitted Gabor functions to the data, making log-Gabor functions less directly comparable to these data and to the standard binocular energy model. Secondly, log-Gabor functions do not possess a windowing function with a clearly defined center as a standard Gabor function does, rendering the analysis of position disparity more complex.

*ρ* was 0.99986). Differences between the two measures were standardized by dividing by the mean of both log-Gabor and Gabor errors, and thus differences are specified in terms of overall fitting error. The median difference between the Gabor and log-Gabor error measurements was 0.003. This is less than the estimated level of consistency in the fitting (0.005, see previously). Of the fitted components, 43.7% exhibited standardized differences in error of less than the estimated level of consistency. For 37.3% of fitted components, the Gabor function was slightly more accurate than the log-Gabor function (median standardized error = 0.015), and for 19.0% the log-Gabor functions were slightly more accurate than the Gabor functions (median standardized error = 0.054). We concluded that log-Gabor functions were as capable as Gabor functions of describing the ICA components; however, without an additional fitting stage in the spatial domain they cannot describe the position of the receptive field, which is important in our analysis.

*s* in Equation 5) between left and right component pairs. The larger of the two values was chosen as the denominator. The resulting ratio is directionless, with a ratio of 1 being a binocular component equally weighted in each eye and a ratio of 0 being a fully monocular component with no input from the contralateral eye.
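The ratio described above is simple to compute from the two fitted amplitudes. The sketch below is illustrative only: the function name is hypothetical, and taking absolute values of the amplitudes is an assumption made so that the ratio stays in the range 0 to 1.

```python
def binocularity_ratio(s_left, s_right):
    """Smaller fitted amplitude divided by the larger, so the result lies in [0, 1]."""
    a, b = abs(s_left), abs(s_right)  # assumption: only amplitude magnitude matters
    larger = max(a, b)
    if larger == 0.0:
        raise ValueError("both amplitudes are zero; the ratio is undefined")
    return min(a, b) / larger
```

Because the larger value is always the denominator, swapping the left and right amplitudes gives the same result, which is what makes the measure directionless.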

**Figure 2**

**Figure 3**

*σ*_{w} against window height *σ*_{h} in terms of cycles in the wave-generating function. As the windows are rotated by *ψ*, the values of *σ*_{w} and *σ*_{h} do not conform to the *x*- and *y*-axes; the rotation is also independent of the rotation *θ* of the wave-generating function. The windows are biased towards oval shapes, and few are circular (circular windows lie on the dashed black line in the graph), but these ovals are not generally particularly elongated. Measuring the window size in terms of cycles also provides a useful indication of the bandwidth of frequency and orientation tuning (Ringach, 2002): a low value for the window size results in a broadband frequency-tuned component and a high value results in a narrowband frequency-tuned component. Similar logic applies to orientation tuning. The results show a strong tendency towards narrowband tuning, with values for the standard deviation of window size generally greater than 1 in one of the principal directions (either *σ*_{w} or *σ*_{h}) and generally around 0.5 in the other. As noted by Ringach (2002), this is a substantial deviation from physiology, as most cells observed in the V1 area of the macaque visual cortex have window sizes of less than 1 and are therefore much more broadly tuned in frequency and orientation than the components learned using ICA. The median frequency bandwidth of the components was 0.675 octaves (95% CI [0.673, 0.677]). The median frequency bandwidth for cells in the visual cortex of the macaque is higher than this, around 1.4 octaves (DeValois, Albrecht, & Thorell, 1982). It is worth noting, however, that because the image patches were preprocessed using PCA, which is also a bandwidth-limiting process, the narrowband tuning of the learned components is likely to be a result of band-pass filtering in the preprocessing stage.
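The link between window size in cycles and frequency bandwidth can be made concrete with a short calculation. The sketch below rests on assumptions that are not taken from the paper: the envelope is a Gaussian of standard deviation σ measured along the wave direction, the bandwidth is the full width at half amplitude of the Gabor's frequency response, and `octave_bandwidth` is a hypothetical name. The bandwidth definition used for the values reported above may differ.

```python
import math

def octave_bandwidth(cycles):
    """Half-amplitude frequency bandwidth (octaves) of a Gabor whose Gaussian
    envelope has a standard deviation of `cycles` wavelengths (sigma * f)."""
    # A spatial Gaussian with SD sigma has a Gaussian spectrum with SD
    # 1 / (2*pi*sigma); its half-height half-width is that SD times sqrt(2 ln 2).
    k = math.sqrt(2.0 * math.log(2.0))
    a = 2.0 * math.pi * cycles
    if a <= k:
        return math.inf  # the lower half-height frequency reaches zero
    return math.log2((a + k) / (a - k))

# Smaller windows (fewer cycles) give broader tuning.
narrow = octave_bandwidth(1.0)
broad = octave_bandwidth(0.5)
```

Under these assumptions the bandwidth falls monotonically as the window covers more cycles, which is the relationship the text relies on when it reads window size as an indicator of tuning width.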

*θ* is shown in Figure 3C. The black lines show the median of the distributions, with the 95% CIs shown as red bars. The orientations cover the range of possible values (0° to 180°), with a strong bias towards 90° and 0° (180° is equivalent to 0°). Although the distribution of edges in natural images is biased towards 0° and 90° (Hansen & Essock, 2004), it has been observed that ICA tends to produce results in which the orientation and frequency are aligned with the sampling grid (van der Schaaf & van Hateren, 1996). Consequently, we are not able to determine the extent to which these results are due to the prevalence of horizontal and vertical features in the binocular natural images or due to biases in the ICA algorithm.

*ϕ* of the fitted Gabor functions is shown in Figure 3D. Again, the medians of the distributions are shown as black lines and the 95% CIs in red. The phases are distributed approximately uniformly.

*r*^{2} = 0.99, 95% CI [0.993, 0.993], *p* < 0.001, 95% CI [<0.001, <0.001]. The spread (median absolute deviation) of orientation disparities is 0.0196 radians, 95% CI [0.02025, 0.02024]; the standard deviation is 0.086 radians, 95% CI [0.08122, 0.09395].

**Figure 4**

*r*^{2} = 0.98, 95% CI [0.982, 0.984], *p* ≤ 0.001, 95% CI [<0.001, <0.001]. As before, a minority of components do not fit the linear profile and thus appear as outliers in the plot. The vast majority of components are tuned to the same frequency in each view (see Figure 5B).

**Figure 5**

*π*] in Figure 6B. The plots show a strongly bimodal distribution of phase disparity, with peaks at 0 and *π* radians and troughs at *π*/2. The distribution is also asymmetric with a bias toward *π* phase components, indicating a bias in the ICA results towards antiphase components.

**Figure 6**

**Figure 7**

*r* = 0.028, *p* < 0.001, *n* = 37,028, and the mutual information is low (0.0846 bits, calculated using a 2-D histogram with 1,098 bins and base-2 logarithms), indicating that the distributions are independent.
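A histogram-based mutual-information estimate of the kind mentioned above can be sketched as follows. This is a generic plug-in estimator, not the authors' code: the function name, the equal-width binning between the sample minimum and maximum, and the separate bin counts per axis are assumptions, and how the 1,098 bins were laid out is not stated in the text.

```python
import math
from collections import Counter

def mutual_information_bits(xs, ys, bins_x, bins_y):
    """Plug-in mutual information (in bits) from a 2-D equal-width histogram."""
    def bin_index(v, lo, hi, nbins):
        if hi == lo:
            return 0
        # The largest value is clamped into the last bin.
        return min(int((v - lo) / (hi - lo) * nbins), nbins - 1)

    n = len(xs)
    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    joint = Counter(
        (bin_index(x, lo_x, hi_x, bins_x), bin_index(y, lo_y, hi_y, bins_y))
        for x, y in zip(xs, ys)
    )
    # Marginal counts are obtained by summing the joint histogram.
    px, py = Counter(), Counter()
    for (i, j), count in joint.items():
        px[i] += count
        py[j] += count
    # Sum p(x, y) * log2(p(x, y) / (p(x) p(y))) over the occupied cells.
    return sum(
        (count / n) * math.log2(count * n / (px[i] * py[j]))
        for (i, j), count in joint.items()
    )
```

Values near zero indicate that knowing one variable says little about the other. Plug-in estimates of this kind are biased upwards for finite samples, so a small positive value is expected even for independent data.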

**Figure 8**

**Figure 9**

*p* < 0.001; Berens, 2009). Figure 10B shows a heat map of the orientation of position disparity against the orientation of the components. No clear association between position-disparity orientation and filter orientation is visible in the heat map, and no correlation between the two was found using directional statistics, *p* = 0.0863, *c* = −0.0111 (Jammalamadaka & Sengupta, 2001, as implemented by Berens, 2009).
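For readers unfamiliar with directional statistics, the circular correlation coefficient of Jammalamadaka and Sengupta can be sketched as below. This is an illustrative reimplementation with hypothetical function names, not the toolbox code cited in the text; it omits the significance test, and it takes angles in radians with a period of 2π (whether orientations were doubled to account for their period of π before the analysis is not stated, so that step is left out).

```python
import math

def circular_mean(angles):
    """Direction of the mean resultant vector of a list of angles (radians)."""
    return math.atan2(sum(math.sin(a) for a in angles),
                      sum(math.cos(a) for a in angles))

def circular_correlation(alpha, beta):
    """Circular-circular correlation coefficient between two angle samples."""
    a_bar, b_bar = circular_mean(alpha), circular_mean(beta)
    # Sines of the deviations from each circular mean.
    sa = [math.sin(a - a_bar) for a in alpha]
    sb = [math.sin(b - b_bar) for b in beta]
    num = sum(x * y for x, y in zip(sa, sb))
    den = math.sqrt(sum(x * x for x in sa) * sum(y * y for y in sb))
    return num / den
```

The coefficient lies between −1 and 1, and a value close to 0, such as the one reported above, indicates no linear association between the two sets of angles.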

**Figure 10**

*π*.

**Figure 11**

*ϕ*_{r} and *ϕ*_{l} are the phases of the right and left component Gabor functions, respectively; and *d*_{c} is effectively the difference in the underlying wave-generating function of both components. When measured in terms of wavelength (*d*_{c}*f*), an integer value (0, 1, etc.) indicates that the peaks and troughs of the wave-generating function of the left and right components align, such that the peaks and troughs fall in exactly the same locations in the receptive fields. A half-integer value of *d*_{c}*f* indicates that the wave-generating functions are anticorrelated, with the peaks in one eye aligning with the troughs in the other and vice versa. Note that in both cases the windowing function is free to move, so the components can have a different configuration of sidebands as the windowing function covers or uncovers different parts of the wave-generating function.
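One way to picture the combined disparity is to add the position disparity along the wave direction to the phase disparity expressed as an equivalent distance. The sketch below is an assumption-laden illustration, not the paper's equation: the function names, the sign convention, and the wrapping of the phase difference to [−π, π] are all choices made here.

```python
import math

def combined_disparity(pos_disparity, phi_l, phi_r, f):
    """Position disparity (pixels, along the wave direction) plus the phase
    disparity converted to pixels at frequency f (cycles per pixel)."""
    dphi = math.atan2(math.sin(phi_r - phi_l), math.cos(phi_r - phi_l))
    return pos_disparity + dphi / (2.0 * math.pi * f)

def nearest_half_wavelength(d_c, f):
    """Combined disparity in wavelengths, rounded to the nearest multiple of 0.5."""
    return round(2.0 * d_c * f) / 2.0

# Hypothetical example: no position shift, antiphase components at f = 0.1.
d_c = combined_disparity(0.0, phi_l=0.0, phi_r=math.pi, f=0.1)
```

In this example `d_c * f` is 0.5, the half-integer case in which peaks in one eye align with troughs in the other.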

*d*_{c} for the fitted Gabor functions can be seen in Figure 12. Values of *d*_{c} are shown in terms of both pixels and the wavelength of the individual fitted Gabor functions, and are strongly clustered around multiples of half a wavelength. This fits well with the strongly correlated and anticorrelated phase results just mentioned, as correlated components would be separated by integer multiples of the wavelength and anticorrelated results would be separated by wavelengths of an integer plus 0.5. By calculating the proportion of components contained within each half-wavelength band, we found that a substantial proportion of components (35.6%, 95% CI [34.86%, 36.16%]) are tuned to zero disparity. A larger proportion are tuned to anticorrelated components: 46.65%, 95% CI [45.38%, 47.59%], are in the combined ±0.5-wavelength categories.
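The banding step can be sketched as a simple tally. The code below assumes that each band is centred on a multiple of 0.5 wavelengths and is 0.5 wavelengths wide; the text does not specify the band edges, and the function name and example values are hypothetical.

```python
import math
from collections import Counter

def band_proportions(disparities_in_wavelengths):
    """Proportion of components falling in each half-wavelength band.

    Bands are centred on multiples of 0.5 (assumed), so a value d belongs to
    the band centred on floor(2 * d + 0.5) / 2.
    """
    centres = [math.floor(2.0 * d + 0.5) / 2.0 for d in disparities_in_wavelengths]
    counts = Counter(centres)
    n = len(centres)
    return {centre: count / n for centre, count in sorted(counts.items())}

# Hypothetical combined disparities, already expressed in wavelengths.
props = band_proportions([0.02, -0.1, 0.48, 0.55, -0.52, 1.01, 0.2, -0.3])
anticorrelated = props[0.5] + props[-0.5]
```

Adding the −0.5 and +0.5 bands gives the combined ±0.5-wavelength proportion in the same way as the figure quoted above.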

**Figure 12**

*π*-radian phase disparities could account for the 0 and 1/2 combined disparity peaks—but again the distribution is too narrow. Instead, the effect is produced by the interaction of phase and disparity.

**Figure 13**

**Figure 14**

**Figure 15**

**Figure 16**

*π*/2, and *π*, corresponding to the sampling grid. ICA has a tendency to produce components aligned with the sampling grid, as these have a lower energy state than unaligned states (van der Schaaf & van Hateren, 1996). It is known that a particular tendency for horizontal and vertical orientations exists in photographic images (Hansen & Essock, 2004). While this may well to some extent reflect an anisotropic distribution of orientations in nature, it is also likely to result from the alignment of structures with the cardinal directions when composing photographs (van Hateren & van der Schaaf, 1998). It is not possible to attribute the anisotropy in our results to any corresponding anisotropy in the natural environment, since it is likely to be driven to a large degree by the sampling grid in our photographs (van Hateren & van der Schaaf, 1998). The orientations of the left and right Gabor functions of the binocular-component pairs were highly correlated (*r*^{2} = 0.99, *p* ≤ 0.001). This is similar to results from physiology; Bridge and Cumming (2001) reported an almost identical correlation of *r*^{2} = 0.985 and a spread (standard deviation) of orientation disparities of 9.22°. We observed a spread (standard deviation) of 3.55°, around half that of their result but of a similar order of magnitude. Due to the small angles involved, measurement noise could account for the discrepancy. This result supports the idea that a matching process, where features in one eye are matched with similarly oriented features in the other, is an efficient mechanism to code binocular scenes and therefore an effective strategy to compute binocular disparity.

*π*/2 are much less prevalent than disparities around 0 and *π*, implying that these disparities have less explanatory power.

*π* radians. The peak at 0 radians indicates the detection of correlated signals in each view. As the phase disparity is partially independent of the position disparity, these correlated signals may be shifted in each view. The components around *π* radians are anticorrelated between the left and right eyes. Their presence is consistent with Li and Atick's (1994a) decorrelated-channels theory of binocular vision. The plus (correlated) and minus (anticorrelated) channels that decorrelate single-pixel sample inputs in their research are found in the interactions between multiple pixels in the ICA models as phase differences. As noted by Bell and Sejnowski (1997) and Ringach (2002), edge-like components produce sparse coding in the monocular case, locally decorrelating the images. The appearance of anticorrelated binocular sparse components is the logical extension of this to binocular image patches. The bias towards anticorrelated binocular components has been observed before in Fourier analysis by Li and Atick (1994a), and similar anticorrelated filters were also produced in Burge and Geisler's (2014) analysis of optimal filters for disparity estimation. Burge and Geisler noted that a particular anticorrelated component could signal the presence of a stimulus at a particular disparity by *not* responding. This is related to the idea that such cells play an inhibitory role, vetoing possible disparities when they do respond strongly (Read & Cumming, 2007). Recently, an additional role for these anticorrelated cells in distinguishing object boundaries from texture edges has been suggested by Goutcher, Hunter, and Hibbard (2013).

*π*. Also, when the preferred disparity of each component was calculated, by taking account of both its position and phase tuning, peaks in the distribution at half-wavelength intervals were evident. These unexpected results represent aspects of the components learned that do not directly reflect attributes of cortical neurons (Ringach, 2002).

*π*/2 phase shift while also pooling information across orientation, scale, and spatial position. Numerous possible combinations of components have been suggested, including multiscale phase-based models (Y. Chen & Qian, 2004), a gated model where phase maxima close to 0 are combined with position extrema (Read & Cumming, 2007), combining positive- and negative-energy model units (Haefner & Cumming, 2005), and filters based on learning the appropriate combinations from natural-image data (Burge & Geisler, 2014). Whatever combination of binocularly encoded information is required for the estimation of disparity, this is likely to occur in visual areas beyond V1.

1/*f*^{2} power spectrum typical of natural images) and truncation of the signal to lower frequencies increases the signal-to-noise ratio. Secondly, Field (1987, 1994) has argued that in the context of uncorrelated signal noise, sparse coding may increase the signal-to-noise ratio, as neurons will respond selectively to a subset of the signal space while uniform white noise is distributed across the entire space of possible signals. However, as Hyvärinen et al. (1999) have pointed out, an ICA model trained on noisy input data will produce components tuned selectively to respond to a single (or nearly single) sample. In our work we have used bootstrapping in an attempt to assess the impact such outliers have on the distributions of components learned from ICA.

*PLoS Computational Biology*, 7 (8), e1002142, doi:10.1371/journal.pcbi.1002142.

*Journal of Neurophysiology*, 82, 874–890.

*Neural Computation*, 2 (3), 308–320.

*Neural Computation*, 4 (2), 196–210.

*Sensory Communication*, 217–234.

*Vision Research*, 37 (23), 3327–3338.

*Journal of Statistical Software*, 31 (10), 1–21.

*The Journal of Physiology*, 226 (3), 725–749.

*The Journal of Neuroscience*, 21 (18), 7293–7302.

*Journal of Vision*, 1 (3): 172, doi:10.1167/1.3.172. [Abstract].

*Theoretical aspects of neural computation: a multidisciplinary perspective: International Workshop, TANC '97*, Hong Kong, 26-28 May 1997 (pp. 225–235). Berlin: Springer-Verlag.

*Neural Computation*, 16, 1545–1577, doi:10.1162/089976604774201596.

*The Journal of Physiology*, 256 (3), 509–526.

*Vision Research*, 35 (1), 7–24.

*Nature*, 418 (6898), 633–636.

*Annual Review of Neuroscience*, 24, 203–238.

*Nature*, 352 (6331), 156–159.

*Perception*, 24 (1), 3–31.

*Journal of Neurophysiology*, 89 (2), 1094–1111.

*Vision Research*, 22, 545–559, doi:10.1016/0042-6989(82)90113-4.

*Proceedings of the National Academy of Sciences, USA*, 103 (4), 1141–1146.

*Journal of Neurophysiology*, 88 (5), 2874–2879.

*European Journal of Neuroscience*, 15 (3), 475–486.

*PLoS ONE*, 8 (12), e80745, doi:10.1371/journal.pone.0080745.

*Neural Computation*, 6 (4), 559–601.

*Journal of the Optical Society of America A*, 4 (12), 2379–2394.

*Journal of the Optical Society of America A*, 29 (1), 55–67.

*International Journal of Computer Vision*, 75, 231–246.

*Vision Research*, 36 (12), 1839–1857.

*Vision Research*, 30, 1661–1676.

*PLoS Computational Biology*, 9 (1), e1002873.

*NeuroReport*, 14 (6), 829–832.

*i-Perception*, 4 (7), 484–484.

*Vision Research*, 46 (18), 2901–2913.

*Neural Computation*, 21 (9), 2581–2604.

*Society for Neuroscience Abstracts, 583.9*.

*Perception*, 12 (2), 161–165.

*Proceedings of the Royal Society of London B: Biological Sciences*, 231 (1263), 251–288.

*Experimental Brain Research*, 31 (4), 523–545.

*Visual Cognition*, 15 (2), 149–165.

*Vision Research*, 48 (12), 1427–1439.

*NeuroImage*, 22 (3), 1214–1222.

*Color Research & Application*, 26 (1), 76–84.

*Seeing in depth, Vol. 1: Basic mechanisms*. Toronto, Canada: University of Toronto Press.

*Seeing in depth, Vol. 2: Depth perception*. Toronto, Canada: University of Toronto Press.

*Network: Computation in Neural Systems*, 11 (3), 191–210.

*The Journal of Physiology*, 160, 106–154.

*Neural Computation*, 15 (3), 663–691.

*IEEE Transactions on Neural Networks*, 10 (3), 626–634.

*Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences*, 371 (1984), 1–19.

*Natural image statistics: A probabilistic approach to early computational vision* (Vol. 39). New York: Springer-Verlag.

*Proceedings of the International Workshop on Independent Component Analysis and Signal Separation (ICA'99)* (pp. 425–429). Aussois, France.

*Topics in circular statistics*. River Edge, N.J.: World Scientific.

*Zeitschrift für Naturforschung*, 36, 910–912.

*Visual Neuroscience*, 1 (4), 395–414.

*Network: Computation in Neural Systems*, 5 (2), 157–174.

*Neural Computation*, 6 (1), 127–146.

*Advances in Neural Information Processing Systems*, 19, 945–952.

*Current Biology*, 22, 28–32.

*The Computer Journal*, 7 (4), 308–313.

*Journal of Neurophysiology*, 40 (2), 260–283.

*Journal of Neurophysiology*, 93 (4), 1823–1826.

*Journal of Neurophysiology*, 83 (5), 2967–2979.

*Experimental Brain Research*, 6 (4), 353–372.

*Perception*, 14 (3), 305–314.

*Science*, 249 (4972), 1037–1041.

*Trends in Neurosciences*, 19 (9), 386.

*Journal of Neurophysiology*, 77 (6), 2879–2909.

*Neural Networks*, 17 (7), 953–962, doi:10.1016/j.neunet.2004.02.004.

*Probabilistic models of the brain: Perception and neural function* (pp. 257–272). Cambridge, MA: MIT Press.

*Network: Computation in Neural Systems*, 7 (2), 333–339.

*Nature Reviews Neuroscience*, 8 (5), 379–391.

*Scientific American*, 227 (2), 84–95.

*Tyto alba*). *Science*, 193 (4254), 675–678.

*Vision and visual dysfunction: Binocular vision and psychophysics* (Vol. 9, pp. 227–238). London: Macmillan Press.

*Journal of Neurophysiology*, 40 (6), 1392–1405.

*The Journal of Neuroscience*, 8 (12), 4531–4550.

*Vision Research*, 25 (3), 397–406.

*The Journal of Physiology*, 315 (1), 469–492.

*Vision Research*, 39 (23), 3934–3950.

*Journal of Neurophysiology*, 87 (1), 209–221.

*Journal of Neurophysiology*, 87 (1), 191–208.

*Journal of Light & Visual Environment*, 30 (1), 29–33.

*Journal of Neurophysiology*, 90, 2795–2817.

*Journal of Neurophysiology*, 91, 1271–1281.

*Neural Computation*, 16 (10), 1983–2020.

*Nature Neuroscience*, 10 (10), 1322–1328.

*Journal of Neurophysiology*, 88, 455–463.

*Vision Research*, 37 (17), 2455–2464.

*The Journal of Neuroscience*, 27 (44), 11820–11831.

*Proceedings of the IRE*, 37 (1), 10–21.

*Annual Review of Neuroscience*, 24, 1193–1216, doi:10.1146/ANNUREV.NEURO.24.1.1193.

*Proceedings of the Royal Society of London B: Biological Sciences*, 216 (1205), 427–459.

*Visual Neuroscience*, 8 (6), 557–566.

*Neuron*, 38 (1), 103–114. doi:10.1016/S0896-6273(03)00150-8.

*Vision Research*, 19 (8), 859–865.

*Vision Research*, 36 (17), 2759–2770.

*Proceedings of the Royal Society of London B: Biological Sciences*, 265 (1394), 359–366.

*Ophthalmic and Physiological Optics*, 12 (2), 269–272.

*Neural Computation*, 8 (1), 129–151.

*Understanding vision: Theory, models, and data*. Oxford, UK: Oxford University Press.

*Advances in Neural Information Processing Systems* (pp. 1736–1744).

^{1}Binocular photographic image data and (MATLAB) source code associated with this publication are available at https://github.com/DavidWilliamHunter/Bivis.