**Abstract**
**Stereo vision has a well-known anisotropy: At low frequencies, horizontally oriented sinusoidal depth corrugations are easier to detect than vertically oriented corrugations (both defined by horizontal disparities). Previously, Serrano-Pedraza and Read (2010) suggested that this stereo anisotropy may arise because the stereo system uses multiple spatial-frequency disparity channels for detecting horizontally oriented modulations but only one for vertically oriented modulations. Here, we tested this hypothesis using the critical-band masking paradigm. In the first experiment, we measured disparity thresholds for horizontal and vertical sinusoids near the peak of the disparity sensitivity function (0.4 cycles/°), in the presence of either broadband or notched noise. We fitted the power-masking model to our results assuming a channel centered on 0.4 cycles/°. The estimated channel bandwidths were 2.95 octaves for horizontal and 2.62 octaves for vertical corrugations. In our second experiment we measured disparity thresholds for horizontal and vertical sinusoids of 0.1 cycles/° in the presence of band-pass noise centered on 0.4 cycles/° with a bandwidth of 0.5 octaves. This mask had only a small effect on the disparity thresholds, for either horizontal or vertical corrugations. We simulated the detection thresholds using the power-masking model with the parameters obtained in the first experiment and assuming either single-channel or multiple-channel detection. The multiple-channel model predicted the thresholds much better for both horizontal and vertical corrugations. We conclude that the human stereo system must contain multiple independent disparity channels for detecting horizontally oriented and vertically oriented depth modulations.**

[…] $\rho = 29.81$ dots/°², giving a Nyquist limit of $f_N = 2.73$ cycles/° ($f_N = 0.5\sqrt{\rho}$). […] cd/m², reduced to 2.8 cd/m² when viewed through the polarizing glasses; the black background had a luminance of 0.07 cd/m², reduced to 0.05 cd/m². To minimize vergence movements, the subjects were instructed to maintain fixation on a small cross (0.3° × 0.3°) in the center of the screen, flanked by vertical and horizontal Nonius lines of length 0.6°, presented in between stimuli. In each experiment, a new trial was initiated after the participant's response; thus the experiments proceeded at a pace determined by the observer. No feedback about correctness on individual trials was given.
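The Nyquist limit quoted above for the random-dot pattern can be checked from the dot density. This is a minimal sketch that assumes the relation $f_N = 0.5\sqrt{\rho}$ (half the mean linear dot rate), which reproduces the quoted value; the function name is illustrative.

```python
import math

def nyquist_limit(dot_density):
    """Nyquist limit (cycles/deg) for a dot density given in dots/deg^2.

    Assumes f_N = 0.5 * sqrt(density), i.e. half the mean linear dot rate.
    """
    return 0.5 * math.sqrt(dot_density)

f_n = nyquist_limit(29.81)
print(round(f_n, 2))  # 2.73
```

Under this assumption, corrugation frequencies well below about 2.7 cycles/° (such as the 0.1 and 0.4 cycles/° signals) are comfortably sampled by the dots.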

[…] $N_0 \in \{1.625, 6.5, 25, 100, 400\} \times 10^{-3}$ (cycles/°)$^{-1}$.

[…] $S_{oct}$ = 0.5, 1, 2, 3, and 4 octaves; notches were symmetrical in log-frequency. Thus, for a 1-octave notch, the noise amplitude spectrum was zero between 0.28 and 0.57 cycles/°. For the notch stimuli, the noise level outside the notch was always $N_0 = 25 \times 10^{-3}$ (cycles/°)$^{-1}$, independent of the bandwidth of the notch.
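The notch edges follow from the statement that notches are symmetrical in log-frequency. The sketch below assumes the notch is centered on the 0.4 cycles/° signal frequency, which is consistent with the 0.28 to 0.57 cycles/° range given for the 1-octave notch; the function name is illustrative.

```python
def notch_edges(center, notch_octaves):
    """Lower and upper edges of a notch that is symmetric in log-frequency."""
    half = notch_octaves / 2.0
    return center * 2.0 ** (-half), center * 2.0 ** half

# Notch sizes used in the experiment, centered (by assumption) on 0.4 cycles/deg.
for s_oct in (0.5, 1, 2, 3, 4):
    lo, hi = notch_edges(0.4, s_oct)
    print(f"{s_oct} oct: {lo:.3f} to {hi:.3f} cycles/deg")
```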

[…] $N_0 = 25 \times 10^{-3}$ (cycles/°)$^{-1}$.

[…] $n(x, y)$ by extending the 1D disparity noise in the orthogonal direction.

[…] $V_{RMS} =$ […] $N_0$ is the noise level and $W$ is the bandwidth of the noise that had energy (e.g., for our broadband noise $W$ was 2.46 cycles/°). Second, we normalized the noise sample to peak at one. Third, in order to present the sample of disparity noise with the desired $V_{RMS}$, we calculated the peak amplitude disparity ($A$) using the following equation:

$$A = \frac{V_{RMS}}{\sqrt{a - b^{2}}},$$

where $a = \left[\sum\sum n^{2}(x, y)\right]/MN$, $b = \left[\sum\sum n(x, y)\right]/MN$, and $MN$ is the number of columns ($M$) × rows ($N$) of the noise sample (800 × 800 pixels in the experiments). Finally, we multiplied the sample of disparity noise, $n(x, y)$, by the desired disparity amplitude ($A$), and added the result to the signal in order to obtain the desired disparity map (see Figures 1b and 1f). Note that $V_{RMS}$ is the standard deviation of the product $A \times n(x, y)$ (Davenport & Root, 1958).
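The scaling step can be illustrated with a short sketch. It relies only on the statement that $V_{RMS}$ is the standard deviation of $A \times n(x, y)$, so that $A = V_{RMS}/\sqrt{a - b^2}$ with $a$ and $b$ the means of $n^2$ and $n$. The 64 × 64 Gaussian sample, the target value, and the choice to normalize by the largest absolute value are assumptions made for illustration, not the authors' stimulus code.

```python
import math
import random

def peak_amplitude(noise, v_rms):
    """Amplitude A such that the standard deviation of A * noise equals v_rms.

    `noise` is a 2D list (rows x columns) already normalized to peak at one.
    """
    values = [v for row in noise for v in row]
    count = len(values)                      # M * N samples
    a = sum(v * v for v in values) / count   # mean of n^2
    b = sum(values) / count                  # mean of n
    return v_rms / math.sqrt(a - b * b)

# Hypothetical 64 x 64 Gaussian sample standing in for the filtered disparity noise.
rng = random.Random(0)
raw = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(64)]
peak = max(abs(v) for row in raw for v in row)
noise = [[v / peak for v in row] for row in raw]   # normalize to peak at one

target_rms = 30.0                                  # hypothetical target value
A = peak_amplitude(noise, target_rms)

# Check: the population standard deviation of A * n(x, y) matches the target.
scaled = [A * v for row in noise for v in row]
mean = sum(scaled) / len(scaled)
sd = math.sqrt(sum((v - mean) ** 2 for v in scaled) / len(scaled))
print(abs(sd - target_rms) < 1e-9)
```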

- (a) Stimuli are processed by a bank of separate, independent, and overlapping band-pass linear channels, each tuned to a different spatial frequency $\xi$;
- (b) A channel $k$ detects a signal when its power signal-to-noise ratio reaches some fixed threshold $\theta$; and
- (c) A channel's sensitivity is limited by its internal noise $N(\xi_k)$.

[…] $s = 1/\theta$, while $N(\xi)$ sets the relative sensitivity between channels. The channel's power response to a grating of disparity amplitude $m$ at spatial frequency $u_0$ is $\left[m^{2}(u_0)/2\right]|H(u_0; \xi_k)|^{2}$, where $H(u; \xi)$ is the channel's modulation transfer function (see Equation 4), normalized such that $H(\xi_k; \xi_k) = 1$. The channel's power response to 1D noise with power spectrum $\rho(u)$ is $2\int_{0}^{\infty} \rho(u)\,|H(u; \xi_k)|^{2}\,du$. […] $m_T(u_0)$ necessary for the channel $k$ to detect a signal at $u_0$ in the presence of 1D noise is […]

In the absence of external noise, the disparity threshold $m_0$ at any frequency is set by the internal noise of the channel detecting it. A channel's internal noise can therefore be deduced from the sensitivity at the channel's preferred frequency: […]

Substituting this expression for internal noise into Equation 2, we obtain the fundamental masking equation: […] which relates the increased threshold needed to detect a masked signal to the unmasked threshold and, critically, the channel modulation transfer function (MTF) (Serrano-Pedraza & Sierra-Vázquez, 2006; Serrano-Pedraza, Sierra-Vázquez, et al., 2013a). Equation 3 was first derived, with minor differences, for luminance channels by Solomon (2000, see his equations 4 and 5).
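The chain of reasoning in this passage (threshold in noise, internal noise, masking equation) can be sketched from the three model assumptions. The derivation below additionally assumes that internal and external noise powers add within a channel. That is the usual power-masking assumption, but it is stated here as an assumption, and the resulting expressions are a reconstruction rather than a quotation of Equations 2 and 3.

```latex
\begin{gather*}
% Detection criterion: the power signal-to-noise ratio of channel k equals \theta
\frac{\tfrac{1}{2}\, m_T^{2}(u_0)\, |H(u_0;\xi_k)|^{2}}
     {N(\xi_k) + 2\int_{0}^{\infty} \rho(u)\, |H(u;\xi_k)|^{2}\, du} = \theta
\;\Longrightarrow\;
m_T^{2}(u_0) =
\frac{2\theta \left[ N(\xi_k) + 2\int_{0}^{\infty} \rho(u)\, |H(u;\xi_k)|^{2}\, du \right]}
     {|H(u_0;\xi_k)|^{2}} \\[6pt]
% No external noise, signal at the preferred frequency (u_0 = \xi_k, so H = 1)
m_0^{2}(\xi_k) = 2\theta\, N(\xi_k)
\;\Longrightarrow\;
N(\xi_k) = \frac{m_0^{2}(\xi_k)}{2\theta} = \frac{s\, m_0^{2}(\xi_k)}{2} \\[6pt]
% Substituting N(\xi_k) and using s = 1/\theta
m_T^{2}(u_0) =
\frac{m_0^{2}(\xi_k) + \dfrac{4}{s} \int_{0}^{\infty} \rho(u)\, |H(u;\xi_k)|^{2}\, du}
     {|H(u_0;\xi_k)|^{2}}
\end{gather*}
```

In this form the squared masked threshold is the squared unmasked threshold plus a term proportional to the mask power passed by the channel, which is why the channel MTF can be estimated from masking data.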

[…] $\xi_i$, $\xi_i \neq 0$, is the peak spatial frequency of the disparity channel $i$ and $\alpha_i$, $\alpha_i > 0$, is an index of its spatial spread. The relative bandwidth (full width at half height), in octaves, is obtained from $B_{oct}(\xi_i) = (2$ […]$)\,\alpha_i$. We have chosen this MTF because (a) we know the analytic solution of its integral, which will be useful for fitting the power-spectrum model (see Appendix, Equation A1); and (b) it has a symmetric shape when represented on a log scale, similar to the adaptation threshold curves found with disparity gratings (Schumer & Ganz, 1979).

[…] $u_0$ are always detected by the disparity channel that is tuned most closely to the spatial frequency of the signal. We know that there must be a channel tuned to 0.4 cycles/°, since that is the peak of the disparity sensitivity function (Rogers & Graham, 1982; Bradshaw & Rogers, 1999; Serrano-Pedraza & Read, 2010). We can therefore assume that $\xi_k = u_0$, $u_0 = 0.4$ cycles/° (see Appendix and Equations A2 and A3).
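The exact form of the MTF (Equation 4) is not reproduced in this text. Purely as an illustration, the sketch below assumes a log-Gaussian MTF, $H(u;\xi) = \exp[-\ln^2(u/\xi)/(2\alpha^2)]$, which has the two properties named above (symmetric on a log-frequency axis, integrable in terms of the error function). It measures the full width at half height numerically, so it does not presuppose any closed-form bandwidth formula. All function names and the value of `alpha` are hypothetical.

```python
import math

def mtf(u, xi, alpha):
    """Assumed log-Gaussian MTF (illustrative), equal to 1 at u = xi."""
    return math.exp(-math.log(u / xi) ** 2 / (2.0 * alpha ** 2))

def half_height_frequency(xi, alpha, upper=True):
    """Frequency where the assumed MTF falls to 0.5, by bisection in octaves."""
    lo, hi = 0.0, 20.0                 # distance from the peak, in octaves
    sign = 1.0 if upper else -1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mtf(xi * 2.0 ** (sign * mid), xi, alpha) > 0.5:
            lo = mid                   # still above half height: move outward
        else:
            hi = mid
    return xi * 2.0 ** (sign * 0.5 * (lo + hi))

def bandwidth_octaves(xi, alpha):
    """Full width at half height of the assumed MTF, in octaves."""
    u_low = half_height_frequency(xi, alpha, upper=False)
    u_high = half_height_frequency(xi, alpha, upper=True)
    return math.log2(u_high / u_low)

# Under this assumed form the bandwidth in octaves depends on alpha, not on the peak.
print(bandwidth_octaves(0.4, 0.8))
print(bandwidth_octaves(0.1, 0.8))
```

With this assumed shape, channels that share the same spread index have the same octave bandwidth wherever they peak, which matches the equal-bandwidth assumption used later for the multiple-channel prediction.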

[…] $u_0 = 0.1$ cycles/°. For the single-channel hypothesis, of course, there is only a single channel, tuned to $\xi_k = 0.4$ cycles/° (see Appendix and Equation A4). For the multiple-channel hypothesis, we need to consider which channel would detect the stimulus. With band-pass noise, off-frequency looking becomes possible: the stimulus can be detected by a disparity channel tuned to a spatial frequency different from that of the signal. Serrano-Pedraza et al. (2013a) showed that when the center spatial frequency of the band-pass noise lies more than 2 octaves from the spatial frequency of the signal, there is almost no off-frequency looking. Here, assuming that all disparity channels of a given orientation have the same bandwidth (Schumer & Ganz, 1979) and taking this to be the value obtained in Experiment 1, we found by simulation, following the procedure described in Serrano-Pedraza et al. (2013a), that the channel with the highest signal-to-noise ratio for detection of the signal is tuned to 0.09 cycles/°. Since this is so close to 0.1 cycles/°, for the multiple-channel hypothesis we have assumed that the channel that detects the signal is tuned to $\xi_k = 0.1$ cycles/°.

[…] $\alpha_i$ and $s$. Parameter $\alpha_i$ controls the bandwidth ($B_{oct}$, in octaves) of the disparity channel and $s$ its sensitivity. For each subject we fitted Equations A2 and A3 to the data, where the values of the parameters $\alpha_i$ and $s$ were estimated using a least-squares fitting procedure. The sum of the squared errors between the empirical squared-disparity thresholds and the model squared-disparity thresholds was minimized using the Matlab routine “fminsearch”, which uses the Nelder-Mead simplex search method (Nelder & Mead, 1965). The goodness of fit was calculated by means of the coefficient of determination ($R^2$) between the model predictions and all empirical masking thresholds (broadband and notched noise masking data) (see red line in Figures 2, 3, and 4).
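The two quantities named in the fitting procedure, the sum of squared errors that is minimized and the coefficient of determination used to report goodness of fit, can be written compactly. This sketch assumes the common definition $R^2 = 1 - SS_{res}/SS_{tot}$; the text does not state which definition was used, and the example numbers are hypothetical rather than data from the experiments.

```python
def sum_squared_error(observed, predicted):
    """Objective minimized during fitting: sum of squared residuals."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sum_squared_error(observed, predicted) / ss_tot

# Hypothetical squared-disparity thresholds (empirical vs. model).
empirical = [1.0, 2.0, 3.0]
model = [1.0, 2.0, 4.0]
print(sum_squared_error(empirical, model))  # 1.0
print(r_squared(empirical, model))          # 0.5
```

In practice the objective would be evaluated on the squared thresholds from all masking conditions and passed to a simplex optimizer over the two free parameters.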

[…] $\alpha_i$ and $s$ obtained from Experiment 1 for vertical and horizontal corrugations (see Figure 4). The predictions were the thresholds for detecting a sinusoidal corrugation of 0.1 cycles/° (vertical and horizontal corrugations), either unmasked or masked by band-pass noise centered at 0.4 cycles/° (0.5 octaves width) (see example in Figures 1d and 1h). To obtain predictions under the two hypotheses, we ran the model (see Appendix and Equation A4) assuming that the signal is detected by a disparity channel centered at a spatial frequency of either 0.1 cycles/° (multiple-channel hypothesis) or 0.4 cycles/° (single-channel hypothesis).
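The logic of the two predictions can be illustrated numerically. The sketch below is not the authors' model code. It assumes a log-Gaussian MTF, assumes that internal and external noise powers add (so the masked-to-unmasked threshold ratio is $\sqrt{1 + P_{ext}/P_{int}}$), and uses a hypothetical spread index and a hypothetical internal noise power that is taken to be equal for both channels. Only the mask level and band edges come from the text.

```python
import math

def mtf_power(u, xi, alpha):
    """|H(u; xi)|^2 for an assumed log-Gaussian MTF (illustrative only)."""
    return math.exp(-math.log(u / xi) ** 2 / alpha ** 2)

def band_noise_power(n0, u_lo, u_hi, xi, alpha, steps=2000):
    """2 * N0 * integral of |H|^2 over [u_lo, u_hi], using the midpoint rule."""
    du = (u_hi - u_lo) / steps
    total = sum(mtf_power(u_lo + (i + 0.5) * du, xi, alpha) for i in range(steps))
    return 2.0 * n0 * total * du

def threshold_elevation(external_power, internal_power):
    """Masked / unmasked threshold ratio, assuming the noise powers add."""
    return math.sqrt(1.0 + external_power / internal_power)

N0 = 25e-3                  # mask power spectral density, from the text
U_LO, U_HI = 0.336, 0.475   # band edges in cycles/deg, from the Appendix
ALPHA = 0.85                # hypothetical spread index
INTERNAL = 2e-3             # hypothetical internal noise power (same for both channels)

for label, xi in (("multiple-channel, 0.1 c/deg", 0.1),
                  ("single-channel, 0.4 c/deg", 0.4)):
    ext = band_noise_power(N0, U_LO, U_HI, xi, ALPHA)
    print(label, threshold_elevation(ext, INTERNAL))
```

Under these assumptions the channel tuned to 0.1 cycles/° passes far less of the mask than the channel tuned to 0.4 cycles/°, so a small measured masking effect favors the multiple-channel account.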

[…] $R^2$) between all masking thresholds and the model prediction is specified in the top panels of Figures 2 and 3. For horizontal corrugations (Figure 2), the estimated bandwidths ranged from 2.4 to 3.2 octaves in our four subjects, with a mean of 2.9 octaves. For vertical corrugations, estimated bandwidths ranged from 1.7 to 4.6 octaves, with a mean of 2.7 octaves (see Figure 3).

[…] *SD* of four subjects) for the sinusoidal depth corrugation without masking. The average disparity threshold was 11.5 arcsec for horizontal corrugations and 17.24 arcsec for vertical corrugations (a vertical/horizontal [V/H] ratio of 1.5). This is similar to previous estimates of the relatively weak stereo anisotropy at this frequency (Bradshaw & Rogers, 1999; Serrano-Pedraza & Read, 2010).

*Vision Research*, 43, 165–170.

*Vision Research*, 38, 267–280.

*Vision Research*, 46 (17), 2636–2644.

*Vision Research*, 39 (18), 3049–3056.

*Spatial Vision*, 10 (4), 433–436.

*Vision Research*, 33, 2189–2201.

*Vision Research*, 34 (5), 607–620.

*Perception & Psychophysics*, 39, 151–153.

*Vision Research*, 38, 1861–1881.

*Signal detection theory and psychophysics* (Reprint with corrections of the original 1966 ed.). Huntington, NY: Robert E. Krieger Publishing Co.

*Perception*, 21 (4), 427–439.

*Journal of Experimental Psychology-Human Perception and Performance*, 28 (2), 469–476.

*Vision Research*, 34, 885–912.

*Perception*, 36 (ECVP Abstract Supplement).

*Journal of the Optical Society of America A*, 12, 250–260.

*Vision Research*, 42, 1165–1184.

*Vision Research*, 30 (11), 1781–1791.

*An introduction to the psychology of hearing*. New York: Academic Press.

*Proceedings of the Royal Society of London B*, 235, 221–245.

*Vision Research*, 39, 721–731.

*Computer Journal*, 7, 308–313.

*Journal of the Acoustical Society of America*, 59, 640–654.

*Frequency selectivity in hearing* (pp. 123–177). New York: Academic Press.

*Journal of the Acoustical Society of America*, 67, 229–245.

*Effects of visual noise*. Unpublished doctoral dissertation, Cambridge University.

*Spatial Vision*, 10 (4), 437–442.

*Perception & Psychophysics*, 28, 377–379.

*Vision Research*, 31 (6), 1053–1065.

*Vision Research*, 22, 261–270.

*Vision Research*, 19, 1303–1314.

*Journal of Vision*.

*Journal of Vision*, 9 (4): 3, 1–13, http://www.journalofvision.org/content/9/4/3, doi:10.1167/9.4.3.

*Journal of Vision*, 10 (12): 10, 1–11, http://www.journalofvision.org/content/10/12/10, doi:10.1167/10.12.10.

*Spanish Journal of Psychology*, 9 (2), 249–262.

*Journal of the Optical Society of America A*, 30 (6), 1119–1135.

*Journal of the Optical Society of America A*, 17, 986–993.

*Nature*, 369, 395–397.

*Journal of Vision*, 4 (1): 3, 22–31, http://www.journalofvision.org/content/4/1/3, doi:10.1167/4.1.3.

*Vision Research*, 35 (17), 2503–2522.

*Vergence eye movements: Basic and clinical aspects of binocular* (pp. 199–195). London, UK: Butterworths.

*Vision and visual dysfunction*, Vol 9, *Binocular vision* (pp. 38–74). London, UK: Macmillan.

*Journal of Vision*, 10 (1): 10, 1–11, http://www.journalofvision.org/content/10/1/10, doi:10.1167/10.1.10.

*Vision Research*, 21, 1115–1122.

*Vision Research*, 5 (81), 58–68.

*Vision Research*, 13 (87), 10–21.

[…] $u_{lo}$ (low spatial frequency) and $u_{hi}$ (high spatial frequency). This solution is useful for solving the integral of the model when we multiply the MTF of the disparity channel by the power spectrum $\rho$ of the mask (see Equation 3). The MTF has even symmetry ($H(u; \xi) = H(-u; \xi)$), so we need only evaluate the positive half of the integral. […] where $u > 0$, $\xi_i > 0$ ($\xi_i$ corresponds to the peak of the MTF), $u_{hi} > u_{lo} \geq 0$, $\alpha_i > 0$ (index of the spatial spread of the MTF of the disparity channel $i$), and $\mathrm{erf}(x)$ is the error function: $\mathrm{erf}(x) = (2/\sqrt{\pi})\int_{0}^{x} e^{-t^{2}}\,dt$.

[…] $\rho(u) = N_0$. As explained in the text, we can assume that the signal is detected by the disparity channel tuned to the signal (i.e., $\xi_k = u_0$, $|H(u_0; \xi_k)|^{2} = 1$); therefore, the equation of the model (see Equation 3) used in the fitting is as follows: […] where we used $u_{lo} = 0.04$ cycles/°, $u_{hi} = 2.5$ cycles/°, five power spectral densities (noise levels), $N_0 \in \{1.625, 6.5, 25, 100, 400\} \times 10^{-3}$ (cycles/°)$^{-1}$, and $u_0 = 0.4$ cycles/°. The integral is solved in Equation A1.
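Equation A1 itself is not reproduced in this text. To show the kind of erf-based closed form the Appendix refers to, the sketch below assumes a log-Gaussian MTF, so that $|H(u;\xi)|^2 = \exp[-\ln^2(u/\xi)/\alpha^2]$, and compares the resulting closed-form band integral with direct numerical integration over the broadband mask limits. The closed form is derived under that assumption (substituting $t = \ln(u/\xi)$ and completing the square) and is not claimed to be the authors' Equation A1; the spread index is hypothetical.

```python
import math

def closed_form(u_lo, u_hi, xi, alpha):
    """Integral of exp(-ln^2(u/xi)/alpha^2) from u_lo to u_hi (requires u_lo > 0)."""
    def v(u):
        # Shifted, scaled log-frequency obtained after completing the square.
        return math.log(u / xi) / alpha - alpha / 2.0
    scale = xi * alpha * math.exp(alpha ** 2 / 4.0) * math.sqrt(math.pi) / 2.0
    return scale * (math.erf(v(u_hi)) - math.erf(v(u_lo)))

def numeric(u_lo, u_hi, xi, alpha, steps=20000):
    """Same integral evaluated with the midpoint rule."""
    du = (u_hi - u_lo) / steps
    return du * sum(
        math.exp(-math.log((u_lo + (i + 0.5) * du) / xi) ** 2 / alpha ** 2)
        for i in range(steps)
    )

# Broadband mask limits from the text; alpha is a hypothetical spread index.
analytic = closed_form(0.04, 2.5, 0.4, 0.85)
approx = numeric(0.04, 2.5, 0.4, 0.85)
print(abs(analytic - approx) < 1e-5)
```

Having a closed form of this kind means each evaluation of the model during fitting costs only two calls to `erf` instead of a numerical integration.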

[…] $\xi_k = u_0$, $|H(u_0; \xi_k)|^{2} = 1$. Therefore, the equation of the model (see Equation 3) used in the fitting is as follows: […] where $u_{lo}$ is the cut-off frequency of the low-pass component, $u_{hi}$ is the cut-off frequency of the high-pass component, $u_0 = 0.4$ cycles/°, $u_{lo} = u_0\,2^{-S_{oct}/2}$, and $u_{hi} = u_0\,2^{S_{oct}/2}$, where $S_{oct}$ is the spectral notch size in octaves. The power spectral density of the notched noise was $N_0 = 25 \times 10^{-3}$ (cycles/°)$^{-1}$.

[…] $u_0 = 0.1$ cycles/°, $u_{lo} = 0.336$ cycles/°, $u_{hi} = 0.475$ cycles/°. The power spectral density of the band-pass noise was $N_0 = 25 \times 10^{-3}$ (cycles/°)$^{-1}$. For the multiple-channel prediction, we used $\xi_k = 0.1$ cycles/° and for the single-channel prediction we used $\xi_k = 0.4$ cycles/°, as explained in the text.