Meese and Summers (2009) performed experiments similar to those reported here but restricted the investigation to detection threshold (i.e., pedestal contrast = 0%). In that study, we concluded that binocular summation precedes spatial summation in the system hierarchy (
Figure 6; see also Mansouri et al.,
2005). Luminance contrasts in common spatial frequency and orientation bands are summed across eyes from corresponding retinal points to produce a binocular image. For both the earlier results and the present ones, our model proposes that this binocular summation is followed by widespread integration of contrast across area. Further work is needed, but a system that integrates in this way with complete disregard for underlying image statistics or structures seems unlikely. A more plausible visual heuristic is to integrate over spatial regions for which the local analysis is similar, or for which the carrier and/or modulator change only smoothly in the binocular image. An adaptive or matched filtering process such as this might involve local comparisons of image structure (Field, Hayes, & Hess,
1993; Kingdom, Prins, & Hayes,
2003; Levi, Klein, & Chen,
2005; Saarinen, Levi, & Shen,
1997) and spectrum (Abbey & Eckstein,
2007; Georgeson & Meese,
1997,
1999; Meese & Georgeson,
2005) under the control of second-order binocular mechanisms (Dakin & Mareschal,
2000; Graham & Sutter,
1998) that assess the envelope (Georgeson & Schofield,
2002) and the carrier (Kingdom et al.,
2003; Motoyoshi & Kingdom,
2007; Motoyoshi & Nishida,
2004) to search for boundary cues (Grigorescu, Petkov, & Westenberg,
2004; Kawabe & Miura,
2004; Sillito, Grieve, Jones, Cudeiro, & Davis,
1995) or other Gestalt-like cues (Sayim, Westheimer, & Herzog,
2010), plausibly in V2 (Anzai, Peng, & Van Essen,
2007; Mareschal & Baker,
1998), with integration taking place at a later stage, plausibly V4 (Arcizet, Jouffrais, & Girard,
2008; Desimone & Schein,
1987; Pollen et al.,
2002) or IT/LO (Köteles, De Mazière, Van Hulle, Orban, & Vogels,
2008; Ostwald, Lam, Li, & Kourtzi,
2008). For stimuli such as those used here, where the carrier is constant and the contrast modulation is not detected at the detection threshold of the target (Meese & Summers,
2007), this heuristic would demand blanket integration over the carrier, consistent with our results. This is equivalent to constructing a (phase-insensitive) template that is matched to the sum of the two stimulus components in
Figure 2 by summing lower-order (V1-like) filter elements (Rovamo et al.,
1993; Watson & Ahumada,
2005). This type of neuronal convergence is a plausible first step toward solving the binding problem for spatially extensive textures (Arcizet et al.,
2008; Cant, Arnott, & Goodale,
2009; Graham & Sutter,
1998; Köteles et al.,
2008; Roach, Webb, & McGraw,
2008; Webb, Roach, & Peirce,
2008), depth gradients (Meese & Holmes,
2004; Summers & Meese,
2006), and other smooth image structures (May & Hess,
2007a,
2007b). Of course, under normal viewing conditions, images of the natural world tend to have more similarities than differences between the two eyes, and so perhaps it is not surprising that eye of origin is largely irrelevant within the scheme that we have proposed (
Figure 6).
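
The ordering of operations proposed above (binocular summation at corresponding points, then phase-insensitive local filtering, then integration across area) can be illustrated with a minimal one-dimensional sketch. This is an illustration under simplifying assumptions and not the fitted model: all function names are hypothetical, summation is taken to be linear at both stages, the V1-like elements are reduced to a sine/cosine quadrature pair per patch, and the nonlinear transduction and noise that a full model would require are omitted.

```python
import math


def grating(n_samples, patch_len, cycles_per_patch, contrast, phase=0.0):
    """1-D sine-wave contrast signal (hypothetical stand-in for one eye's image)."""
    return [
        contrast * math.sin(2 * math.pi * cycles_per_patch * i / patch_len + phase)
        for i in range(n_samples)
    ]


def binocular_sum(left, right):
    """Stage 1 (assumed linear): sum contrast across eyes at corresponding points."""
    return [l + r for l, r in zip(left, right)]


def patch_energy(patch, cycles_per_patch):
    """Stage 2: phase-insensitive response of a quadrature pair tuned to the carrier.

    Assumes an integer number of cycles per patch, with
    0 < cycles_per_patch < len(patch) / 2, so the sine and cosine
    filters are orthogonal and the result equals the carrier amplitude.
    """
    n = len(patch)
    sin_resp = sum(
        v * math.sin(2 * math.pi * cycles_per_patch * i / n)
        for i, v in enumerate(patch)
    )
    cos_resp = sum(
        v * math.cos(2 * math.pi * cycles_per_patch * i / n)
        for i, v in enumerate(patch)
    )
    return math.hypot(sin_resp, cos_resp) * 2 / n


def area_integration(image, patch_len, cycles_per_patch):
    """Stage 3 (assumed linear): blanket summation of local energies over area."""
    patches = [
        image[start:start + patch_len]
        for start in range(0, len(image) - patch_len + 1, patch_len)
    ]
    return sum(patch_energy(p, cycles_per_patch) for p in patches)


def template_response(left, right, patch_len=16, cycles_per_patch=2):
    """Binocular summation followed by phase-insensitive area integration."""
    return area_integration(binocular_sum(left, right), patch_len, cycles_per_patch)
```

In this sketch, a grating of contrast 0.2 shown to one eye and the same grating split as 0.1 to each eye give the same pooled response, and the response is unchanged by a shift in carrier phase, which mirrors the claim that eye of origin is largely irrelevant once the binocular image has been formed. Doubling the stimulus extent doubles the response only because summation is assumed linear here.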