Our first experiment suggested that the timings of perceptual changes in distinct multi-stable stimuli were independent of one another (see Figures 2 and 3). However, it also suggested that the timings of changes in identical, but spatially offset, multi-stable stimuli were related (see Figure 1).
In our second experiment, we explored this last observation in greater detail. We showed that synchronized changes in identical, but spatially offset, multi-stable stimuli could not be solely attributed to the effects of facilitation along collinear contours (see Figures 4a–4b and 5a–5b). Moreover, we found that a critical factor leading to synchronized perceptual changes was that identical images be encoded within the same monocular channels (see Figures 4 and 5).
In general, our data are not consistent with the timing of perceptual changes in different multi-stable stimuli being determined by a common high-level process. Moreover, they add to the body of evidence concerning the importance of monocular channel interactions for BR (Arnold, James, & Roseboom, 2009; Kang, Heeger, & Blake, 2009; van Boxtel, Alais, & van Ee, 2008; van Boxtel, Knapen, Erkelens, & van Ee, 2008; Watson et al., 2004).
The fact that perceptual switches between face and house dominance tended to synchronize only when identical images were encoded in the same monocular channels (see Figure 5) is particularly striking. These data suggest that the dominance of even complex images, known to induce activity in higher levels of the human visual system (Kanwisher, McDermott, & Chun, 1997), involves monocular channel-specific interactions. These data are reminiscent of results obtained using biological motion stimuli (Watson et al., 2004). In that context, it was found that motion grouping could induce sets of different colored dots to rival, but this only happened when the sets of dots that defined the biological movement were encoded in the same monocular channels (Watson et al., 2004).
We found no evidence for synchronized perceptual changes when similar images (faces, houses, or simple oriented images) were encoded by different monocular channels. This does not preclude the possibility of such an effect being observed. However, our data suggest that the encoding of similar inputs within common monocular channels must be a stronger determinant of synchronized perceptual changes than is similar appearance per se.
Perhaps the most surprising aspect of our data is that they suggest the existence of processes that enhance the probability of seeing similar images, be they faces or oriented input, when these images are encoded within a common monocular channel. Pattern-based interactions at this level of processing could enhance visibility in cluttered settings, making it easier to detect a continuation of a pattern behind monocular occlusions (Arnold et al., 2007, 2008; Changizi & Shimojo, 2008). Given the efficacy of facial images in this context, it is tempting to conclude that these pattern-based monocular interactions involve feedback from binocular face coding mechanisms to monocular V1 channels. However, we cannot exclude the alternative possibility that these interactions occur within the monocular layers of V1, based on matched orientations and frequency spectra.
Our data are less consistent with the results of another study that looked at the grouping of similar percepts across distinct multi-stable stimuli (Pearson & Clifford, 2005). In that study, perceptual rivalry between pairs of orthogonal gratings was induced via three distinct presentation protocols, resulting in what are commonly referred to as monocular (Mackey, 1960), binocular (Helmholtz, 1962), and stimulus (Logothetis et al., 1996) rivalries. Despite distinct presentation protocols in different stimulus regions, similarly oriented and colored gratings tended to dominate across the entire display (Pearson & Clifford, 2005).
There are at least two important differences between our study and the former study that suggested perceptual grouping across different forms of perceptual rivalry (Pearson & Clifford, 2005). First, we have focused on the timing of perceptual changes. If the dominance of a percept in one multi-stable stimulus prompted dominance of a similar percept in another stimulus section after a variable delay, we might not have detected the relationship. Note, however, that we would have detected any tendency to report just one perceptual change per 1-s epoch. This would have resulted in a lower proportion of synchronous changes being reported than predicted by chance.
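The comparison against chance described above can be illustrated with a minimal sketch. This is not the analysis used in the study. It assumes that a change is "synchronous" when both stimuli change within the same 1-s epoch, that the proportion of synchronous changes is taken relative to epochs containing a change in either stimulus, and that chance is the value expected if the two sets of changes were independent. The function names and the example change times are hypothetical.

```python
from math import floor


def epochs_with_change(change_times, epoch_s=1.0):
    """Return the set of epoch indices containing at least one change."""
    return {floor(t / epoch_s) for t in change_times}


def synchrony_vs_chance(times_a, times_b, duration_s, epoch_s=1.0):
    """Observed and chance proportions of synchronous changes.

    Assumed definitions (illustrative only, not taken from the study):
    - observed: epochs with a change in both stimuli, divided by epochs
      with a change in either stimulus;
    - chance: the same ratio expected if changes in the two stimuli
      occurred independently at their observed per-epoch rates.
    """
    n_epochs = int(duration_s // epoch_s)
    if n_epochs <= 0:
        raise ValueError("duration_s must span at least one epoch")

    a = epochs_with_change(times_a, epoch_s)
    b = epochs_with_change(times_b, epoch_s)

    either = len(a | b)
    observed = len(a & b) / either if either else 0.0

    # Per-epoch change probabilities for each stimulus.
    p_a = len(a) / n_epochs
    p_b = len(b) / n_epochs
    p_both = p_a * p_b                 # independence assumption
    p_either = p_a + p_b - p_both
    chance = p_both / p_either if p_either else 0.0
    return observed, chance


# Hypothetical change times (in seconds) over a 10-s trial.
observed, chance = synchrony_vs_chance(
    [0.5, 3.2, 7.9], [0.7, 5.1, 7.1], duration_s=10.0
)
```

With these hypothetical times, the observed proportion is 0.5 against a chance value of roughly 0.18. An observer who could report only one change per epoch would never produce a shared epoch, so the observed proportion would fall to zero and sit below chance, which is the signature referred to above.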
A second critical difference is that the former study only used stimuli containing collinear contours, which can lead to interactions that mutually enhance contrast detection sensitivity (Das & Gilbert, 1995; Field et al., 1993) and the probability of synchronous perceptual dominance during binocular rivalry (Alais & Blake, 1999; Alais et al., 2006). While we used stimuli containing collinear contours in the Simple BR condition of Experiment 1 (see Figure 1), we avoided doing so in the Matched Simple BR condition of Experiment 2 (see Figures 4a and 4b). The latter condition resulted in proportionally fewer reports of synchronous changes in spatially offset stimuli (0.13 ± 0.04 as opposed to 0.25 ± 0.02), suggesting that collinear contour interactions do encourage synchronized perceptual changes (see also Alais et al., 2006). It is possible that such interactions were primarily responsible for driving grouping across different multi-stable stimuli in the former study (Pearson & Clifford, 2005).
According to this last suggestion, the synchronization of perceptual changes across distinct multi-stable stimuli may reflect the degree to which interactions can induce synchronous changes in signal strength. Contrasting very similar perceptual representations, across distinct presentation protocols, might maximize the probability of this happening (Pearson & Clifford, 2005). This need not imply that the critical modulation is driven by higher level processes. Collinear facilitation could, for instance, be driven by gain modulations of orientation-tuned V1 cells.
The contemporary consensus regarding the neural substrates of BR is that it is multi-faceted, with changes driven by activity in multiple structures located at different levels of the visual hierarchy (Blake & Logothetis, 2002; Haynes et al., 2005; Lee & Blake, 1999; Tong & Engel, 2001; Watson et al., 2004; Wunderlich et al., 2005). While high-level grouping processes undoubtedly shape BR (Alais & Melcher, 2007; Dorrenhaus, 1975), in our view a question remains concerning whether they do so because they are directly responsible for determining perceptual dominance, or because they indirectly modulate activity at a critical low-level site via feedback. Evidence concerning face adaptation has already linked at least some of the efficacy of complex images to feedback (van Boxtel, Alais et al., 2008). More recently, members of our laboratory have shown that propagation along monocular channels is integral to the spread of triggered dominance changes through facial images (Arnold et al., 2009). These and other situations, wherein stimuli encoded by high-level processes engage in monocular channel-specific interactions (Experiment 2; see also van Boxtel, Knapen et al., 2008; Watson et al., 2004), suggest that the efficacy of high-level BR coding effects may rely on feedback to monocular levels of processing.