There has been extensive study of the influence of visual context on the perception of a visual target. In the main, these studies have concerned the spatial properties of the target. For example, physiological, computational, and psychophysical studies have delineated the ways in which visual context affects the detectability and the perceived position, orientation, color, or brightness of a target (Levitt & Lund, 1997; Polat, Mizobe, Pettet, Kasamatsu, & Norcia, 1998; Sengpiel, Sen, & Blakemore, 1997; Somers et al., 1998; Stemmler, Usher, & Niebur, 1995). However, much less is known about how the perceived temporal properties of a target can be affected by the visual context.
In the cross-modal domain, a number of studies have identified ways in which auditory stimuli can impact the number of times a visual target is seen to have flashed. A single transient flash accompanied by multiple beeps is often perceived as multiple flashes (Shams, Kamitani, & Shimojo, 2000); similarly, the temporal rate of a series of flashes is perceptually sped up by a series of beeps played at a higher rate (Gebhard & Mowbray, 1959; Myers, Cotton, & Hilp, 1981; Regan & Spekreijse, 1977; Shipley, 1964; Welch, DuttonHurt, & Warren, 1986). Given that such effects can be generated when target and context (inducer) are of different modalities, it ought to be possible to achieve a similar effect when target and inducer are of the same modality. After all, connectivity within the visual cortex is richer than the connectivity between the largely segregated auditory and visual systems.
Such a long-range visual–visual interaction has been described (Wilson & Singer, 1981) between visual stimuli positioned as far as 20° apart. In that study, observers were asked to report whether a target disk had been presented in a single steady flash or had flickered (flashed twice). They made significantly more errors when the number of flashes of the target disk did not match that of the distracter disk. In other words, observers were more likely to see a single flash as flickering if the distracter flashed twice, and they were more likely to see two flashes as a single steady flash if the distracter flashed once. Few studies have followed up on this finding (Leonards & Singer, 1997; Wilson, 1987), and thus the parametric properties and scope of the effect remain largely unknown. In particular, this intramodal interaction has not been examined in light of the theoretical and neurobiological implications raised by the more recent cross-modal studies.
The effect found by Wilson and Singer was framed as a variation in a target feature, namely, the presence or absence of flicker. The multisensory experiments, by contrast, have generally been concerned with how the target stimulus is segmented into perceptual tokens, namely, individual flashes. We adopt this flash-counting task to test the case in which target and inducer are of the same modality, i.e., vision. This allows us to directly compare the within- and cross-modal effects. Furthermore, the task offers a more graded report of the phenomenon, allowing us to directly examine the strength of the perceptual effect under various stimulus manipulations.
Investigations of the cross-modal illusion have yielded theoretical insights by offering cue integration models embedded in a Bayesian framework. In this framework, information from independent sensory channels is integrated in a Bayesian near-optimal manner (Alais & Burr, 2004), so that the final outcome is a weighted sum of the information from the independent sources, with each source weighted by its reliability (cue independence). Certain asymmetries in the data obtained by Wilson suggest that the visual–visual effect may not fit the model of cue independence; however, a pair of studies yielded opposite directions of effect (Wilson, 1987; Wilson & Singer, 1981). Using a graded report, the present study reconciles these findings and identifies models that promise to account for the visual–visual effect.
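For reference, the reliability-weighted integration invoked above is commonly written as follows for two independent cues. This is a minimal sketch of the standard formulation, assuming Gaussian noise on each cue; the symbols are illustrative and are not taken from the studies cited: \hat{s}_1 and \hat{s}_2 denote the single-cue estimates, and \sigma_1^2 and \sigma_2^2 denote their variances, so that reliability is the inverse variance.

```latex
\hat{s} = w_1 \hat{s}_1 + w_2 \hat{s}_2,
\qquad
w_1 = \frac{1/\sigma_1^2}{1/\sigma_1^2 + 1/\sigma_2^2},
\qquad
w_2 = 1 - w_1,
\qquad
\sigma^2 = \frac{\sigma_1^2 \, \sigma_2^2}{\sigma_1^2 + \sigma_2^2}
```

Under these assumptions, the more reliable cue dominates the combined estimate, and the combined variance \sigma^2 is never larger than that of either cue alone.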