Most events in our surroundings can be perceived through multiple sensory modalities. An approaching train, for instance, can be seen, heard, and possibly even felt through vibrations of the ground. These different sensory signals have to be combined into one coherent multimodal percept of the environment. An important cue that is thought to aid the correct combination of information across separate sensory channels is perceived synchrony (Spence & Squire,
2003). In particular, it has been shown that multimodal interactions are strongest when stimuli are presented simultaneously or close in time (within a temporal window of about 100 ms) and get weaker as the temporal discrepancy between stimuli increases (Bresciani, Dammeier, & Ernst,
2006; Fendrich & Corballis,
2001; Morein-Zamir, Soto-Faraco, & Kingstone,
2003; Shams, Kamitani, & Shimojo,
2002).
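The shape of this dependence can be made concrete with a simple weighting function. The sketch below is our own illustration rather than a model taken from the studies cited above; it assumes, purely for the sake of the example, that interaction strength falls off as a Gaussian of the temporal discrepancy, with a window parameter on the order of 100 ms.

```python
# Illustrative sketch (not a model from the cited studies): multimodal
# interaction strength as a Gaussian function of the temporal discrepancy
# between two stimuli. The window width is an assumed parameter.
import math

WINDOW_MS = 100.0  # assumed width of the temporal integration window

def interaction_strength(discrepancy_ms: float) -> float:
    """Relative interaction strength; 1.0 for physically synchronous stimuli."""
    return math.exp(-(discrepancy_ms ** 2) / (2 * WINDOW_MS ** 2))

for dt in (0, 50, 100, 200, 400):
    print(f"discrepancy {dt:>3} ms -> strength {interaction_strength(dt):.2f}")
```

On this toy weighting, stimuli 100 ms apart still interact at about 60% of full strength, whereas stimuli 400 ms apart effectively do not.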
As is most evident in the case of audio-visual stimuli, perceived synchrony and physical synchrony differ considerably: light and sound stimuli not only differ in their physical propagation velocity, but they also have different transduction and processing times (see, e.g., Allison, Matsumiya, Goff, & Goff, 1977; King, 2005; Spence & Squire, 2003). This difference is captured by measuring the asynchrony at which two stimuli are perceived as simultaneous, defined as the Point of Subjective Simultaneity (PSS).
It has been shown that the PSS for multimodal stimuli can differ significantly depending on the conditions of stimulation (e.g., Zampini, Shore, & Spence, 2003), and it is also affected by adaptation. Evidence for such flexibility has recently been provided by a host of studies reporting that the perception of temporal order and simultaneity can be recalibrated for audio-visual (Fujisaki, Shimojo, Kashino, & Nishida,
2004; Hanson, Heron, & Whitaker,
2008; Harrar & Harris,
2005,
2008; Heron, Whitaker, McGraw, & Horoshenkov,
2007; Vatakis, Navarra, Soto-Faraco, & Spence,
2007; Vroomen, Keetels, de Gelder, & Bertelson,
2004), audio-tactile (Hanson et al.,
2008), and visual-tactile (Hanson et al.,
2008; Keetels & Vroomen,
2008; Takahashi, Saiki, & Watanabe,
2008) stimuli. The basic paradigm in these studies is to measure observers' perception of cross-modal simultaneity before and after they have been exposed for several minutes to a constant temporal discrepancy between the stimuli in two modalities.
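To make the outcome measure concrete, the sketch below shows one way the PSS can be estimated in such a paradigm. The data and the choice of a Gaussian fit are purely illustrative and are not taken from the studies cited above: the proportion of “simultaneous” responses is modeled as a function of stimulus onset asynchrony (SOA), and the location of the fitted peak is taken as the PSS.

```python
# Hypothetical sketch of PSS estimation (made-up data, illustrative model):
# fit the proportion of "simultaneous" responses with a Gaussian of SOA;
# the fitted peak location is the Point of Subjective Simultaneity (PSS).
import numpy as np
from scipy.optimize import curve_fit

def gaussian(soa, amplitude, pss, width):
    """Proportion of 'simultaneous' responses as a function of SOA (ms)."""
    return amplitude * np.exp(-((soa - pss) ** 2) / (2 * width ** 2))

# SOAs in ms (negative: audio leads) with made-up response proportions
soas   = np.array([-300, -200, -100, 0, 100, 200, 300], dtype=float)
p_pre  = np.array([0.05, 0.20, 0.70, 0.95, 0.65, 0.15, 0.05])  # before exposure
p_post = np.array([0.05, 0.10, 0.45, 0.90, 0.85, 0.35, 0.05])  # after audio-lag exposure

(_, pss_pre, _),  _ = curve_fit(gaussian, soas, p_pre,  p0=(1.0, 0.0, 100.0))
(_, pss_post, _), _ = curve_fit(gaussian, soas, p_post, p0=(1.0, 0.0, 100.0))

print(f"PSS before exposure: {pss_pre:+6.1f} ms")
print(f"PSS after exposure:  {pss_post:+6.1f} ms (shift {pss_post - pss_pre:+.1f} ms)")
```

A shift of the post-exposure PSS toward the exposed lag, as in these made-up numbers, is the empirical signature of temporal recalibration.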
The repeated presentation of the same stimulus pair might increase participants' tendency to perceive the visual and the auditory signals as related. In other words, the repeated exposure increases the inferred probability that the two signals were produced by the same external event (“unity assumption,” Welch & Warren, 1980) and were therefore produced in synchrony. In that case, the asynchrony between the signals is likely due to some temporal bias (e.g., the difference in propagation velocity), and its effect should be reduced to a minimum. Consistent with this, results indicate that after exposure to asynchronous stimuli the perception of simultaneity changes such that the asynchrony is perceived as smaller. In other words, observers tend to perceptually realign the asynchronous stimuli during exposure. It is not clear, however, how such a recalibration is achieved.