Open Access
Article  |   January 2025
Visual information shows dominance in determining the magnitude of intentional binding for audiovisual outcomes
Author Affiliations
  • De-Wei Dai
    Department of Psychology, National Taiwan University, Taipei City, Taiwan
    [email protected]
  • Po-Jang (Brown) Hsieh
    Department of Psychology, National Taiwan University, Taipei City, Taiwan
    [email protected]
Journal of Vision January 2025, Vol.25, 7. doi:https://doi.org/10.1167/jov.25.1.7
Abstract

Intentional binding (IB) refers to the compression of subjective timing between a voluntary action and its outcome. In this study, we investigate the IB of a multimodal (audiovisual) outcome. We used a modified Libet clock while depicting a dynamic physical event (collision). Experiment 1 examined whether IB for the unimodal (auditory) event could be generalized to the multimodal (audiovisual) event, compared their magnitudes, and assessed whether the level of integration between modalities could affect IB. Planned contrasts (n = 42) showed significant IB effects for all types of events; the magnitude of IB was significantly weaker in both audiovisual integrated and audiovisual irrelevant conditions compared with auditory, with no difference between the integrated and irrelevant conditions. Experiment 2 separated the components of the audiovisual event to test the appropriate model describing the magnitude of IB in multimodal contexts. Planned contrasts (n = 42) showed the magnitude of IB was significantly weaker in both the audiovisual and visual conditions compared with the auditory condition, with no difference between the audiovisual and visual conditions. Additional Bayesian analysis provided moderate evidence supporting the equivalence between the two conditions. In conclusion, this study demonstrated that the IB phenomenon can be generalized to multimodal (audiovisual) sensory outcomes, and visual information shows dominance in determining the magnitude of IB for audiovisual events.

Introduction
Review of the intentional binding effect
The intentional binding (IB) effect is the compression of subjective timing between a voluntary action and its sensory outcome (Haggard, Clark, & Kalogeras, 2002). Past studies show that, on average, the subjective timing of a voluntary action accompanied by a sensory outcome is delayed compared with when the action occurs without any sensory outcome. Conversely, the subjective timing of a sensory event caused by one's voluntary action is advanced compared with when the same sensory event happens in isolation (for a detailed review, see Moore & Obhi, 2012). The IB phenomenon is thought to reflect the sense of agency—that is, the feeling of controlling one's actions and outcomes—because it is typically observed during voluntary actions with sensory consequences. However, some studies suggest that it is the perceived causal connection between an action and its outcome that is responsible for this effect (e.g., Buehner, 2012; Suzuki, Lush, Seth, & Roseboom, 2019). Although the role of intentionality versus perceived causality in relation to the IB effect has been intensively studied in the past, little consensus has been reached. In this research, our attention turned towards a less explored element: the perceptual attributes of the sensory outcome. 
Goal of the study
In the past, most studies of the IB phenomenon focused on unimodal sensory outcomes (e.g., auditory, visual, or tactile) and only used basic stimuli (e.g., a pure tone or a flash). However, in daily life, our actions trigger diverse and complex events, with information arising from different modalities. Whether the IB effect can be observed in a realistic context remains an open question. The current study sets out to answer the question: Can IB for unimodal (audio) outcomes be generalized to multimodal (audiovisual) outcomes? In this study, we use a modified Libet clock as the instrument to probe the subjective timing of the sensory outcome, while depicting a dynamic physical event (collision) that involves both auditory and visual modalities.
Experiment 1
Aims of experiment 1
Experiment 1 has two aims: first, to examine whether IB for the unimodal (auditory) event can be generalized to the multimodal (audiovisual) event, and to compare their magnitudes; and second, to examine whether the level of integration between modalities can affect IB. The integration of information from different modalities has been shown to influence time perception (Murai & Yotsumoto, 2018); however, its specific contribution to the IB effect remains uninvestigated. Experiment 1 presented two types of audiovisual events: one in which the auditory and visual information were temporally aligned to depict an integrated physical event and another where the information from the two modalities was presented in a parallel, unrelated manner. 
Design
Experiment 1 used a 2 × 3 repeated measures design. The first factor is action (passive/active), and the second factor is event (auditory/audiovisual integrated/audiovisual irrelevant). In the passive condition, participants passively waited for the sensory event and reported its onset time, which served as the baseline of subjective timing for different types of sensory events. In the active condition, participants triggered the sensory event by freely pressing a keyboard key. The variable of interest is the magnitude of the IB (for the outcome) effect for different types of sensory events. For each participant, the IB effect for each event type is calculated by subtracting the mean subjective timing in the passive condition from that in the active condition (IB = active − passive). A negative value indicates that the event was perceived as having occurred earlier when it was triggered by the keypress (i.e., outcome binding). 
Sample size determination
A power analysis was conducted to determine the required sample size. According to one meta-analysis, the average effect size for outcome binding is 0.769 for auditory stimuli and 0.44 for visual stimuli (Tanaka, Matsumoto, Hayashi, Takagi, & Kawabata, 2019). In the absence of parameters for audiovisual stimuli, we used the effect size for visual stimuli as a conservative estimate. The power was set at 0.85, and the power analysis indicated that at least 39 participants were needed. To fully counterbalance the experimental conditions, we decided to recruit 42 participants. Counterbalancing was realized with a 6 × 6 Latin square to statistically control the effect of block order (Keppel, 1991).
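To make this estimate reproducible, the sketch below uses statsmodels; because the original software and test settings are not reported above, we assume a one-sample t test on the outcome-binding contrast with a directional alternative, under which the required sample size comes out near the reported value of 39.

```python
# Minimal power-analysis sketch (assumption: one-sample t test on the
# outcome-binding contrast with a directional alternative; the original tool
# and settings are not reported in the text).
from math import ceil

from statsmodels.stats.power import TTestPower

effect_size = 0.44  # conservative estimate for visual outcome binding (Tanaka et al., 2019)

n_required = TTestPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,
    power=0.85,
    alternative="larger",   # assumption: one-sided test
)
print(ceil(n_required))     # approximately 39 under these assumptions
```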
Participants
All participants in this study were recruited through an online sign-up form. All participants reported normal or corrected-to-normal vision with no color deficit, normal auditory perception, and no neurological history. All participants provided informed consent before participation and received payment for their participation. Procedures were approved by the Research Ethics Committee of National Taiwan University (202208HM015). In Experiment 1, we recruited 43 participants. The data from one participant were excluded immediately after the experiment owing to poor attention and a lack of understanding of the task revealed during debriefing. To maintain our sample size, one additional participant was recruited. Consequently, the data from 42 participants (29 females; mean age = 21.6 ± 2.63 years; range, 18–29 years) were included in the data analysis.
Apparatus and stimuli
The experiment was conducted using the PsychoPy package in a Python environment. It was presented on a Windows PC with a 27-inch IPS monitor (AORUS AD27QD) with a resolution of 1,920 × 1,080 and a 60-Hz refresh rate. The stability of the frame rate was tested using a photodetector (EOT ET-2030) and a digital oscilloscope (Tektronix TBS1072C). The experiment script showed a timing latency of less than 2 ms per 1-second interval, based on visual inspection of the oscilloscope display.
Responses were collected through a wireless mouse and a 7-key keyboard (Owlab Voice Mini, USB Type-C; size, 111 × 61 × 27 mm). All keyboard keys except the response key were disabled by firmware. The sound stimuli were presented through a stereo speaker (Logitech Z150). The experiment was conducted in a dimly lit soundproof room; participants placed their head on a chin rest at a viewing distance of 60 cm.
We used a modified version of the Libet clock, consisting of 120 gray discs, each with a visual angle of approximately 0.7°. To minimize interference between the target visual stimuli and the instrument for subjective timing, the conventional clock hand was replaced with a red disc superimposed on one of the gray discs. The red disc updated its location to the next gray disc in every frame to create a smooth clockwise motion. The center fixation was a small white dot (0.1°). See Figure 1 for the layout of the Libet clock. 
Figure 1. Layout of the Libet clock. Note. A modified version of the Libet clock, consisting of 120 gray discs, each with a visual angle of approximately 0.7°. The center fixation is a small white dot (0.1°). The period of the rotating red disc was 2 seconds, and the radius of the clock was 14.5°. The red disc updates its location to the next gray disc in every frame to create a smooth clockwise motion.
The period of the rotating red disc was 2 seconds, and the radius of the clock was 14.5°. The relatively large radius serves to provide enough space to depict a dynamic physical event (collision). This design was justified based on previous studies, which showed that the length of the clock hand does not significantly influence the magnitude of IB (Ivanof, Terhune, Coyle, Gottero, & Moore, 2022) and that the radius of the clock does not significantly affect the timing for the awareness of intentions (Ivanof, Terhune, Coyle, & Moore, 2022). 
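To illustrate the clock geometry described above, the following PsychoPy sketch draws 120 gray discs on a 14.5° ring and steps a red disc one position per frame, which yields the 2-second rotation period at a 60-Hz refresh rate. This is our own simplified reconstruction under the stated parameters, not the authors' experiment script; the monitor settings are assumed values.

```python
# Simplified reconstruction of the modified Libet clock (our own sketch, not
# the authors' script): 120 gray discs on a 14.5 deg ring, a red disc stepping
# one position per frame so that one rotation takes 2 s at 60 Hz.
import math
import random

from psychopy import core, monitors, visual

mon = monitors.Monitor("sketchMonitor", width=60.0, distance=60.0)  # assumed screen width (cm)
mon.setSizePix((1920, 1080))
win = visual.Window(size=(1920, 1080), monitor=mon, units="deg", color="black")

N_POS, RADIUS = 120, 14.5
positions = [(RADIUS * math.sin(2 * math.pi * i / N_POS),
              RADIUS * math.cos(2 * math.pi * i / N_POS)) for i in range(N_POS)]

gray_discs = [visual.Circle(win, radius=0.35, pos=p, fillColor="gray", lineColor=None)
              for p in positions]
red_disc = visual.Circle(win, radius=0.35, fillColor="red", lineColor=None)
fixation = visual.Circle(win, radius=0.05, fillColor="white", lineColor=None)

start = random.randrange(N_POS)        # random starting position, as in a trial
for frame in range(2 * N_POS):         # two full rotations (4 s at 60 Hz)
    red_disc.pos = positions[(start + frame) % N_POS]
    for disc in gray_discs:
        disc.draw()
    red_disc.draw()
    fixation.draw()
    win.flip()

win.close()
core.quit()
```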
Experimental task
In the passive–auditory condition, participants passively awaited the target sound stimulus and reported its onset time. At the beginning of each trial, the clock face was displayed for 500 ms, and the red disc appeared at a random position (1 of the 120 gray discs), starting to rotate clockwise. After a pseudo-randomized period (2.0, 2.5, 3.0, or 3.5 seconds), a keypress sound (duration 137 ms) was played, indicating that the computer had acted to trigger the subsequent target event. At 250 ms after the onset of the keypress sound, a target collision sound (duration 393 ms) was played. After the onset of the target sound was the wash-out period; the red disc continued to rotate for a pseudo-randomized period (1.0, 1.5, 2.0, or 3.0 seconds) and then disappeared while the clock face remained on the screen for an additional 500 ms. Participants were instructed to fixate on the center fixation point throughout the presentation stage, covertly attend to the rotating red disc, and memorize the position of the red disc when they heard the target collision sound. After the presentation stage, there was a 500-ms blank period before the clock face was shown again. Participants then reported the location of the moving red disc by moving the cursor to select one of the gray discs. During the response stage, participants were told to relax and allowed to move their eyes freely. 
In the active–auditory condition, the sequence was similar, except the collision sound was triggered by the participants’ voluntary action. In each trial, participants were instructed to freely press a designated key to trigger the target collision sound at any chosen time, provided they allowed the red disc to complete at least one rotation. Participants were advised to press the key spontaneously, without relying on a predetermined position of the red discs or any other type of countdown strategy. The interval between the participants’ keypress and the target collision sound remained the same (250 ms) as in the passive condition. 
In the passive–audiovisual integrated condition, when the red disc appeared, two colored discs (yellow and cyan) also appeared at two of the four possible locations relative to the fixation: lower left, lower right, upper left, or upper right. The distance between each colored disc and the center fixation was 2.85°. After the red disc rotated for a pseudo-randomized period (the same as in the passive–auditory condition), two colored discs were launched toward the center fixation by the computer. The keypress sound was presented simultaneously with the launch action. After 250 ms, the two colored discs met at the center and bounced away upon contact (elastic collision). The target collision sound was presented simultaneously at the moment of contact. After the collision, two color discs changed trajectories and continued moving for 1,000 ms before disappearing (see Figure 2 for the schematic of the two color discs’ spatial dynamic). After that, there was a wash-out period (same as in the auditory condition). The participant's primary task in audiovisual integrated conditions was the same as in the auditory conditions, which was to report the position of the red disc when they heard the target collision sound. They were informed that the timing of the target sound and the timing of the two discs making contact on the screen were the same. The participants were also required to perform a “trajectory judgment (TJ) task.” During the presentation, they were instructed to focus on the center of the screen, covertly track the movements of two discs, and memorize their vanishing points. After reporting the onset of the target sound, they were randomly asked to report the final location of one of the colored discs. The clock face was divided into four quadrants (four pie-like sectors colored gray), and participants reported the location by selecting one of the sectors with a cursor. When the cursor was positioned in a specific sector, that sector was highlighted in the same color as the disc they were asked to report (see Figure 3 for the layout of the TJ task). The purpose of the TJ Task is to ensure that participants pay attention to the visual components rather than ignoring them. Participants were instructed to monitor the rotating red disc and the movements of two colored discs throughout the presentation without tracking the objects by moving their eyes. They were also informed that both tasks were equally important and that they should strive to balance their performance across tasks, rather than focusing exclusively on one. In the active–audiovisual integrated condition, two colored discs were launched by the participant's keypress, and the rest of the sequence was identical to that in the passive condition. 
Figure 2. Example of the spatial dynamic of the visual stimulus in the audiovisual conditions. Note. One example from the audiovisual (integrated) condition. When the red disc appeared, two colored discs (yellow and cyan) also appeared at two of the four possible locations relative to the fixation: lower left, lower right, upper left, and upper right. The distance between each colored disc and the center fixation was 2.85°. Two colored discs were launched toward the center fixation by the computer or the participant's key press. After 250 ms, the two colored discs met at the center and bounced away upon contact (elastic collision). See the Supplementary Material for the rest of the stimuli in the integrated and irrelevant conditions.
Figure 3. Trajectory judgment (TJ) task. Note. The participants were required to perform a TJ task in conditions with visual stimuli (two colored discs). During the presentation, they were instructed to focus on the center of the screen, covertly track the movements of the two discs, and memorize their vanishing points. After reporting the onset of the target sound, they were randomly asked to report the final location of one of the colored discs. The clock face was divided into four quadrants (four pie-like sectors colored gray), and participants reported the location by selecting one of the sectors with a cursor. When the cursor was positioned in a specific sector, that sector was highlighted in the same color as the disc they were asked to report.
The audiovisual irrelevant conditions were similar to the audiovisual integrated condition, except the two colored discs were launched in pseudo-randomized directions, making the visual component unrelated to the target collision sound. There are 12 combinations of the two discs' movements: in 10 of these combinations, the discs move in straight paths without making contact; in 2 of the combinations, the discs temporally overlap and appear to stream through each other (see supplementary material for the video demo URL). The target collision sound is presented 250 ms after the discs are launched. For both the integrated and irrelevant conditions, the speed of moving discs is constant for a given direction: 4 pixels/frame when moving vertically or horizontally and 5.66 pixels/frame when moving diagonally. The key distinction between the integrated and irrelevant conditions is that in the integrated conditions, the auditory and visual modalities are integrated to depict a collision event, whereas in the irrelevant conditions, although the computer or participant triggers the onset of the target sound and visual dynamics simultaneously, the information from the two modalities is not perceptually related or part of an integrated physical event. Comparing the integrated and irrelevant conditions allows us to investigate whether the level of integration between modalities impacts the IB phenomenon. 
Procedure
Participants were first presented with the clock face and two sound stimuli to familiarize themselves with the setup. After that, they were presented with six practice blocks: (passive–auditory), (active–auditory), (passive–audiovisual integrated), (active–audiovisual integrated), (passive–audiovisual irrelevant), and (active–audiovisual irrelevant). Each practice block included detailed text instructions on the screen, and further explanations were provided by the experimenter if needed. Each practice block contained four trials. During instruction, participants were also asked whether they could see the rotating red dot and differentiate between the yellow and cyan discs. No participants reported difficulty in this study. After the practice session, the main experiment began. This consisted of six blocks, and the order of the blocks was counterbalanced across participants. Before the start of each block, participants were reminded of the corresponding task again through text. Each block comprised 24 trials, with participants completing a total of 144 trials for the main experiment. Participants could take as much rest as they desired between blocks. They were informed that a camera would be streaming the experiment and that the experimenter would occasionally check it to ensure everything was proceeding correctly. The entire experiment lasted about an hour. 
Analysis
Criteria for the valid trials
The rotation period of the red disc was 2 seconds, allowing for a response error ranging from −1 to +1 seconds. Owing to the circular nature of the clock, responses closer to this limit are more likely to produce an outcome opposite to the participant's subjective timing. For instance, if the red disc is at the 3 o'clock position when the target sound plays and the participant reports around the 10 o'clock position, one interpretation could be that the participant's subjective timing is advanced by approximately 800 ms. Another interpretation might be that it is delayed by around 1200 ms. To avoid such ambiguous data, we set a threshold where the absolute value of a subject's subjective timing error must not exceed 750 ms. Trials exceeding this criterion are considered possibly unrepresentative of the participants' subjective timing and are therefore excluded from the analysis. 
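To make the circular correction concrete, the sketch below (our own illustration, assuming responses are stored as clock positions 0–119 with position 0 at 12 o'clock) converts a reported position into a signed timing error and applies the 750-ms validity threshold.

```python
# Our own illustration of the circular timing-error computation (assumption:
# responses are stored as clock positions 0-119, position 0 at 12 o'clock).
MS_PER_POSITION = 2000 / 120            # 2-s rotation divided over 120 positions

def timing_error_ms(reported_pos: int, true_pos: int) -> float:
    """Signed error in ms, wrapped to the shorter arc (range -1000..+1000)."""
    diff = (reported_pos - true_pos) % 120
    if diff > 60:                       # take the shorter way around the clock
        diff -= 120
    return diff * MS_PER_POSITION

def is_valid(error_ms: float, threshold_ms: float = 750.0) -> bool:
    """Keep a trial only if the absolute timing error does not exceed 750 ms."""
    return abs(error_ms) <= threshold_ms

# The ambiguous example from the text: disc near 3 o'clock (index 30) at sound
# onset, response near 10 o'clock (index 100) -> roughly -833 ms, excluded.
err = timing_error_ms(100, 30)
print(round(err), is_valid(err))        # -833 False
```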
Filtering outlier trials
To remove trials with extreme values that might distort the mean estimation of subjective timing, we excluded responses for each participant that exceeded 2.5 standard deviations within each treatment combination (2 × 3). Previous studies have used such procedures and criteria (Ruess et al., 2017, Ruess et al., 2018). 
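A minimal pandas sketch of this filter, assuming a long-format trial table with columns `subject`, `action`, `event`, and `error_ms` (the column names are our own):

```python
import pandas as pd

def remove_outliers(trials: pd.DataFrame, k: float = 2.5) -> pd.DataFrame:
    """Drop trials more than k SDs from the mean of each subject x action x event cell."""
    def keep_within_k_sd(cell: pd.DataFrame) -> pd.DataFrame:
        z = (cell["error_ms"] - cell["error_ms"].mean()) / cell["error_ms"].std()
        return cell[z.abs() <= k]
    return (trials.groupby(["subject", "action", "event"], group_keys=False)
                  .apply(keep_within_k_sd))
```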
Mean subjective timing error
After rejecting invalid trials and filtering outliers, the average subjective timing error for each participant within each experimental condition is calculated to yield the mean estimated error. A positive value in a given condition indicates that, on average, the participant's subjective timing for the target event is delayed, whereas a negative value indicates it is advanced. Consequently, each participant provides six mean subjective timing errors, one for each condition (active/passive × auditory/audiovisual integrated/audiovisual irrelevant). These data points will be further transformed into several contrast variables to test specific hypotheses.
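Continuing the sketch above, the per-participant cell means and the IB contrast variables could be derived as follows (same hypothetical column names).

```python
# Mean subjective timing error per subject x event x action cell, then
# IB = active - passive for each event type (negative = outcome binding).
cell_means = (trials.groupby(["subject", "event", "action"])["error_ms"]
                    .mean()
                    .unstack("action"))           # columns: 'active', 'passive'
ib = (cell_means["active"] - cell_means["passive"]).unstack("event")
# `ib` has one row per subject and one column of IB values per event type.
```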
Planned contrasts
Two sets of planned contrasts, six in total, will be performed after data collection. The first set tests for a significant IB effect for the three types of events. The contrast variable, IB, is calculated by subtracting the mean subjective timing error in the passive condition from that in the active condition (IB = active − passive) for each type of event.
The second set of contrasts compares the magnitude of IB across the three types of events. This is achieved by subtracting the IB values (from the first set of contrasts) between events, yielding three contrasts: auditory IB – audiovisual integrated IB, auditory IB – audiovisual irrelevant IB, and audiovisual integrated IB – audiovisual irrelevant IB. The first two contrasts test whether there is a difference in IB magnitude between auditory and audiovisual events; the third contrast assesses whether the level of integration between auditory and visual information impacts IB. 
Thus, six separate one-sample t-tests (df = 41, two-tailed) will be performed, with a significance level (α) of .05. These planned contrasts were built based on specific hypotheses; therefore, an omnibus ANOVA was not conducted, as it would provide little insight into our research question. Although these contrasts are not mutually orthogonal, they are designed to test a small set of hypotheses rather than to perform post hoc comparisons based on the experimental results. Correction for multiple comparisons is deemed unnecessary and might incorrectly ignore meaningful differences (Keppel, 1991). However, given the general concern regarding inflated type I errors in the literature, we provide both original and adjusted p values using the Holm–Bonferroni method for the readers' consideration.
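As an illustration of the two contrast sets, the sketch below runs the one-sample t tests with SciPy and applies the Holm–Bonferroni adjustment with statsmodels; it assumes the hypothetical `ib` table from the previous sketch, with one column per event type (condition labels are our own placeholders).

```python
from scipy import stats
from statsmodels.stats.multitest import multipletests

# First set: test whether IB differs from zero for each event type.
set1 = {event: stats.ttest_1samp(ib[event], 0.0) for event in ib.columns}

# Second set: pairwise differences in IB magnitude between event types
# (labels are our own placeholders for the three conditions).
pairs = [("auditory", "av_integrated"),
         ("auditory", "av_irrelevant"),
         ("av_integrated", "av_irrelevant")]
set2 = {f"{a} - {b}": stats.ttest_1samp(ib[a] - ib[b], 0.0) for a, b in pairs}

# Holm-Bonferroni adjustment within each family of three contrasts.
p_holm = multipletests([res.pvalue for res in set2.values()], method="holm")[1]
```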
Pilot
A pilot with only the auditory condition was conducted before the main experiment to test the validity of the experimental apparatus. Two sound stimuli were used in this pilot, one being the ‘collision’ sound used in this study and the other being a ‘whoosh’ sound effect, which was tested together for another experiment unrelated to the current study. The pilot consisted of two blocks, one for the active trials and the other for the passive trials, and the order of blocks was counterbalanced between participants. The two types of sound stimuli were presented randomly within each block. For this pilot, 20 participants were recruited. A significant IB effect was detected regardless of whether the ‘whoosh’ trials were included, auditory IB (all): t(19) = −8.574, p < 0.001, M = −136 ms, Cohen's d = −1.917; auditory IB (collision sound only): t(19) = −8.124, p < 0.001, M = −136 ms, Cohen's d = −1.817 (see Table 1 for summary statistics). The results of the pilot experiment indicate that the modified Libet clock in the study is suitable for measuring the IB effect.
Table 1. Pilot summary table.
Experiment 1 results
TJ task performance
In the audiovisual conditions, participants were randomly asked to report the final location of one of the colored discs in the TJ task. All participants performed well, achieving a mean accuracy of 0.955 (minimum = 0.733, SD = 0.066) in the audiovisual integrated condition and 0.907 (minimum = 0.675, SD = 0.081) in the audiovisual irrelevant condition. All participants' accuracies were above chance level. Note that the chance level was conditional: in the audiovisual integrated condition, with only two possible vanishing points, the chance level was 50%; in the audiovisual irrelevant condition, where the discs had three possible vanishing points, the chance level was 33%. The chance level is defined this way under the assumption that participants remember the initial positions of the two colored discs. We did not remove trials with incorrect TJ task responses; because no such task existed in the auditory condition, removing these trials would result in fewer trials in the audiovisual conditions, thereby making the mean estimates less reliable and the comparisons between conditions less valid from a psychometric perspective. The primary purpose of the TJ task was to ensure that participants paid attention to the visual component of the target event rather than ignoring it.
IB for different types of events
Planned contrasts (n = 42) showed that for all types of events, there were significant IB effects, auditory IB: t(41) = −10.291, p < 0.001, M = −138 ms, Cohen's d = −1.588; audiovisual integrated IB: t(41) = −7.083, p < 0.001, M = −97 ms, Cohen's d = −1.093; audiovisual irrelevant IB: t(41) = −4.550, p < 0.001, M = −88 ms, Cohen's d = −0.702. The conclusion for all three types of events is consistent after the p value is adjusted for comparing a family of three with the Holm–Bonferroni method (all p holm < 0.001). 
For Experiment 1, approximately 4.5% of trials were excluded from the analysis because of the valid trial criteria and outlier filter (see Table 2 for summary statistics and Figure 4 for the descriptive plot). 
Table 2. Experiment 1 summary table.
Figure 4. Experiment 1 mean subjective timing error plot. Note. A positive value indicates participants perceived the sensory event later than its physical onset, and a negative value indicates participants perceived the sensory event earlier than its physical onset.
Difference in IB between events
The second set of contrasts (n = 42) revealed that the magnitude of IB was significantly weaker in both the audiovisual integrated and audiovisual irrelevant conditions compared with the auditory condition, auditory IB – audiovisual integrated IB: t(41) = 2.527, p = 0.015, M = 42 ms, Cohen's d = 0.39; auditory IB – audiovisual irrelevant IB: t(41) = 2.329, p = 0.025, M = 50 ms, Cohen's d = 0.359. The difference in IB magnitude was not significant between the audiovisual integrated and irrelevant conditions, audiovisual integrated IB – audiovisual irrelevant IB: t(41) = 0.541, p = 0.591, M = 9 ms, Cohen's d = 0.084. Note that, in these three comparisons, the difference values are converted to positive numbers, representing the difference in the magnitude of IB more intuitively. The difference between the auditory and the other conditions is only marginally significant after the p value adjustment for a family of three with the Holm–Bonferroni method, auditory IB – audiovisual integrated IB: p holm = 0.046; auditory IB – audiovisual irrelevant IB: p holm = 0.050; audiovisual integrated IB – audiovisual irrelevant IB: p holm = 0.591. Given the large effect size for the difference in IB between the auditory and the two audiovisual conditions, we conclude that the observed difference represents a meaningful effect (see Figure 5 for the descriptive plot).
Figure 5. Experiment 1 IB magnitude plot. Note. The magnitude of binding across different types of events. Error bars represent the 95% confidence interval; all events showed significant IB. Note that these values represent the difference in mean subjective timing between the active and passive conditions; the 0 value does not represent the physical onset time of the sensory event but the mean subjective timing in the passive condition within each type of event.
Discussion
In Experiment 1, we replicated the classical IB effect for auditory events. Additionally, we observed a significant IB effect for audiovisual events. These findings suggest that the IB phenomenon is not limited to unimodal events with basic features; it also extends to multimodal events that more closely resemble everyday experiences, thereby enhancing the ecological validity of the IB effect. Second, we found no evidence that the level of integration between auditory and visual information modulates the magnitude of IB. We also discovered that the IB magnitude for audiovisual events is weaker compared with the auditory event. In the next experiment, we further investigated the possible reasons behind such attenuation. 
Experiment 2
Aims of experiment 2
In Experiment 1, we observed that the magnitude of IB is weaker for audiovisual events compared with auditory events alone. This finding suggests several possibilities. First, the IB for audiovisual events may represent an average of the IB effects observed separately in auditory and visual modalities. A meta-analysis indicates that studies using visual stimuli generally report weaker IB effects compared with those using auditory stimuli (Tanaka et al., 2019). Additionally, another study comparing the magnitude of IB for auditory and visual events in a controlled experimental setup yielded similar results (Ruess et al., 2018). Therefore, the weaker IB in the audiovisual condition compared with the auditory condition might result from averaging the IB effects for the auditory and visual components. Second, the magnitude of IB in audiovisual events might be determined by the visual information (i.e., the visual dominance effect). Past research has demonstrated that when presented with audiovisual stimuli, participants exhibit a behavioral preference for the visual component (Colavita, 1974; Posner, Nissen, & Klein, 1976; Sinnett, Spence, & Soto-Faraco, 2007). Additionally, previous research has demonstrated that auditory information tends to dominate in temporal tasks, while visual information is more dominant in spatial tasks (see Repp & Penel, 2002, for a review). Given that our task uses a Libet clock, which has a spatial component, the observed attenuation therefore could be due to the process responsible for producing temporal compression also prioritizing visual information. Note that the traditional Libet clock is a visual timer that heavily relies on spatial localization; recently, several kinds of timers for other modalities (e.g., tactile and auditory) have been proposed (Cornelio Martinez, Maggioni, Hornbæk, Obrist, & Subramanian, 2018; Muth, Wirth, & Kunde, 2021); therefore, it is worth investigating whether the patterns observed in our study are consistent across different types of instruments or specific to the visual timer. 
Third, the weaker IB in audiovisual conditions might be due to cross-modality attenuation. This means that, when the sensory outcome includes information from different modalities, the magnitude of IB becomes weaker than in unimodal scenarios. 
In Experiment 2, we separated the components of the audiovisual event to discern these competing hypotheses. 
The first hypothesis states that the IB for the audiovisual event results from averaging the IB effects observed in the auditory and visual events. The prediction is that the auditory event should show the strongest IB, the visual event should show the weakest IB, and the IB effect for the audiovisual event should be somewhere in the middle. Specifically, the audiovisual IB should be significantly smaller than the auditory IB, replicating results from Experiment 1, but larger than the visual IB. 
The second hypothesis (visual dominance) predicts that the audiovisual and visual events will have the same magnitude of IB, and the auditory IB should be larger than the other two conditions. 
The third hypothesis states that the magnitude of the IB is attenuated in multimodal contexts. It predicts that the IB in the audiovisual condition will be weaker than in both the auditory and visual conditions. However, this hypothesis does not predict a difference in IB magnitude between the auditory and visual events.
For the visualization of the three hypotheses, see Figure 6.
Figure 6. Hypothetical data of Experiment 2 results.
Methods
Design and participants
The design of Experiment 2 was identical to that of Experiment 1 (2 × 3 repeated measures), except that the audiovisual irrelevant condition was replaced with a visual condition, yielding the event factor (auditory/audiovisual/visual); the audiovisual condition was identical to the audiovisual integrated condition in Experiment 1. The same number of participants (n = 42; 21 females; mean age = 20.62 ± 1.67 years; range, 18–27 years) was recruited. All participants had normal or corrected-to-normal vision.
Apparatus and stimuli
The experimental setup was identical to that of Experiment 1. In the new visual condition, the visual component of the event was the same as in the audiovisual condition, where two colored discs were launched toward the center fixation point to depict a collision event. However, the collision sound was no longer presented; participants instead reported the position of the red disc when they saw the target collision. Note that, in the passive visual condition, a keypress sound was still presented simultaneously with the launch action.
Experimental task and procedure
The experimental task was the same as in Experiment 1, where participants reported the subjective timing of the target event that was either triggered by the computer (passive) or their own action (active). Importantly, in the visual condition, participants were instructed to fixate on the center fixation, covertly attend to the two colored discs, and memorize the position of the rotating red dot when they saw the discs collide at the center. In both the audiovisual and visual conditions, participants also performed the TJ task. The instructions were the same as in Experiment 1: participants were told to fixate on the center during the presentation and covertly attend to the rotating red discs and the two colored discs. In the active conditions, they were advised not to use any countdown strategies and to balance their performance between the event timing task and the TJ task. 
Analysis
Similar to Experiment 1, two sets (a total of six) of planned contrasts will be performed after data collection. The first set tests for a significant IB effect for three types of events, and the second set of contrasts compares the magnitude of IB across the three types of events. The criteria for the valid trials and the outlier filter were identical to the previous procedure. 
Results
TJ task performance
All participants performed well, achieving a mean accuracy of 0.976 (minimum = 0.913, SD = 0.023) in the audiovisual condition and 0.967 (minimum = 0.833, SD = 0.036) in the visual condition. All participants' accuracies were above chance level.
IB for different types of events
Planned contrasts (n = 42) showed that for all types of events, there were significant IB effects, auditory IB: t(41) = −9.952, p < 0.001, M = −118 ms, Cohen's d = −1.536; audiovisual IB: t(41) = −4.240, p < 0.001, M = −69 ms, Cohen's d = −0.654; visual IB: t(41) = −3.283, p = 0.002, M = −55 ms, Cohen's d = −0.507 (see Table 3 for summary statistics and Figure 7 for the descriptive plot). The conclusion for all three types of events is consistent after the p value is adjusted for comparing a family of three with the Holm–Bonferroni method, auditory IB: p holm < 0.001; audiovisual IB: p holm < 0.001; visual IB: p holm = 0.002. For Experiment 2, approximately 3% of trials were excluded from the analysis because of the valid trial criteria and outlier filter. 
Table 3. Experiment 2 summary table.
Table 4. LMM model terms table. TJ: accuracy of the trajectory judgment (TJ) task. For M6 and M7, the data include only the audiovisual conditions; the auditory condition was excluded.
Table 5. LMM model comparison tables. Note. Analysis was conducted with the lme4 package in R; models were refit with ML (instead of REML) for the comparisons.
Figure 7. Experiment 2 mean subjective timing error plot. Note. A positive value indicates participants perceived the sensory event later than its physical onset, and a negative value indicates participants perceived the sensory event earlier than its physical onset.
Difference in IB between events
The second set of contrasts (n = 42) revealed that the magnitude of IB was significantly weaker in both the audiovisual and visual conditions compared with the auditory condition, auditory IB – audiovisual IB: t(41) = 2.704, p = 0.010, M = 49 ms, Cohen's d = 0.417; auditory IB – visual IB: t(41) = 3.097, p = 0.004, M = 63 ms, Cohen's d = 0.478. The difference in IB magnitude was not significant between the audiovisual and visual conditions, audiovisual IB – visual IB: t(41) = 0.788, p = 0.435, M = 14 ms, Cohen's d = 0.122. Note that, in these three comparisons, the difference values are converted to positive numbers, representing the difference in the magnitude of IB more intuitively. The conclusion for the difference in IB between events is consistent after the p value is adjusted for comparing a family of three with the Holm–Bonferroni method, auditory IB – audiovisual IB: p holm = 0.020; auditory IB – visual IB: p holm = 0.011; audiovisual IB – visual IB: p holm = 0.435 (see Figure 8 for the descriptive plot).
Figure 8. Experiment 2 IB magnitude plot. Note. The magnitude of binding across different types of events. Error bars represent the 95% confidence interval; all events showed significant IB. Note that these values represent the difference in mean subjective timing between the active and passive conditions; the 0 value does not represent the physical onset time of the sensory event but the mean subjective timing in the passive condition within each type of event. n.s., not significant.
Bayesian analysis for the evidence of equivalence
In Experiment 2, one of the hypotheses we want to test involves a statement of equivalence between two conditions: the visual dominance hypothesis states that the magnitude of IB for the audiovisual and visual conditions should be the same. Under the null hypothesis significance testing framework, it is problematic to claim that two conditions are the same based on nonsignificant results. Therefore, we conducted an additional Bayesian analysis; under the Bayesian framework, we can evaluate the evidence for and against the null hypothesis in terms of the Bayes factor. We conducted a Bayesian t test with JASP software. The null hypothesis is that there are no differences in the IB magnitude between the audiovisual and the visual condition; the alternative hypothesis is that there are differences in the IB magnitude between the two conditions. The results showed moderate evidence in favor of the null hypothesis, BF01 = 4.48 (default Cauchy prior, scale = 0.707). 
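The reported Bayes factor can also be approximated outside JASP; the sketch below uses pingouin's paired t test, which computes a default JZS Bayes factor with a Cauchy prior (r = 0.707), and takes its reciprocal to express evidence for the null. This is our own illustration with hypothetical column names, not the authors' analysis script.

```python
import pingouin as pg

# Paired Bayesian comparison of IB magnitude between the audiovisual and the
# visual condition (columns of the hypothetical `ib` table from Experiment 2).
res = pg.ttest(ib["audiovisual"], ib["visual"], paired=True)
bf10 = float(res["BF10"].iloc[0])   # evidence for a difference
bf01 = 1.0 / bf10                   # evidence for equivalence (reported value: ~4.48)
print(bf01)
```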
Discussion
In Experiment 2, we successfully replicated the IB effect for auditory and audiovisual events observed in Experiment 1. We also replicated the weaker IB effect for audiovisual events compared with auditory events. Furthermore, we tested which model could appropriately describe the IB effect in an audiovisual context. Given that the audiovisual IB – visual IB contrast was nonsignificant and that the additional Bayesian analysis provided moderate evidence for the equivalence between the two conditions, we conclude that the experimental results support the visual dominance hypothesis, namely, that the visual component determines the magnitude of IB for audiovisual outcomes.
General discussion
Summary of the study
The present study demonstrated that the IB effect can be generalized to multimodal events. Our example involved a dynamic physical event—a collision between two colored discs—that incorporated information from both auditory and visual modalities. This finding enhances the ecological validity of the IB phenomenon and supports the idea that IB can be observed not only with basic unimodal stimuli, but also with complex sensory events containing multimodal information, similar to our daily experiences. We also found no evidence that the level of integration—specifically, whether the auditory and visual information concurrently depicted a coherent physical event—could modulate the magnitude of IB. However, we emphasize that this result should not be interpreted as indicating that the IB effect is the same regardless of integration. Rather, we do not have enough information to make a decisive inference under the null hypothesis significance testing framework. For instance, one could view the absence of a difference between the two conditions as indicating a lack of true integration in the integrated condition, or suggest that the irrelevant condition remains integrated. We also acknowledge that several factors were not controlled between the integrated and irrelevant conditions, such as the predictability of the discs' movements and their spatial dynamics. More flexible methods, like Bayesian analysis or modeling approaches, are needed to determine the effect of sensory integration on IB. 
Additionally, the study tested possible models to describe the IB effect in audiovisual events, and the results showed moderate evidence supporting the visual dominance hypothesis. Importantly, the current conclusion is based on only one type of event (collision). Whether the visual modality is prioritized generally in determining the magnitude of IB for audiovisual events requires further investigation. 
Alternative explanations of the results
The goal of the study was to manipulate the modalities of the sensory outcome and compare their effects on the IB. While doing this, some factors might covary; therefore, possible explanations must be discussed to evaluate the strength of the study's conclusions. In this section, we examine several factors that might contribute to the observed differences in IB. 
Differences in cognitive load and mental effort
The key difference between the auditory condition and the others is that, in every condition where the visual component (collision) is presented, participants perform a dual task: one task involves subjective timing and the other involves TJ. The dual task is inherently more challenging, which may diminish the sense of agency and thereby reduce IB. The current study did not survey the participants' perceived difficulty and sense of agency across conditions; therefore, whether the difference in IB could be attributed to variations in cognitive load warrants additional research. 
Differences in the visual attention on the clock hand
Across the two experiments in this study, we observed that, whenever the target event involves visual information (the collision of two colored discs), the IB effect is weaker compared with the auditory condition. In the audiovisual/visual conditions, participants are required to attend covertly to both the movements of the two colored discs at the center and the rotating red disc at the periphery, thereby dividing the resources of visual attention. Consequently, in these conditions, the rotating red disc is less attended to compared with the auditory condition, which might explain the weaker IB effect. Our response to this concern is that the amount of attention deployed on the red disc should, in principle, affect the accuracy of the estimate, but not the bias of the estimate. In other words, the amount of visual attention allocated to the rotating red disc should negatively correlate with response variability (RV) within a condition for each participant: the less attention on the rotating red disc, the noisier the response becomes. It is unclear how the signal strength or the certainty of the position of the red disc would not only influence the accuracy of its location estimate, but also change the amount of bias (mean shift) during voluntary action. Another way to probe the relationship between visual attention and the magnitude of IB is to investigate whether individual differences in the accuracy of the TJ task in the audiovisual and visual conditions negatively correlate with IB. The rationale is that higher TJ task accuracy represents a greater allocation of resources to the central field of the clock face and, therefore, less attention deployed to the peripheral rotation of the red disc. The potential moderating effects of RV and TJ task accuracy are scrutinized in the exploratory data analysis presented below.
Motion
The animation of the collision in the audiovisual and visual conditions invokes the perception of motion in addition to the rotating red disc, whereas in the auditory condition there are no additional visual dynamics. Past studies showed that the perception of motion alters time perception (Brown, 1995) and that audiovisual simultaneity judgments are modulated by visual dynamics (Fouriezos et al., 2007). Although this effect may be consistent for the passively observed and the self-caused event, there is a chance that the perception of motion influences IB through other mechanisms. For instance, because the speed of the moving discs remains constant in a given direction, this could provide extra predictive cues, allowing participants to infer timing based on the distance traveled rather than depending solely on the actual collision event. Additionally, it would be interesting to explore whether changing the speed of the target motion would affect the IB.
Expectation and predictability of the visual outcome
One critical difference between the auditory condition and the rest is the absence of continuous cues linking action and outcome. In the audiovisual conditions, where the collision of two colored discs is anticipated, this expectation might modulate the amount of temporal shift in the active condition. However, one could argue that the results of Experiment 1 do not support this interpretation: there were no significant differences in the IB effect between the audiovisual integrated and audiovisual irrelevant conditions, and because the two colored discs moved randomly in the irrelevant condition, they did not provide reliable information about the onset time of the target collision sound. Therefore, the attenuation of the IB effect after adding a visual component should not be attributed to the anticipatory effect provided by a persistent visual cue between the action and sensory outcome. Nevertheless, whether the IB effect for the auditory outcome would also be attenuated if the gap between action and outcome were filled requires further investigation.
Affordance
Affordance refers to the potential actions that an object or environment enables or suggests to an agent (Gibson, 1979). In the audiovisual and visual conditions, two colored discs were presented to the participant, waiting to be launched by the keypress; the visual cue might provide more perceived affordance, thereby changing the participant's motivation or eliciting stronger motor preparation for the keypress action. However, using affordance to explain the observed difference is somewhat counterintuitive, because higher affordance should result in a more intuitive and fluent interaction between the agent and the object, therefore predicting a higher sense of agency and a stronger IB effect. Our data show the opposite result: the conditions with greater affordance (audiovisual/visual) have a weaker IB effect. How the affordance of the stimuli and of the apparatus (Libet clock) itself modulates the IB effect points to another important research direction.
Preparedness
Another possible factor to explain the result is the difference in decision time for action execution. The greater affordance in the audiovisual and visual conditions may initiate stronger and more fluent motor preparation, resulting in quicker key presses in the active condition. In contrast, the dual task requirement might also impose a higher cognitive load, therefore delaying the timing of the key press. In either scenario, differences in the average time from trial onset to the participant's keypress in the active condition (i.e., the preparedness effect) might contribute to the differences in the IB. In both experiments, the time duration from the beginning of the trial to the participants' keypress was recorded. Therefore, it is possible to test whether preparation time (PT) correlates with the IB magnitude. 
Exploratory data analysis with the linear mixed effect model
The effect of RV and PT
To scrutinize the possibility that the difference in IB between events is driven by differences in RV and PT, an exploratory data analysis was conducted. A model comparison approach was adopted to determine whether the type of event still has significant explanatory power over IB after two factors are controlled: RV and PT. A participant's RV is defined as the standard deviation of the subjective timing error within a specific type of event (across active and passive conditions). A participant's PT for each event is the mean duration from the start of the trial to the key press in the active condition. A model M0 with RV as a fixed effect factor and subject as the random effect cluster was compared against a model M1 with an additional fixed effect factor, event; likelihood ratio tests showed that M1 was a better fit for the data after including event in the model terms. A model M2 with PT as a fixed effect factor and subject as the random effect cluster was compared against a model M3 with an additional fixed effect factor, event, and likelihood ratio tests showed that M3 was a better fit to the data after including event in the model terms. A model M4 with RV and PT as fixed effect factors and subject as the random effect cluster was compared against a model M5 with an additional fixed effect factor, event, and likelihood ratio tests showed that M5 was a better fit to the data after including event in the model terms. Overall, the analysis showed that, for Experiments 1 and 2, the event factor has significant explanatory power over IB after considering RV and PT individually and together (see Table 4 for the model term details and Table 5 for the model comparison summaries).
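The original model comparisons were run with lme4 in R; as a rough Python analogue, the sketch below fits the M4 versus M5 comparison with statsmodels' MixedLM and performs the likelihood ratio test by hand. The table name `d` and its columns (`IB`, `RV`, `PT`, `event`, `subject`) are our own assumptions about the data layout.

```python
import statsmodels.formula.api as smf
from scipy import stats

def likelihood_ratio_test(smaller, larger, df_diff):
    """Chi-square LRT between two nested mixed models fitted with ML (reml=False)."""
    lr_stat = 2 * (larger.llf - smaller.llf)
    return lr_stat, stats.chi2.sf(lr_stat, df_diff)

# M4: RV and PT as fixed effects, subject as the random-effect cluster.
m4 = smf.mixedlm("IB ~ RV + PT", d, groups=d["subject"]).fit(reml=False)
# M5: adds the event factor (3 levels -> 2 additional fixed-effect terms).
m5 = smf.mixedlm("IB ~ RV + PT + event", d, groups=d["subject"]).fit(reml=False)

print(likelihood_ratio_test(m4, m5, df_diff=2))
```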
The correlation between TJ task accuracy and IB
The relationship between TJ task accuracy and IB was analyzed separately because there are no TJ tasks in the auditory condition. A null model M6 with only the subject as a random effect cluster is compared against an M7 model with an additional fixed effect factor TJ to determine whether the accuracy of the TJ task covaries with the magnitude of the IB within the audiovisual conditions. The likelihood ratio test was not significant, Experiment 1 p = 0.9036, Experiment 2 p = 0.3353. Overall, there was no detectable relationship between the performance of the TJ task and IB magnitude. The results suggested that we currently have no evidence to support the notion that the observed difference between the auditory and audiovisual conditions was due to differences in the amount of visual attention deployed on the periphery. 
Implications of the study
The present study provides preliminary evidence that the magnitude of the IB effect is sensitive to the modality and the perceptual characteristics of the sensory outcome. This result impacts the convention of using IB as an implicit measure of the sense of agency (Moore, 2016). In an applied research context, a product developer might want to use IB as an objective index of users’ perceived control over the product. However, given that different modalities might influence the magnitude of IB, the product developer could draw incorrect conclusions about the user experience based on differences in IB when, in reality, these differences reflect the amount of information or the saliency of different modalities. In light of this finding, future researchers should be cautious about the modalities involved in sensory outcomes when using the IB to infer the sense of agency. 
Future directions
In this study, we investigated the IB of a specific multimodal (audiovisual) combination with a particular sensory event (collision). The idea that IB for audiovisual events is determined by visual information should be tested with other combinations of modalities and various sensory events. We recommend that future studies carefully control the several factors previously mentioned (e.g., motion, predictability, cognitive load) that might covary while manipulating outcome modalities. 
Second, we did not measure the participants' sense of agency across conditions. However, because the participants were informed that they had total control over the sensory outcomes and the action–outcome interval was fixed throughout the experiment, we speculate that explicit reports of agency would show little or no difference across conditions and would therefore dissociate from the IB measurement. Future experiments should test this possible dissociation to evaluate the validity of IB as an implicit measure of agency. 
Acknowledgments
Funding was provided by the Yushan Fellow Program of the Ministry of Education (MOE) NTU-113V2013-1, the National Science and Technology Council (NSTC 113-2628-H-002-004-), and National Taiwan University (111L9A00701). 
Commercial relationships: none. 
Corresponding author: Po-Jang (Brown) Hsieh. 
Address: Department of Psychology, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., South Building, Room 314, Taipei City 10617, Taiwan (R.O.C.). 
References
Brown, S. W. (1995). Time, change, and motion: The effects of stimulus movement on temporal perception. Perception & Psychophysics, 57(1), 105–116, https://doi.org/10.3758/BF03211853. [PubMed]
Buehner, M. J. (2012). Understanding the past, predicting the future: Causation, not intentional action, is the root of temporal binding. Psychological Science, 23(12), 1490–1497, https://doi.org/10.1177/0956797612444612. [PubMed]
Colavita, F. B. (1974). Human sensory dominance. Perception & Psychophysics, 16(2), 409–412, https://doi.org/10.3758/BF03203962.
Cornelio Martinez, P. I., Maggioni, E., Hornbæk, K., Obrist, M., & Subramanian, S. (2018). Beyond the Libet clock: Modality variants for agency measurements. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–14, https://doi.org/10.1145/3173574.3174115.
Fouriezos, G., Capstick, G., Monette, F., Bellemare, C., Parkinson, M., & Dumoulin, A. (2007). Judgments of synchrony between auditory and moving or still visual stimuli. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale, 61(4), 277–292, https://doi.org/10.1037/cjep2007028. [PubMed]
Gibson, J. J. (1979). The ecological approach to visual perception (pp. xiv, 332). Boston: Houghton, Mifflin and Company.
Haggard, P., Clark, S., & Kalogeras, J. (2002). Voluntary action and conscious awareness. Nature Neuroscience, 5(4), 382–385, https://doi.org/10.1038/nn827. [PubMed]
Ivanof, B. E., Terhune, D. B., Coyle, D., Gottero, M., & Moore, J. W. (2022). Examining the effect of Libet clock stimulus parameters on temporal binding. Psychological Research, 86(3), 937–951, https://doi.org/10.1007/s00426-021-01546-x. [PubMed]
Ivanof, B. E., Terhune, D. B., Coyle, D., & Moore, J. W. (2022). Manipulations of Libet clock parameters affect intention timing awareness. Scientific Reports, 12(1), 20249, https://doi.org/10.1038/s41598-022-23513-1. [PubMed]
Keppel, G. (1991). Design and analysis: A researcher's handbook, 3rd ed. (pp. xiii, 594). Upper Saddle River, NJ: Prentice-Hall, Inc.
Moore, J. W. (2016). What is the sense of agency and why does it matter? Frontiers in Psychology, 7, 1272, https://doi.org/10.3389/fpsyg.2016.01272. [PubMed]
Moore, J. W., & Obhi, S. S. (2012). Intentional binding and the sense of agency: A review. Consciousness and Cognition, 21(1), 546–561, https://doi.org/10.1016/j.concog.2011.12.002. [PubMed]
Murai, Y., & Yotsumoto, Y. (2018). Optimal multisensory integration leads to optimal time estimation. Scientific Reports, 8(1), 13068, https://doi.org/10.1038/s41598-018-31468-5. [PubMed]
Muth, F. V., Wirth, R., & Kunde, W. (2021). Temporal binding past the Libet clock: Testing design factors for an auditory timer. Behavior Research Methods, 53(3), 1322–1341, https://doi.org/10.3758/s13428-020-01474-5. [PubMed]
Posner, M. I., Nissen, M. J., & Klein, R. M. (1976). Visual dominance: An information-processing account of its origins and significance. Psychological Review, 83(2), 157–171, https://doi.org/10.1037/0033-295X.83.2.157. [PubMed]
Repp, B. H., & Penel, A. (2002). Auditory dominance in temporal processing: New evidence from synchronization with simultaneous visual and auditory sequences. Journal of Experimental Psychology: Human Perception and Performance, 28(5), 1085–1099, https://doi.org/10.1037/0096-1523.28.5.1085. [PubMed]
Ruess, M., Thomaschke, R., & Kiesel, A. (2017). The time course of intentional binding. Attention, Perception, & Psychophysics, 79(4), 1123–1131, https://doi.org/10.3758/s13414-017-1292-y. [PubMed]
Ruess, M., Thomaschke, R., & Kiesel, A. (2018). Intentional binding of visual effects. Attention, Perception, & Psychophysics, 80(3), 713–722, https://doi.org/10.3758/s13414-017-1479-2. [PubMed]
Sinnett, S., Spence, C., & Soto-Faraco, S. (2007). Visual dominance and attention: The Colavita effect revisited. Perception & Psychophysics, 69(5), 673–686, https://doi.org/10.3758/BF03193770. [PubMed]
Suzuki, K., Lush, P., Seth, A. K., & Roseboom, W. (2019). Intentional binding without intentional action. Psychological Science, 30(6), 842–853, https://doi.org/10.1177/0956797619842191. [PubMed]
Tanaka, T., Matsumoto, T., Hayashi, S., Takagi, S., & Kawabata, H. (2019). What makes action and outcome temporally close to each other: A systematic review and meta-analysis of temporal binding. Timing & Time Perception, 7(3), 189–218, https://doi.org/10.1163/22134468-20191150.
Figure 1. Layout of the Libet clock. Note. A modified version of the Libet clock, consisting of 120 gray discs, each subtending a visual angle of approximately 0.7°. The center fixation was a small white dot (0.1°). The period of the rotating red disc was 2 seconds, and the radius of the clock was 14.5°. The red disc updated its location to the next gray disc in every frame to create smooth clockwise motion.
Figure 2. Example of the spatial dynamics of the visual stimulus in the audiovisual conditions. Note. One example from the audiovisual (integrated) condition. When the red disc appeared, two colored discs (yellow and cyan) also appeared at two of the four possible locations relative to the fixation: lower left, lower right, upper left, and upper right. The distance between each colored disc and the center fixation was 2.85°. The two colored discs were launched toward the center fixation by the computer or by the participant's key press. After 250 ms, the two colored discs met at the center and bounced away upon contact (elastic collision). See the Supplementary Material for the rest of the stimuli in the integrated and irrelevant conditions.
Figure 3. Trajectory judgment (TJ) task. Note. The participants were required to perform a TJ task in conditions with visual stimuli (two colored discs). During the presentation, they were instructed to focus on the center of the screen, covertly track the movements of the two discs, and memorize their vanishing points. After reporting the onset of the target sound, they were asked to report the final location of one randomly selected colored disc. The clock face was divided into four quadrants (four pie-like sectors colored gray), and participants reported the location by selecting one of the sectors with a cursor. When the cursor was positioned in a specific sector, that sector was highlighted in the same color as the disc they were asked to report.
Figure 4. Experiment 1 mean subjective timing error plot. Note. A positive value indicates participants perceived the sensory event later than its physical onset, and a negative value indicates participants perceived the sensory event earlier than its physical onset.
Figure 5. Experiment 1 IB magnitude plot. Note. The magnitude of binding across different types of events. Error bars represent the 95% confidence interval; all events showed significant IB. Note that these values represent the difference in mean subjective timing between the active and passive conditions; the 0 value does not represent the physical onset time of the sensory event but the mean subjective timing in the passive condition within each type of event.
Figure 6. Hypothetical data of Experiment 2 results.
Figure 7. Experiment 2 mean subjective timing error plot. Note. A positive value indicates participants perceived the sensory event later than its physical onset, and a negative value indicates participants perceived the sensory event earlier than its physical onset.
Figure 8. Experiment 2 IB magnitude plot. Note. The magnitude of binding across different types of events. Error bars represent the 95% confidence interval; all events showed significant IB. Note that these values represent the difference in mean subjective timing between the active and passive conditions; the 0 value does not represent the physical onset time of the sensory event but the mean subjective timing in the passive condition within each type of event. n.s., not significant.
Table 1. Pilot summary table.
Table 2. Experiment 1 summary table.
Table 3. Experiment 2 summary table.
Table 4. LMM model terms table. TJ: accuracy of the trajectory judgment (TJ) task. For M6 and M7, the data include only the audiovisual conditions; the auditory condition was excluded.
Table 5. LMM model comparison tables. Note. Analysis was conducted with the lme4 package in R; models were refitted with maximum likelihood (ML) rather than REML for the likelihood ratio tests.