We hypothesize that the enlargement of the SW for stimuli occurring near the body depends on a mandatory and additional recoding of external stimuli, which are initially processed exclusively in native receptor-dependent reference frames (i.e., head-centered for auditory events and eye-centered for visual events), into a more general, body-centered reference frame. External auditory and visual stimuli that are presented within PPS are processed by dedicated neural systems and interact with somatosensory representations in order to detect or anticipate contact with the body (Brozzoli, Gentile, & Ehrsson, 2012; Graziano & Cooke, 2006; Kandula, Hofman, & Dijkerman, 2015; Makin, Holmes, Brozzoli, & Farnè, 2012; Rizzolatti et al., 1997). Thus, external stimuli close to the body may be automatically integrated into a body-centered reference frame to more efficiently enable visuo-tactile and audio-tactile interactions. Indeed, interaction with and manipulation of external auditory or visual signals requires transformation from the native reference frame to the reference frame of either a particular body part or the body as a whole (Andersen & Buneo,
2002; Andersen, 1995; Andersen, Snyder, Bradley, & Xing, 1997; Colby, Duhamel, & Goldberg, 1993; Rizzolatti et al., 1997; Sereno & Huang, 2014). Such reference frame transformations are computationally necessary to keep receptive fields anchored on a particular body part, as has been shown for the receptive fields of neurons presumed to encode PPS (Avillac, Denève, Olivier, Pouget, & Duhamel,
2005). The prediction here is thus based on two points: the assumption that larger simultaneity windows within PPS are representative of a greater window within which integration would occur, and the observation that, when stimuli are within PPS, this integration happens (among other places) in neurons with receptive fields anchored on different body parts as well as on the whole body. Within such a framework, the addition of further processing steps (or discrepancies in the form of reconciling representations in both native and body-centered reference frames) for a given stimulus modality (e.g., auditory) would result in an enlarged SW that is selectively driven by the stimulus needing additional processing. In other words, when audiovisual stimuli are presented close to the participant, they are processed not only in a native reference frame (eye-centered for visual and head-centered for auditory), but also in a reference frame that takes body posture into account. Thus, when eye-centered, head-centered, and body-centered reference frames are misaligned, the reconciliation between these representations is most readily manifested in additional computation for stimuli coded in the modality whose native reference frame is misaligned with the body-centered reference frame, i.e., visual stimuli in the case of misaligned eye direction and auditory stimuli in the case of misaligned head direction. We suggest that, in order to correctly bind information emanating from a common source, the neural processing scheme may allow for a greater temporal lead by the sensory modality whose native reference frame has been misaligned with respect to the body (and which may thus require additional processing time). That is, we conjecture that the computation performed in order to make SJs may adapt so as to correct for physiological lags due to reference frame transformation/reconciliation (see Leone & McCourt, 2013, 2015, for a similar argument). In the second experiment, we test this prediction by dissociating either the auditory or the visual reference frame from the body-centered reference frame. The hypothesis for this manipulation is that larger SWs will be seen inside PPS than outside (as in
Experiment 1) but, more importantly, that these enlargements will be driven selectively by a larger ASW in the case of head/body misalignment (head tilted with respect to the whole body/trunk) and a larger VSW in the case of eyes/body misalignment (eyes tilted in their orbits with respect to the whole body/trunk and head). This argument is bolstered by evidence that some visuo-tactile neurons have visual receptive fields anchored on the body, suggesting that visual stimuli need not be transformed sequentially from eye-centered to head-centered to body-centered coordinates but may instead be recoded directly from eye-centered into body-centered coordinates (Avillac et al., 2005).