Abstract
How do we use cross-modal cues to accurately identify the objects and scenes we see and hear? Furthermore, how do the different sensory processes influence each other during identification? Participants heard auditory scene contexts for 5 s before a target object was briefly presented; they then identified both the auditory scene and the visual object. These questions were examined with objects presented at high and low spatial frequencies, in congruent, incongruent, or neutral (white-noise) contextual relations. Additionally, two levels of object and contextual constraint, defined in a pilot study, were examined: auditory scenes and visual objects that were more easily (i.e., more accurately) identified were categorized as "strong" stimuli and paired with each other, whereas less accurately identified auditory scenes were paired with less accurately identified visual objects and categorized as "weak" (ambiguous) stimuli. The first results concern object identification. Objects paired with a strong, congruent auditory context were identified more accurately than objects in incongruent or neutral contexts, and these results were similar across spatial frequencies. With weak contexts, the question was whether two weak sources of information (e.g., scene and object) could combine to facilitate identification. The data suggest they could not: in the main experiment, and in additional experiments run to increase statistical power, congruent contexts conferred no advantage over incongruent or neutral contexts. However, there was an unexpected main effect of spatial frequency for these "weak" stimuli: high-spatial-frequency objects were identified more accurately across all contextual conditions, in contrast to the strong-constraint stimuli. There was also a small reciprocal effect on auditory scene identification: auditory scenes were identified somewhat more accurately in congruent than in incongruent conditions. These results provide new information about the detailed interactions between sources of information in multimodal identification.
Meeting abstract presented at VSS 2012