May 2008
Volume 8, Issue 6
Vision Sciences Society Annual Meeting Abstract  |   May 2008
The neural representation of dynamic real-world auditory/visual events
Author Affiliations
  • Jean Vettel
    Cognitive & Linguistic Sciences, Brown University
  • Julia Green
    Cognitive & Linguistic Sciences, Brown University
  • Laurie Heller
    Cognitive & Linguistic Sciences, Brown University
  • Michael Tarr
    Cognitive & Linguistic Sciences, Brown University
Journal of Vision May 2008, Vol.8, 1054. doi:10.1167/8.6.1054

Citation: Jean Vettel, Julia Green, Laurie Heller, Michael Tarr; The neural representation of dynamic real-world auditory/visual events. Journal of Vision 2008;8(6):1054.

© ARVO (1962-2015); The Authors (2016-present)

Events in the world are inherently multimodal. A bouncing ball provides correlated auditory and visual information to the senses. How are such events represented neurally? One possibility is that these distinct sources are integrated into a coherent percept of the event. Alternatively, auditory and visual information may be represented separately but linked via semantic knowledge or their correlated temporal structure. We investigated this question using event-related fMRI. Participants viewed and/or heard 2.5 s environmental events (for example, paper ripping or door knocking) in two unimodal and three multimodal conditions:

  • Auditory only (ripping sound)
  • Visual only (movie of paper ripping)
  • Congruent Auditory/Visual (sound + movie of same instance)
  • Semantically Incongruent A/V (ripping sound + movie of knocking)
  • Temporally Incongruent A/V (ripping sound + movie of different ripping instance)

Of primary interest is the encoding of Congruent and Semantically Incongruent A/V events. The integration account predicts that sensory brain regions will respond differentially to semantic incongruencies, whereas the separate-representation account predicts no difference. Critically, this multimodal response must be stronger than the responses to unimodal stimuli. We also consider whether integration operates at the level of semantic congruity or at a fine-grained temporal level that binds sound to vision. The comparison of Congruent and Temporally Incongruent events addresses whether integrated multimodal responses arise simply because the A/V information comes from the same semantic category. Alternatively, a single event gives rise to highly correlated onsets, offsets, and temporal structure across modalities. Such correlated information may be the “glue” that allows the brain to combine perceptually distinct information into coherent representations of events. Preliminary results support the perceptual integration of auditory and visual information originating from a common source, that is, one in which temporal structure is correlated across modalities.
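
To make the notion of correlated temporal structure concrete, the following Python sketch compares a sound envelope with a visual time course from the same instance and from a different instance of the same event type. It is an illustration only, not the analysis used in the study: the 10 Hz sampling, the toy `pulses` envelopes, and the choice of Pearson correlation as the measure are all assumptions introduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def pulses(onsets, n=25):
    """Hypothetical envelope: 1.0 at each onset sample, 0.0 elsewhere.

    n=25 assumes a 2.5 s event sampled at 10 Hz (an assumption for illustration).
    """
    return [1.0 if i in onsets else 0.0 for i in range(n)]


# Toy data: one knocking instance (sound and movie share onsets) and a
# second knocking instance whose knocks fall at different times.
audio_a = pulses({2, 10, 18})
video_a = pulses({2, 10, 18})   # same instance as audio_a (Congruent)
video_b = pulses({5, 13, 21})   # different instance (Temporally Incongruent)

r_congruent = pearson(audio_a, video_a)
r_temporally_incongruent = pearson(audio_a, video_b)

print("congruent:", r_congruent)
print("temporally incongruent:", r_temporally_incongruent)
```

In this toy case the matched pair is perfectly correlated, while the mismatched pair is not, even though both pairs belong to the same semantic category. Real stimuli would require envelopes extracted from the actual audio and video, which this sketch does not attempt.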

Vettel, J., Green, J., Heller, L., & Tarr, M. (2008). The neural representation of dynamic real-world auditory/visual events [Abstract]. Journal of Vision, 8(6):1054, 1054a, doi:10.1167/8.6.1054.
