October 2003
Volume 3, Issue 9
Vision Sciences Society Annual Meeting Abstract  |   October 2003
Explicit and implicit perceptual discrimination of videorealistic speech
Author Affiliations
  • Gadi Geiger
    CBCL, McGovern Inst., Brain & Cog. Sci., MIT, Cambridge, MA, USA
  • Tony Ezzat
    CBCL, LCS, MIT, Cambridge, MA, USA
  • Tomaso Poggio
    CBCL, McGovern Inst., Brain & Cog. Sci., MIT, Cambridge, MA, USA
Journal of Vision October 2003, Vol.3, 773. doi:https://doi.org/10.1167/3.9.773
      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Visual speech animation techniques that are “videorealistic” (potentially indistinguishable from real recorded video) are starting to become available. We describe here a perceptual evaluation scheme and its application to Mary101, a new videorealistic visual-speech animation system.

Two types of experiments were performed: a) distinguishing visually between real and synthetic image-sequences of the same utterances (“Turing tests”), and b) lip-reading real and synthetic image-sequences of the same utterances (“Intelligibility tests”). In the explicit perceptual discrimination task (a), subjects classify each stimulus directly as a real or a synthetic image-sequence, which requires detecting a possible difference between the two. The implicit perceptual discrimination task (b) compares how well subjects visually recognize speech in real and in synthetic image-sequences.

Subjects performed at chance level in the explicit discrimination task (a). However, in the implicit lip-reading discrimination task (b), the same subjects performed significantly better with real image-sequences than with synthetic ones. This held for recognition of whole words, syllables, and phonemes. The result suggests that the lip-reading task is a more sensitive method for discriminating between synthetic and real image-sequences.
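The abstract does not say which statistical tests were used. As a rough illustration of what "at chance level" and "significantly better" could mean for the two tasks, the sketch below applies an exact two-sided binomial test. This is an assumed analysis, not the authors' method, and every count in it is a hypothetical placeholder rather than data from the study. The function name `binomial_two_sided_p` is likewise an illustrative choice.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed one under the null hypothesis that each trial succeeds with
    probability `chance`.
    """
    pmf = [
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small relative tolerance so ties survive floating-point rounding.
    total = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Task (a), explicit discrimination: HYPOTHETICAL counts of correct
# real-vs-synthetic classifications. A large p-value is consistent
# with performance at chance (guessing rate 0.5).
p_explicit = binomial_two_sided_p(successes=52, trials=100)

# Task (b), implicit discrimination: a paired sign test across subjects.
# HYPOTHETICAL count of subjects who lip-read real sequences better than
# synthetic ones. A small p-value indicates a reliable difference.
p_implicit = binomial_two_sided_p(successes=18, trials=20)

print(f"explicit task p = {p_explicit:.3f}")
print(f"implicit task p = {p_implicit:.5f}")
```

The sign test uses only the direction of each subject's difference, which makes it a conservative choice. A paired test on the recognition scores themselves would use more of the data.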

Geiger, G., Ezzat, T., Poggio, T. (2003). Explicit and implicit perceptual discrimination of videorealistic speech [Abstract]. Journal of Vision, 3(9): 773, 773a, http://journalofvision.org/3/9/773/, doi:10.1167/3.9.773.