August 2010
Volume 10, Issue 7
Vision Sciences Society Annual Meeting Abstract  |   August 2010
Recognizing people from dynamic video: Dissecting identity information with a fusion approach
Author Affiliations
  • Alice O'Toole
    The University of Texas at Dallas
  • Samuel Weimer
    The University of Texas at Dallas
  • Joseph Dunlop
    The University of Texas at Dallas
  • Robert Barwick
    The University of Texas at Dallas
  • Julianne Ayyad
    The University of Texas at Dallas
  • Jonathan Phillips
    National Institute of Standards and Technology
Journal of Vision August 2010, Vol.10, 643. doi:10.1167/10.7.643

      Alice O'Toole, Samuel Weimer, Joseph Dunlop, Robert Barwick, Julianne Ayyad, Jonathan Phillips; Recognizing people from dynamic video: Dissecting identity information with a fusion approach. Journal of Vision 2010;10(7):643.

The goal of this study was to measure the quality of identity-specific information in faces and bodies presented in natural video or as static images. Participants matched identity in stimulus pairs (same person or different people?) created from videos of people walking (gait videos) and/or conversing (conversation videos). We varied the type of information presented across six experiments and two control studies. Every experiment had three conditions, in which participants matched identity in two gait videos (gait-gait), in two conversation videos (conversation-conversation), or across a conversation video and a gait video (conversation-gait). In the first set of experiments, participants saw video presentations of the face and body (Exp. 1), the face with the body obscured (Exp. 2), and the body with the face obscured (Exp. 3). In the second set, they saw the “best” extracted image of the face and body (Exp. 4), the face only (Exp. 5), and the body only (Exp. 6). Identification performance was always best with both the face and body, although recognition from the face alone was close in some conditions. A video advantage was found for the face-and-body and body-alone presentations, but not for the face-alone presentations. In two control studies, multiple static images were presented. These studies showed that the video advantages could be explained by the extra image-based information available in the videos in all but the gait-gait comparisons. To assess differences in the identity information across experiments, we used a statistical learning algorithm to fuse the participants' judgments for individual stimulus items across experiments. The fusion produced perfect identification when tested with a cross-validation procedure. When the stimulus presentations were static, the fusion indicated that partially independent perceptual information was available in the face-and-body and face-only conditions. With video presentations, partially independent perceptual information was available from the face-and-body condition and the body-only condition.
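
The fusion step described above can be illustrated with a minimal sketch. The abstract does not name the statistical learning algorithm, the rating scale, or the cross-validation scheme, so everything here is an assumption made for illustration: a nearest-centroid classifier stands in for the fusion algorithm, leave-one-out stands in for the cross-validation procedure, and the ratings, the function names `fuse_predict` and `leave_one_out_accuracy`, and the three-column layout are hypothetical. The numbers are synthetic and separable by construction, so the accuracy they yield says nothing about the study's results.

```python
# Illustrative sketch only: synthetic ratings, hypothetical names, and a
# nearest-centroid classifier standing in for the unnamed fusion algorithm.

def centroid(rows):
    """Return the mean vector of a list of equal-length rating vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two rating vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fuse_predict(train_x, train_y, item):
    """Predict 1 (same person) or 0 (different people) for one item."""
    same = centroid([x for x, y in zip(train_x, train_y) if y == 1])
    diff = centroid([x for x, y in zip(train_x, train_y) if y == 0])
    return 1 if sq_dist(item, same) < sq_dist(item, diff) else 0

def leave_one_out_accuracy(xs, ys):
    """Hold out each item in turn, train on the rest, and score the guess."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if fuse_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)

# Each row is one stimulus pair; each column is that pair's mean rating in
# one hypothetical experiment (assumed scale: 1 = different, 5 = same).
ratings = [
    [4.5, 4.0, 3.2],
    [4.2, 3.8, 3.6],
    [3.9, 4.4, 2.9],
    [4.8, 3.5, 3.4],
    [2.1, 2.5, 2.8],
    [1.8, 2.9, 3.1],
    [2.4, 2.0, 2.6],
    [1.5, 2.7, 3.0],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = same person, 0 = different people

accuracy = leave_one_out_accuracy(ratings, labels)
print(accuracy)
```

The point of the sketch is the data layout: each stimulus pair contributes one judgment per experiment, and the classifier is always tested on a pair it was not trained on. If combining experiments improves held-out accuracy over any single experiment, the experiments are contributing at least partially independent information.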

O'Toole, A., Weimer, S., Dunlop, J., Barwick, R., Ayyad, J., & Phillips, J. (2010). Recognizing people from dynamic video: Dissecting identity information with a fusion approach [Abstract]. Journal of Vision, 10(7):643, 643a, doi:10.1167/10.7.643.
 TSWG/DOD to A. O'Toole.
