September 2017
Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract | August 2017
Perceptual inference of dynamic emotion in natural movies
Author Affiliations
  • Zhimin Chen
    Department of Psychology, University of California, Berkeley, Berkeley, CA, USA
  • David Whitney
    Department of Psychology, University of California, Berkeley, Berkeley, CA, USA
Journal of Vision August 2017, Vol.17, 913. doi:

Zhimin Chen, David Whitney; Perceptual inference of dynamic emotion in natural movies. Journal of Vision 2017;17(10):913. doi:

Emotion recognition is a critical function of vision. It seems intuitive that to perceive a person's emotions, we need only look directly at that person: their face or body. However, the context in which a person experiences an emotion may sometimes be key to understanding that emotion. Can a person's emotion be dynamically inferred from contextual visual information, even without face- and body-related information? We tested the ability to infer and track the emotions of people based solely on visual situational context, without any information about facial expression. Thirty-one observers watched silent movie clips of two characters interacting. The face and body of a randomly chosen character were occluded (target); the other character in the movie clip (partner) remained visible. Observers tracked the inferred emotion of the masked (invisible) target and reported it continuously, in real time, by moving a mouse pointer in a two-dimensional valence-arousal space. Baseline ratings of the target and partner characters were established by asking a separate group of 69 observers to track the target's emotion when all characters were visible (unoccluded). In the baseline, observers agreed strongly when tracking the visible target's emotion (mean Cronbach's alpha = 0.95). More importantly, observers accurately inferred and tracked the emotion of the invisible target character, when compared to the baseline (mean Spearman's rho = 0.58, p < .01; mean absolute deviation = 8.5%). Cross-correlation analyses showed that inferring emotion based on context alone was as fast as tracking emotion using face and body information (no significant non-zero time lag). More strikingly, observers inferred the intensity of the target's emotion (arousal) accurately by using contextual ensemble information, not simply by tracking the partner's arousal (partial correlation = 0.42, p < .01).
Our results demonstrate that observers can infer and track emotion accurately and speedily in real time based entirely on contextual information.
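
For readers unfamiliar with the statistics named above, the following is a minimal sketch of how inter-observer agreement (Cronbach's alpha), rank correlation with a baseline (Spearman's rho), and a partial correlation controlling for the partner's ratings could be computed from continuous rating traces. It is not the authors' analysis code: the function names, the array layout (time points by observers), and the synthetic data are all assumptions made for illustration, and the simple ranking helper does not average ties (a library routine such as `scipy.stats.spearmanr` would).

```python
import numpy as np


def cronbach_alpha(ratings):
    """Cronbach's alpha for a (time points x observers) rating matrix.

    Each observer is treated as an "item" and each time point as a "case".
    """
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]                           # number of observers
    observer_vars = ratings.var(axis=0, ddof=1)    # variance of each observer's trace
    total_var = ratings.sum(axis=1).var(ddof=1)    # variance of the summed trace
    return (k / (k - 1)) * (1.0 - observer_vars.sum() / total_var)


def rank(x):
    """Ordinal ranks 0..n-1 (ties are NOT averaged; adequate for continuous data)."""
    return np.argsort(np.argsort(x)).astype(float)


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return np.corrcoef(rank(x), rank(y))[0, 1]


def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    design = np.column_stack([np.ones_like(z), z])   # intercept + covariate
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]


# Synthetic demonstration data (hypothetical, not from the study).
rng = np.random.default_rng(0)
n_time, n_obs = 500, 20
true_trace = np.cumsum(rng.normal(size=n_time))      # slowly drifting "emotion"
baseline = true_trace[:, None] + rng.normal(scale=1.0, size=(n_time, n_obs))
inferred = true_trace[:, None] + rng.normal(scale=3.0, size=(n_time, n_obs))
partner = 0.5 * true_trace + rng.normal(scale=3.0, size=n_time)

print("alpha (baseline observers):", cronbach_alpha(baseline))
print("rho (inferred vs. baseline):",
      spearman_rho(inferred.mean(axis=1), baseline.mean(axis=1)))
print("partial r (controlling for partner):",
      partial_corr(inferred.mean(axis=1), baseline.mean(axis=1), partner))
```

In this sketch the partial correlation answers the same kind of question the abstract raises: whether the inferred trace still follows the baseline once the variance shared with the partner's trace has been removed.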

Meeting abstract presented at VSS 2017

