August 2014
Volume 14, Issue 10
Vision Sciences Society Annual Meeting Abstract  |   August 2014
Pictorial Human Spaces: How Well do Humans Perceive a 3D Articulated Pose?
Author Affiliations
  • Elisabeta Marinoiu
    Institute of Mathematics of the Romanian Academy
  • Dragos Papava
    Institute of Mathematics of the Romanian Academy
  • Cristian Sminchisescu
    Department of Mathematics, Faculty of Engineering, Lund University
Journal of Vision August 2014, Vol.14, 757. doi:

When shown a photograph of a person, humans have a vivid, immediate sense of the 3D pose and a rapid understanding of that person's subtle body language, personal attributes, or intentionality. How does this happen, and what do humans actually perceive? How accurate are they? Our aim is to uncover the process and the level of accuracy involved in the 3D perception of people from images by assessing human performance. To establish an observation-perception link, we ask subjects to re-enact the 3D pose of another person shown in a photograph (for which ground truth is available) after a short exposure time of 5 seconds. Our apparatus simultaneously captures human pose and eye movements during the re-enactment. While perceiving and reproducing the pose, subjects attend first to upper-body joints, with a general trend of focusing more on extremities than on internal joints. Although the resulting scanpaths are pose-dependent, they are quite stable across subjects, both spatially and sequentially. Our study reveals that, on average, people are not significantly better at re-enacting 3D poses from visual stimuli than existing computer vision algorithms. Errors on the order of 10°-20° or 100 mm per 3D body joint position are not uncommon. The contributions of our work can be summarized as follows: (1) the construction of an apparatus relating human visual perception to 3D ground truth; (2) the creation of a publicly available dataset collected from 10 subjects, containing 120 images of humans in different poses, both easy and difficult; and (3) a quantitative analysis of human eye movements, 3D pose re-enactment performance, error levels, stability, correlation, and cross-stimulus control, in order to reveal how different 3D configurations relate to the subject's focus on certain image features in the context of the given task.
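
The abstract reports errors in millimetres per 3D joint position and in degrees, but does not spell out how they are computed. The sketch below shows one common way such quantities could be measured, assuming each pose is a list of (x, y, z) joint coordinates in millimetres in a shared reference frame. The function names and the example coordinates are hypothetical and are not taken from the study.

```python
import math


def per_joint_position_error(reenacted, ground_truth):
    """Euclidean distance per joint, in the unit of the inputs (assumed mm)."""
    if len(reenacted) != len(ground_truth):
        raise ValueError("poses must have the same number of joints")
    return [math.dist(p, q) for p, q in zip(reenacted, ground_truth)]


def mean_per_joint_position_error(reenacted, ground_truth):
    """Average of the per-joint Euclidean distances."""
    errors = per_joint_position_error(reenacted, ground_truth)
    return sum(errors) / len(errors)


def angle_between_degrees(u, v):
    """Angle in degrees between two 3D direction vectors (e.g. limb directions)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


# Hypothetical three-joint poses (mm), for illustration only.
ground_truth_pose = [(0, 0, 0), (0, 300, 0), (0, 300, 400)]
reenacted_pose = [(0, 0, 0), (60, 300, 80), (0, 300, 500)]

position_errors = per_joint_position_error(reenacted_pose, ground_truth_pose)
mean_error = mean_per_joint_position_error(reenacted_pose, ground_truth_pose)
limb_angle_error = angle_between_degrees((1, 0, 0), (0, 1, 0))
```

Whether the poses are aligned (for example by translating to a common root joint) before the comparison would change the numbers. The sketch leaves that step out because the abstract does not describe it.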

Meeting abstract presented at VSS 2014

