Vision Sciences Society Annual Meeting Abstract | September 2019
Volume 19, Issue 10 | Open Access
The Embodied Semantic Fovea - real-time understanding of what and how we look at things in-the-wild
Author Affiliations & Notes
  • Aldo A Faisal: Brain & Behaviour Lab, Dept. of Bioengineering, Dept. of Computing, Data Science Institute
  • John A Harston: Brain & Behaviour Lab, Dept. of Bioengineering
  • Chaiyawan Auepanwiriyakul: Brain & Behaviour Lab, Dept. of Computing
  • Mickey Li: Brain & Behaviour Lab, Dept. of Computing
  • Pavel Orlov: Brain & Behaviour Lab, Dept. of Bioengineering
Journal of Vision September 2019, Vol.19, 51a. doi:https://doi.org/10.1167/19.10.51a
Aldo A Faisal, John A Harston, Chaiyawan Auepanwiriyakul, Mickey Li, Pavel Orlov; The Embodied Semantic Fovea - real-time understanding of what and how we look at things in-the-wild. Journal of Vision 2019;19(10):51a. https://doi.org/10.1167/19.10.51a.

© ARVO (1962-2015); The Authors (2016-present)
Abstract

Natural gaze behaviour is highly context-driven, controlled both by bottom-up saliency (external visual stimuli) and top-down saliency (task goals and affordances). Whilst quantitative descriptions are easily obtained in highly constrained tasks, their ecological validity is an issue, as gaze behaviour is far richer in natural environments with free locomotion, head and eye movements. Whilst much work has been done on qualitative description of these natural dynamics, quantitative analysis of the spatiotemporal characteristics of natural gaze behaviour has proven difficult, due to the time-intensive nature of manual object identification and labelling. To address this issue we present the ‘Embodied Semantic Fovea’, a proof-of-concept system for real-time detection and semantic labelling of objects in a head-mounted eye-tracker’s field of view, complemented by SLAM-based reconstruction of the 3D scene (Li & Faisal, ECCV Workshops, 2018) and 3D gaze end-point estimation (Orlov & Faisal, ETRA 2018). Our system reconstructs the entire visual world from surface elements (surfels), allowing us to understand not just how gaze moves across individual surfels, but also how surfels aggregate into instances of a given object class. This makes it possible to track eye movements that land on an object even as the wearer moves freely around it, letting us interpret eye movements from perspectives that are usually not accessible (e.g. recognising that the wearer is engaging with the same tree, but from opposing directions). This allows systematic measurement of inter-object affordances in real-world scenarios for freely moving people. Using pixel-level object labelling with depth estimation from an egocentric scene camera, together with simultaneous pupil tracking, we superimpose gaze vectors onto the 3D-reconstructed object surfaces, providing the first three-dimensional reconstruction of object-level gaze in freely-moving environments. We thereby produce ‘object maps’ of objects’ locations relative to one another and of how gaze moves between them, allowing true interrogation of three-dimensional gaze behaviour in the wild.
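
The core aggregation step the abstract describes, casting each frame's 3D gaze ray against the semantically labelled surfel map, accumulating dwell per object, and recording how gaze shifts between objects to form an 'object map', can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the function names, data layout, and the perpendicular-distance threshold are all assumptions.

```python
# Illustrative sketch (not the authors' code): map 3D gaze rays onto a
# semantically labelled surfel cloud, accumulate per-object dwell, and
# build an object-to-object gaze transition count ("object map").
from collections import Counter, defaultdict
import numpy as np

def hit_surfel(origin, direction, surfel_xyz, surfel_label, max_dist=0.05):
    """Return the label of the surfel closest to the gaze ray, or None.

    origin, direction : (3,) arrays, gaze ray in world coordinates
    surfel_xyz        : (N, 3) array, surfel centres from the SLAM map
    surfel_label      : length-N list, semantic label of each surfel
    max_dist          : perpendicular-distance threshold in metres (assumed)
    """
    d = direction / np.linalg.norm(direction)
    rel = surfel_xyz - origin                            # eye-to-surfel vectors
    t = rel @ d                                          # distance along the ray
    perp = np.linalg.norm(rel - np.outer(t, d), axis=1)  # distance off the ray
    candidates = np.where((t > 0) & (perp < max_dist))[0]
    if candidates.size == 0:
        return None
    best = candidates[np.argmin(t[candidates])]          # nearest surfel in front of the eye
    return surfel_label[best]

def object_gaze_maps(gaze_rays, surfel_xyz, surfel_label):
    """Aggregate per-object dwell counts and object-to-object gaze transitions."""
    dwell = Counter()
    transitions = defaultdict(Counter)
    prev = None
    for origin, direction in gaze_rays:                  # one ray per eye-tracker frame
        obj = hit_surfel(origin, direction, surfel_xyz, surfel_label)
        if obj is None:
            continue
        dwell[obj] += 1
        if prev is not None and prev != obj:
            transitions[prev][obj] += 1                  # gaze shifted between objects
        prev = obj
    return dwell, transitions
```

Because surfels keep their identity in the world frame as the wearer moves, the same aggregation applies regardless of viewpoint, which is what permits recognising re-engagement with the same object from a different side.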

Acknowledgement: EPSRC & H2020 Project Enhance www.enhance-motion.eu 