Vision Sciences Society Annual Meeting Abstract  |   December 2022
Eye-BEHAVIOR: An Eye-Tracking Dataset for Everyday Household Activities in Virtual, Interactive, and Ecological Environments
Author Affiliations
  • Cem Gokmen
    Stanford University
  • Ruohan Zhang
    Stanford University
  • Sanjana Srivastava
    Stanford University
  • Chengshu Li
    Stanford University
  • Michael Lingelbach
    Stanford University
  • Roberto Martín-Martín
    Stanford University
  • Silvio Savarese
    Stanford University
  • Jiajun Wu
    Stanford University
  • Li Fei-Fei
    Stanford University
Journal of Vision December 2022, Vol.22, 3819. doi:https://doi.org/10.1167/jov.22.14.3819
© ARVO (1962-2015); The Authors (2016-present)
Abstract

We present a large-scale dataset of human body and eye movements in 3D, interactive, and ecological household VR environments. Our simulation platform BEHAVIOR includes 100 essential daily activities, selected according to the American Time Use Survey, such as cleaning and cooking. The simulator (iGibson 2.0) provides photo-realistic, fully interactive 3D scenes reconstructed from real homes and supports physically realistic articulated objects. The dataset contains 500 trials (5 trials per activity) collected from 8 subjects in 15 virtual scenes with 244 different types of objects. The total length is 18 hours, with individual trials ranging from 6 seconds to 11 minutes. We provide rich annotations for the visual stimuli, including depth, surface normals, optical flow, object segmentation, and object poses. Additionally, we extract abstract state information, such as object relations, represented in a symbolic logic language (e.g., sliced(objectA), onTop(objectB, objectC)). The goal of each activity is encoded in this logic representation, which facilitates interpreting low-level behaviors as evidence of high-level reasoning towards task completion. Our initial results show that this dataset introduces several challenges for modeling attentional control: 1) The distribution of gaze indicates strong top-down task modulation of attention, as 34% of gaze falls on goal objects. 2) 3D scenes with diverse and visually rich objects challenge even the most advanced saliency models. 3) The long-horizon activities involve navigation and object manipulation, capturing diverse cognitive abilities such as physical scene understanding, visual search, and eye-hand coordination. We propose to tackle these challenges by reconciling bottom-up sensory inputs with top-down task signals and leveraging state-of-the-art machine learning models. Together with the dataset, we open-source our simulation environment and tools. Researchers can modify or create their own activities by adding new objects and designing new household environments. We hope this will make BEHAVIOR an appealing VR experimental platform.
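The symbolic goal representation described above lends itself to a simple predicate-style encoding. The sketch below is a minimal, hypothetical Python illustration of how grounded predicates such as sliced(objectA) and onTop(objectB, objectC) could be represented and checked against an activity goal; the Predicate class, goal_satisfied helper, and object names are assumptions made for illustration and are not the actual BEHAVIOR or iGibson 2.0 API.

```python
# Illustrative sketch only: the names below are hypothetical and do not
# reflect the released BEHAVIOR/iGibson 2.0 interfaces.
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """A grounded logical predicate, e.g. onTop(apple_1, plate_1)."""
    name: str
    args: tuple

    def __str__(self) -> str:
        return f"{self.name}({', '.join(self.args)})"


def goal_satisfied(goal: set, state: set) -> bool:
    """The goal holds when every goal predicate appears in the current
    symbolic state extracted from the simulator."""
    return goal.issubset(state)


# Example: a simplified "slice and serve" activity (hypothetical objects).
goal = {
    Predicate("sliced", ("apple_1",)),
    Predicate("onTop", ("apple_1", "plate_1")),
}

# Symbolic state at some frame of a trial.
state = {
    Predicate("sliced", ("apple_1",)),
    Predicate("onTop", ("apple_1", "plate_1")),
    Predicate("onTop", ("plate_1", "table_1")),
}

print(goal_satisfied(goal, state))  # True: the activity goal is reached
```

Under such an encoding, checking task completion reduces to a set-inclusion test over the symbolic state extracted at each frame, which is what makes the logic representation convenient for relating low-level gaze and body movements to high-level task progress.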
