September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2021
Predicting visual attention of human drivers boosts the training speed and performance of Autonomous Vehicles
Author Affiliations
  • Aldo Faisal
    Imperial College London
Journal of Vision September 2021, Vol.21, 2819. doi:
      Aldo Faisal; Predicting visual attention of human drivers boosts the training speed and performance of Autonomous Vehicles. Journal of Vision 2021;21(9):2819.

      © ARVO (1962-2015); The Authors (2016-present)


Autonomous driving agents tackle a complex, skilled task that humans train for over long periods, relying heavily on sensory, cognitive, and motor skills and on situational intelligence developed over years of life. Autonomous driving research has focused on end-to-end learning of driving commands; however, perceiving and understanding the environment remains the most critical challenge when evaluating situations. This is especially relevant in urban scenes, which contain many distractions that act as visual noise and hinder the agent from understanding the situation correctly. Humans develop the skill of visual focus and of identifying task-relevant objects from an early age. Information extracted from human gaze, in the context of the environment, can help the agent with this perception problem: it carries a wealth of information about human decision-making behaviour and helps agents focus on task-relevant features while ignoring irrelevant information. We combine human gaze and features of task-relevant instances to enhance perception systems for autonomous driving. Participants (n=9) used a virtual reality headset with built-in eye-trackers, providing us with human driving gaze data. From these data, we build a predictor of human visual attention during driving. Our integrated object detector identifies relevant instances on the road, while the human visual attention prediction determines which objects are most relevant to the human driving policy. We present the results of using this architecture with imitation learning and reinforcement learning autonomous driving agents and compare them with baseline end-to-end methods; our approach shows improved performance (28% in imitation learning and 11% in reinforcement learning), accelerated training, and explainable behaviour.
Our results highlight the potential of human-in-the-loop approaches for autonomous systems which, unlike end-to-end approaches, allow us to make use of human skills in creating AI that closes the loop by augmenting humans in an efficient and explainable manner.
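
The abstract does not specify how the object detector and the attention predictor are combined. As a purely illustrative sketch, one simple possibility is to score each detected bounding box by the mean predicted attention inside it and rank objects by that score. The function name `rank_objects_by_attention`, the `(x0, y0, x1, y1)` box format, and the toy attention map below are assumptions made for this example and are not taken from the authors' system.

```python
import numpy as np


def rank_objects_by_attention(attention_map, boxes):
    """Rank detected boxes by mean predicted attention inside each box.

    attention_map: 2D array (height, width) of non-negative attention values.
    boxes: list of (x0, y0, x1, y1) pixel boxes, with x1 and y1 exclusive.
    Returns (order, scores), where order lists box indices from most to
    least attended. This scoring rule is a hypothetical choice.
    """
    scores = []
    for (x0, y0, x1, y1) in boxes:
        region = attention_map[y0:y1, x0:x1]
        # An empty region (degenerate box) receives a score of zero.
        scores.append(float(region.mean()) if region.size else 0.0)
    scores = np.asarray(scores, dtype=float)
    # Stable sort on negated scores gives descending order, ties by index.
    order = np.argsort(-scores, kind="stable")
    return order, scores


# Toy 4x6 attention map with a single hot spot on the left of the frame.
attention = np.zeros((4, 6))
attention[1:3, 0:2] = 1.0

# Three hypothetical detections: far from, on, and partly over the hot spot.
boxes = [(3, 0, 6, 2), (0, 1, 2, 3), (1, 0, 4, 4)]
order, scores = rank_objects_by_attention(attention, boxes)
print(order.tolist(), scores.round(3).tolist())
```

In this toy case the box lying entirely on the hot spot ranks first and the box with no overlap ranks last. A downstream driving agent could, under the same assumptions, use such scores to weight or filter object features before they reach the policy.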

