September 2018
Volume 18, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract | September 2018
Saliency Map Classification Using Capsule-based CNNs
Author Affiliations
  • Michael Kleiman
    Florida Atlantic University
  • William Hahn
    Florida Atlantic University
  • Elan Barenholtz
    Florida Atlantic University
Journal of Vision September 2018, Vol.18, 1209. doi:
      Michael Kleiman, William Hahn, Elan Barenholtz; Saliency Map Classification Using Capsule-based CNNs. Journal of Vision 2018;18(10):1209.

      © ARVO (1962-2015); The Authors (2016-present)

Inferring an observer's task from eye movements, known as the inverse Yarbus process, is challenging. Previous approaches have relied on aggregate measures, such as mean fixation duration or saccade velocity, and on hand-selected areas of interest (AOIs); these achieve classification accuracies above chance but with high loss rates. Here, we used Capsule-based Convolutional Neural Networks (CapsuleNets), a recently introduced modification of convolutional neural networks (CNNs), to identify a participant's task (counting objects vs. aesthetic judgement) based only on saliency maps derived from raw eye fixation coordinate data. Traditional CNNs have been widely used to classify visual data, such as types of flowers or species of animals, and are highly accurate in many situations. CNNs function analogously to the human visual system: early layers respond to simple, sparse features such as edges or patches, later layers respond to progressively more specific features such as noses or faces, and the final classification is based on the activations of these layers. However, CNNs are limited in their ability to represent spatial relationships among features, which restricts their utility for ambiguous or irregular images and for images in which spatial information is especially important. Both conditions apply when discriminating between saliency maps from different tasks. By introducing capsules, the network can use the spatial locations of features, such as a tight cluster of fixation points around an expected search target versus a more dispersed cluster for a non-target. Results show that CapsuleNets improve accuracy and reduce loss for saliency map classification by up to 35% compared to traditional CNNs. This approach to saliency map analysis provides a method for classifying eye movement data that is highly generalizable to different tasks and stimulus types.
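
The abstract does not describe how the saliency maps were derived from fixation coordinates. As a rough illustration only, the sketch below assumes one common approach: placing an isotropic Gaussian at each fixation and normalizing the summed map. The function name `fixation_saliency_map`, the 64 x 64 grid, the `sigma` value, and the fixation coordinates are all hypothetical and are not taken from the study.

```python
import numpy as np


def fixation_saliency_map(fixations, height, width, sigma=2.0):
    """Sum an isotropic Gaussian at each (x, y) fixation, then scale the peak to 1.

    Assumption: Gaussian smoothing stands in for whatever procedure the
    authors actually used; the abstract does not specify it.
    """
    ys, xs = np.mgrid[0:height, 0:width]  # pixel row (y) and column (x) grids
    saliency = np.zeros((height, width), dtype=float)
    for x, y in fixations:
        dist_sq = (xs - x) ** 2 + (ys - y) ** 2
        saliency += np.exp(-dist_sq / (2.0 * sigma ** 2))
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency


# Hypothetical fixations: a tight cluster (e.g., around a search target)
# versus a dispersed pattern.
clustered = [(30, 30), (31, 29), (29, 31), (30, 32)]
spread = [(10, 10), (50, 12), (15, 52), (48, 50)]

map_clustered = fixation_saliency_map(clustered, 64, 64)
map_spread = fixation_saliency_map(spread, 64, 64)

# Number of pixels above 10% of the peak: a crude measure of spatial extent.
area_clustered = int((map_clustered > 0.1).sum())
area_spread = int((map_spread > 0.1).sum())
```

Under these assumptions the clustered map concentrates its mass in one small region while the dispersed map covers several separate regions, which is the kind of spatial difference the abstract suggests a capsule-based classifier can exploit.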

Meeting abstract presented at VSS 2018

