September 2024, Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract
Time to consider time: Comparing human reaction times to dynamical signatures from recurrent vision models on a perceptual grouping task
Author Affiliations
  • Alekh Karkada Ashok
    Brown University
  • Lore Goetschalckx
    Brown University
  • Lakshmi Narasimhan Govindarajan
    Brown University
  • Aarit Ahuja
    Brown University
  • David Sheinberg
    Brown University
  • Thomas Serre
    Brown University
Journal of Vision September 2024, Vol. 24, 225. https://doi.org/10.1167/jov.24.10.225
Abstract

To make sense of its retinal inputs, our visual system organizes perceptual elements into coherent figural objects. This perceptual grouping process, like many aspects of visual cognition, is believed to be dynamic and at least partially reliant on feedback. Indeed, cognitive scientists have studied its time course through reaction time (RT) measurements and have associated it with a serial spread of object-based attention. Recent progress in biologically inspired machine learning has put forward convolutional recurrent neural networks (cRNNs) capable of mimicking visual cortical dynamics. To understand how the visual routines learned by cRNNs compare to those of humans, we need ways to extract meaningful dynamical signatures from a cRNN and to study temporal human-model alignment. We introduce a framework to train, analyze, and interpret cRNN dynamics. Our framework triangulates insights from attractor-based dynamics and evidential learning theory. We derive a stimulus-dependent metric, ξ, and directly compare it to existing human RT data on the same task: a grouping task designed to study object-based attention. The results reveal a "filling-in" strategy learned by the cRNN, reminiscent of the serial spread of object-based attention in humans. We also observe a remarkable alignment between ξ and human RT patterns across diverse stimulus manipulations. This alignment emerged purely as a byproduct of the task constraints (no supervision on RT was provided). Our framework paves the way for testing further hypotheses about the mechanisms supporting perceptual grouping and object-based attention, as well as for inter-model comparisons aimed at improving temporal alignment with humans on other cognitive tasks.
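The abstract does not define ξ, so the sketch below is only a rough, hypothetical illustration of the kind of dynamical signature it describes, not the authors' implementation. It assumes, following standard evidential deep learning (Sensoy et al., 2018), that the cRNN emits non-negative per-class evidence at each recurrent timestep, and that an RT-like quantity can be read out as the area under the resulting uncertainty curve: stimuli whose grouping resolves slowly keep uncertainty high for longer and so yield larger values. The names evidential_uncertainty and rt_proxy are illustrative placeholders.

    import numpy as np

    def evidential_uncertainty(evidence):
        # Standard uncertainty mass in evidential deep learning:
        # with K classes and non-negative evidence e, alpha = e + 1
        # and u = K / sum(alpha). u -> 1 with no evidence and
        # u -> 0 as evidence accumulates.
        alpha = np.asarray(evidence, dtype=float) + 1.0
        return alpha.shape[-1] / alpha.sum()

    def rt_proxy(evidence_over_time, dt=1.0):
        # Hypothetical RT-like signature: Riemann-sum area under the
        # uncertainty curve across the cRNN's recurrent timesteps.
        u = np.array([evidential_uncertainty(e) for e in evidence_over_time])
        return u.sum() * dt

    # Toy usage: a two-class grouping decision ("same object" vs.
    # "different objects") whose evidence accumulates over 20 timesteps.
    trajectory = [np.array([0.5 * t, 0.1]) for t in range(20)]
    print(rt_proxy(trajectory))  # larger area ~ slower decision ~ longer RT

Under these assumptions, a stimulus manipulation that delays the "filling-in" of evidence stretches the uncertainty curve and increases the metric, which is the qualitative pattern one would compare against human RT data.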
