September 2024
Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract | September 2024
Integrating vision and decision-making models with end-to-end trainable recurrent neural networks
Author Affiliations
  • Yu-Ang Cheng
    Brown University
  • Ivan Felipe Rodriguez
    Brown University
  • Takeo Watanabe
    Brown University
  • Thomas Serre
    Brown University
    Carney Institute for Brain Science
Journal of Vision September 2024, Vol.24, 775. doi:https://doi.org/10.1167/jov.24.10.775

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Historically, research in visual perception and perceptual decision-making has been pursued independently. Models of visual perception have primarily focused on developing neurocomputational mechanisms for visual processing, particularly in object and face recognition. These models, however, largely approximate only human accuracy levels and do not fully utilize reaction time data. In contrast, decision-making models have sought to replicate both accuracy and reaction times in human behavior, but they do not adequately address the underlying visual processing mechanisms. Here, we bridge this gap and introduce an integrated end-to-end trainable recurrent neural network model. First, we optimize a vision module, a convolutional neural network, for a well-known perceptual decision-making task, the random dot motion task (Britten et al., 1992). We show that fitting a straightforward nonlinear reaction time function (Goetschalckx et al., 2023) to the vision module outputs fails to capture the distributions of human reaction times for the same task. However, fitting the drift-diffusion model (Ratcliff & Rouder, 1998), a traditional cognitive model, significantly improves the goodness of fit. We then turn to a discrete-time recurrent neural network (RNN) approximation of the Wong-Wang circuit (Wong & Wang, 2006) for decision-making, which we optimize end-to-end together with the vision module using human behavioral data. We show that this combination offers a better fit to the experimental data. In addition, analyzing the weights of the resulting model yields novel insights into the time course of the underlying integration process and the image features driving these decisions. Our integrated RNN model of vision and decision-making represents a first step towards a complete computational model of perceptual decision-making.
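
To make concrete how a drift-diffusion model ties accuracy and reaction time together, the following is a minimal simulation sketch. It is not the authors' implementation or fitting procedure. The function names (`simulate_ddm`, `summarize`) and every parameter value (threshold, noise, time step, non-decision time) are illustrative assumptions. The drift rate stands in for the strength of the sensory evidence, which in a random dot motion task is commonly taken to grow with motion coherence.

```python
import random


def simulate_ddm(drift, threshold=1.0, noise=1.0, dt=0.001,
                 non_decision=0.3, max_time=5.0, rng=None):
    """Simulate one drift-diffusion trial with symmetric bounds at +/- threshold.

    Returns (choice, reaction_time). choice is 1 for the upper bound,
    0 for the lower bound, and None if no bound is reached by max_time.
    All default parameter values are illustrative assumptions.
    """
    rng = rng or random.Random()
    evidence, t = 0.0, 0.0
    step_sd = noise * dt ** 0.5  # Euler-Maruyama scaling of the noise term
    while t < max_time:
        # Accumulate deterministic drift plus Gaussian noise.
        evidence += drift * dt + step_sd * rng.gauss(0.0, 1.0)
        t += dt
        if evidence >= threshold:
            return 1, t + non_decision
        if evidence <= -threshold:
            return 0, t + non_decision
    return None, max_time + non_decision


def summarize(drift, n_trials=500, seed=0):
    """Return (accuracy, mean reaction time) over trials that reached a bound."""
    rng = random.Random(seed)
    trials = [simulate_ddm(drift, rng=rng) for _ in range(n_trials)]
    decided = [(c, rt) for c, rt in trials if c is not None]
    accuracy = sum(c for c, _ in decided) / len(decided)
    mean_rt = sum(rt for _, rt in decided) / len(decided)
    return accuracy, mean_rt


if __name__ == "__main__":
    # A positive drift makes the upper bound the "correct" choice.
    for drift in (0.5, 1.0, 2.0):
        acc, rt = summarize(drift)
        print(f"drift={drift:.1f}  accuracy={acc:.2f}  mean RT={rt:.2f}s")
```

In this sketch, a larger drift rate yields both higher accuracy and shorter reaction times, which is the joint pattern that accuracy-only vision models do not capture.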
