September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2021
Bayesian interpretation of artificial neural network models in perception
Author Affiliations & Notes
  • Cheng Qiu
    University of Pennsylvania
  • Alan Stocker
    University of Pennsylvania
  • Footnotes
    Acknowledgements  NSF grant IIS-1912232
Journal of Vision September 2021, Vol.21, 2712. doi:
      Cheng Qiu, Alan Stocker; Bayesian interpretation of artificial neural network models in perception. Journal of Vision 2021;21(9):2712.

      © ARVO (1962-2015); The Authors (2016-present)


Bayesian observer models and artificial neural networks (ANNs) in computer vision operate on the same general premise that vision reflects optimal behavior. Thus, a Bayesian interpretation of ANNs could provide an intuitive understanding of the networks' computational properties, as well as insights into how Bayesian computations can emerge through algorithmic learning. We explored such an interpretation for a recently proposed ANN model of motion perception (Rideaux & Welchman, 2020). The network, trained to identify translational motion categories of natural images, showed perceptual biases toward slow speeds similar to those observed in human subjects. The authors note, however, that because the distribution of training samples was uniform across all motion categories, these biases cannot be due to a slow-speed prior. We demonstrate that the geometry of the feature space is crucial for making a correct Bayesian interpretation. Although the distribution of training samples was uniform across categories, it did not correspond to a uniform distribution in 2D velocity space because the chosen categories were not equidistant in this space. Similarly, the categorical loss function (cross-entropy) penalized low-speed errors more strongly because the categories were closer together at low speeds. Both aspects led to an over-representation of slow-speed motion, thus effectively embedding a slow-speed prior. We show that, once the geometry of the feature space is correctly accounted for, the ANN-estimated speeds agree with predictions from a Bayesian observer model. Furthermore, we show that the amount of sensory uncertainty depends on the architecture of the network (i.e., its resources); for example, the kernel size of the convolutional layer determines the likelihoods of the motion stimuli.
Together, our results show that ANN models of perception can be interpreted as an algorithmic implementation of a Bayesian inference process given resource constraints and the proper combination of prior, likelihood, and loss structure of the task.
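
The geometric argument above can be illustrated with a minimal sketch. It is not the authors' analysis: the geometrically spaced speed categories, the Gaussian likelihood in linear speed, and the function names `implied_density` and `posterior_mean` are all assumptions made for illustration, and the abstract does not state the actual category values. The sketch shows that equal probability mass per category implies a density over speed that is highest where categories are closest, and that a posterior-mean estimate over such categories is pulled toward slow speeds.

```python
import numpy as np


def implied_density(category_speeds):
    """Density over speed implied by equal probability mass per category.

    Each category is assigned the interval between the midpoints to its
    neighbours (end intervals are extended symmetrically). Equal mass divided
    by interval width gives the implied density at each category.
    """
    s = np.asarray(category_speeds, dtype=float)
    mid = 0.5 * (s[1:] + s[:-1])
    edges = np.concatenate(
        ([s[0] - (mid[0] - s[0])], mid, [s[-1] + (s[-1] - mid[-1])])
    )
    widths = np.diff(edges)
    mass = np.full(s.size, 1.0 / s.size)  # uniform over categories
    return mass / widths, widths


def posterior_mean(measured, category_speeds, sigma):
    """Posterior-mean speed with a uniform prior over the categories.

    The likelihood is an assumed Gaussian in linear speed with width sigma.
    """
    s = np.asarray(category_speeds, dtype=float)
    likelihood = np.exp(-0.5 * ((s - measured) / sigma) ** 2)
    posterior = likelihood / likelihood.sum()
    return float(np.sum(s * posterior))


# Hypothetical categories: closer together at low speeds (geometric spacing).
geometric = np.geomspace(0.5, 8.0, 9)
# Control: the same number of categories, equidistant in speed.
equidistant = np.linspace(0.0, 8.0, 9)

density, widths = implied_density(geometric)

# Both grids contain a category at speed 4.0, which is the simulated measurement.
estimate_geometric = posterior_mean(4.0, geometric, sigma=1.0)
estimate_equidistant = posterior_mean(4.0, equidistant, sigma=1.0)

print("implied density:", np.round(density, 3))
print("estimate, geometric categories:  ", round(estimate_geometric, 3))
print("estimate, equidistant categories:", round(estimate_equidistant, 3))
```

Under these assumptions, the implied density decreases with speed, and the estimate on the geometric grid falls below the measured speed, while the equidistant grid yields an unbiased estimate for a measurement at its center. Widening `sigma` plays the role of greater sensory uncertainty, which the abstract attributes to network architecture.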

