December 2022
Volume 22, Issue 14
Open Access
Vision Sciences Society Annual Meeting Abstract
Simultaneous Localization and Size Discrimination Modeling via Convolutional Neural Network
Author Affiliations
  • Rina Lu
    University of California, Berkeley
  • Zhihang Ren
    University of California, Berkeley
  • Zixuan Wang
    University of California, Berkeley
  • Stella X. Yu
    University of California, Berkeley
  • David Whitney
    University of California, Berkeley
Journal of Vision December 2022, Vol.22, 4460. doi:

      Rina Lu, Zhihang Ren, Zixuan Wang, Stella X. Yu, David Whitney; Simultaneous Localization and Size Discrimination Modeling via Convolutional Neural Network. Journal of Vision 2022;22(14):4460.

      © ARVO (1962-2015); The Authors (2016-present)

Among the many models of perception that have been proposed, most treat visual tasks independently (Maximilian et al., 2000; Ma et al., 2011). However, the human visual system is an interconnected hierarchical network in which many neurons in visual cortex are shared across the processing of various visual features, which are then used by downstream processes. Inspired by this characteristic of the human visual system, we propose a novel way to simplify models so that they generalize across different visual tasks: sharing the feature encoder. We tested this framework based on recent psychophysical findings showing that human observers' localization and perceived size are highly correlated (Wang et al., 2020). In this study, we used a convolutional neural network to model localization and size-perception tasks simultaneously. The localization task was to report the location of briefly presented noise patches; the size-perception task was to judge whether an arc shown on the screen was shorter or longer than the average length of all previously seen arcs. Unlike traditional multi-task neural networks, in which the inputs are the same across tasks, our model tolerates different types of visual stimuli. The model comprises one shared feature encoder, one regressor for localization, and one classifier for size discrimination. During training, the encoder and the regressor are first trained on the localization task; the classifier is then fine-tuned on the size-discrimination task. Surprisingly, even though the encoder was never trained on the size-discrimination task and the visual stimuli for the two tasks looked entirely different, our model exceeded human performance on both tasks. This approach offers a possible way to simplify multi-task computational models via shared features and provides insight into the joint modeling of visual processes.
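The shared-encoder architecture described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: all class names, layer sizes, and stimulus dimensions are assumptions, dense layers stand in for the convolutional encoder for brevity, and the two-stage training loops are elided as comments.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    # simple rectifying nonlinearity used between layers
    return np.maximum(x, 0.0)


class SharedEncoder:
    """Stand-in for the shared convolutional feature encoder.

    A single dense layer is used here for brevity; the abstract
    describes a CNN encoder shared by both tasks.
    """

    def __init__(self, d_in, d_feat):
        self.W = rng.normal(0.0, 0.1, (d_in, d_feat))

    def __call__(self, x):
        return relu(x @ self.W)


class LocalizationRegressor:
    """Maps shared features to a 2-D (x, y) location estimate."""

    def __init__(self, d_feat):
        self.W = rng.normal(0.0, 0.1, (d_feat, 2))

    def __call__(self, feats):
        return feats @ self.W


class SizeClassifier:
    """Maps shared features to shorter/longer-than-average logits."""

    def __init__(self, d_feat):
        self.W = rng.normal(0.0, 0.1, (d_feat, 2))

    def __call__(self, feats):
        return feats @ self.W


# Stage 1: encoder + regressor are trained on the localization task.
# Stage 2: encoder is frozen; only the classifier is fine-tuned on
#          size discrimination. (Training loops omitted in this sketch.)
encoder = SharedEncoder(d_in=64 * 64, d_feat=128)
regressor = LocalizationRegressor(d_feat=128)
classifier = SizeClassifier(d_feat=128)

# The two tasks receive different stimulus types through the same encoder.
noise_patch = rng.normal(size=(1, 64 * 64))  # localization stimulus
arc_image = rng.normal(size=(1, 64 * 64))    # size-discrimination stimulus

predicted_location = regressor(encoder(noise_patch))  # shape (1, 2)
size_logits = classifier(encoder(arc_image))          # shape (1, 2)
```

The key design point the sketch illustrates is that both task heads consume features from the same frozen encoder, even though their input stimuli differ.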

