August 2023
Volume 23, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract
Can deep neural networks for intrinsic image decomposition model human lightness constancy?
Author Affiliations & Notes
  • Alban Flachot
    York University
  • Jaykishan Patel
    York University
  • Khushbu Patel
    York University
  • Tom S. A. Wallis
    TU Darmstadt
  • Marcus Brubaker
    York University
  • David H. Brainard
    University of Pennsylvania
  • Richard F. Murray
    York University
  • Footnotes
    Acknowledgements: Funded by a VISTA postdoctoral fellowship to Alban Flachot.
Journal of Vision August 2023, Vol. 23, 5467. doi: https://doi.org/10.1167/jov.23.9.5467
Alban Flachot, Jaykishan Patel, Khushbu Patel, Tom S. A. Wallis, Marcus Brubaker, David H. Brainard, Richard F. Murray; Can deep neural networks for intrinsic image decomposition model human lightness constancy? Journal of Vision 2023;23(9):5467. https://doi.org/10.1167/jov.23.9.5467.

© ARVO (1962-2015); The Authors (2016-present)

Abstract

A challenge in vision science is understanding how the visual system parses the retinal image to represent intrinsic properties of scenes, such as surface reflectance and lighting. Deep learning networks have provided successful new approaches to inferring intrinsic images, and here we investigate these networks as models of human lightness constancy. We examined two state-of-the-art architectures for intrinsic image decomposition (Yu & Smith, 2019; Li et al., 2020), trained on photorealistic images of synthetic scenes. To compare network and human performance, we measured the networks’ estimates of surface reflectance using Mondrian patterns embedded in an indoor scene. A reference patch was shown under a fixed illuminant, and multiple test patches were shown under five different illumination levels. At each illumination level, we rendered 17 reflectance levels of the test patch and interpolated the networks' estimates for each to find a reflectance match to the reference, thus probing the networks’ lightness constancy. We repeated this procedure for three different reference reflectances. We also tested human observers in a corresponding lightness matching task, using the same stimuli presented with a virtual reality display. Human observers showed good lightness constancy, with an average constancy index (CI) of 0.81 across all stimuli. They were also consistent across conditions, with CI standard deviations around 0.10 across reflectance and lighting conditions. The deep learning networks, however, showed poor reflectance constancy, with an average CI of 0.19. Qualitative analysis suggests that the networks often misinterpreted lighting changes as reflectance changes. The networks were also less consistent than humans, with CI standard deviations of 0.34 and 0.21 across reflectance and lighting conditions, respectively. These results show that these deep learning networks do not fully model human lightness constancy. We will discuss potential strategies to address this shortcoming, as well as proposals for further benchmarking such networks.
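
To make the matching procedure concrete, here is a minimal sketch, in Python with NumPy, of how a reflectance match and a constancy index could be computed from a network's reflectance estimates. The function names, the use of linear interpolation, and the Brunswik-style index are illustrative assumptions; the abstract does not specify the authors' exact implementation.

    import numpy as np

    def reflectance_match(test_reflectances, network_estimates, reference_estimate):
        # Interpolate across the rendered test reflectances (17 levels per
        # illumination condition in the abstract) to find the reflectance whose
        # network estimate equals the network's estimate of the reference patch.
        est = np.asarray(network_estimates, dtype=float)
        refl = np.asarray(test_reflectances, dtype=float)
        order = np.argsort(est)  # np.interp requires increasing x-coordinates
        return np.interp(reference_estimate, est[order], refl[order])

    def constancy_index(match, reference, luminance_match):
        # Brunswik-style ratio (an assumption; the abstract does not name the
        # index used): 1.0 = perfect constancy (the match equals the reference
        # reflectance), 0.0 = pure luminance matching.
        return (match - luminance_match) / (reference - luminance_match)

    # Hypothetical example for one test illuminant.
    test_refl = np.linspace(0.05, 0.85, 17)   # rendered test reflectance levels
    estimates = 0.4 * test_refl + 0.12        # stand-in network outputs
    match = reflectance_match(test_refl, estimates, reference_estimate=0.30)
    ci = constancy_index(match, reference=0.45, luminance_match=0.90)

On such a scale, the human observers' mean CI of 0.81 indicates matches close to the true reference reflectance, while the networks' mean of 0.19 indicates matches that largely track luminance rather than reflectance.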
