Konrad E Prokott, Roland W Fleming; Predicting Human Perception of Glossy Highlights using Neural Networks. Journal of Vision 2019;19(10):297b. doi: https://doi.org/10.1167/19.10.297b.
© ARVO (1962-2015); The Authors (2016-present)
Human observers easily distinguish glossy from matte materials. Glossy materials reflect their surroundings and exhibit distinctive specular highlights. The importance of highlights for gloss perception has been demonstrated by their use for centuries in the visual arts, and by the observation that removing highlights from photographs leads to a matte surface appearance. However, the visual computations underlying gloss perception remain poorly understood. Here, we investigated how the visual system identifies specular highlights in images. This is challenging because a given bright spot in the image could be a surface texture marking, a light source, a caustic, or the result of many other physical events. Somehow the visual system has to identify that the bright spot is due to specular reflection, and then propagate this interpretation to surface regions where there is no local evidence that the surface is glossy. To test participants’ ability to identify highlights, we showed them computer renderings of glossy textured surfaces. Participants were asked to judge whether a given location in the image was a highlight or a texture marking. The results indicate that participants are excellent at this task, but that they occasionally make consistent errors. We then compared the observers’ judgements to several models, ranging from a simple intensity threshold to more complex neural networks trained to give pixel-wise output maps of the specular reflectance component of an image. Our results show that human responses can be well matched by a relatively shallow feed-forward convolutional neural network. We then compared model predictions to human responses on more challenging images in which the highlights are shown in the wrong locations and orientations relative to the matte components. Investigating the internal representations of the best models reveals a number of image measurements that could be the basis of human judgments.
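As a rough illustration of the simplest model class mentioned above, the sketch below labels a probed image location as a highlight when its intensity exceeds a fixed threshold, and then measures how often that label agrees with an observer's response. It is not the authors' implementation: the function names, the threshold value of 0.8, the toy image, and the human responses are all hypothetical and chosen only to make the example run.

```python
# Minimal sketch of an intensity-threshold baseline (hypothetical names and data).

def classify_location(image, row, col, threshold=0.8):
    """Label one probed pixel as 'highlight' if it is brighter than the threshold."""
    return "highlight" if image[row][col] > threshold else "texture"


def agreement(model_labels, human_labels):
    """Proportion of probed locations where model and human labels match."""
    matches = sum(m == h for m, h in zip(model_labels, human_labels))
    return matches / len(human_labels)


# Toy grayscale image with intensities in [0, 1] (made-up values).
image = [
    [0.10, 0.90, 0.30],
    [0.95, 0.20, 0.85],
]

# Probed (row, col) locations and made-up observer responses.
probes = [(0, 1), (1, 0), (1, 2), (0, 2)]
human_labels = ["highlight", "highlight", "texture", "texture"]

model_labels = [classify_location(image, r, c) for r, c in probes]
score = agreement(model_labels, human_labels)
print(model_labels, score)
```

In this toy case the threshold misclassifies the bright texture marking at location (1, 2), which is the kind of ambiguity that motivates comparing against models that use more than local intensity.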