Abstract
Translucent materials have a wide variety of appearances due to variability in scattering properties, geometry, and lighting conditions. Most previous studies used rendered images as stimuli. However, it is challenging to acquire accurate physical parameters to render materials with realistic appearance. Here, we investigate to what extent Generative Adversarial Networks (GANs) trained on unlabeled photographs of translucent materials can produce perceptually realistic images with diverse appearances, without access to the underlying physical parameters. We created a dataset of 3000 photographs of soaps with a variety of translucent appearances. We trained StyleGAN2-ADA, a generative network designed to be trained on limited data using adaptive data augmentation, on these photographs. We then conducted human psychophysical experiments to measure the perceived quality of the model's output. In Experiment 1, we sampled 250 images from the real photographs and another 250 from the generated images, and asked observers to judge whether the soap in each image was real or fake after a brief 300 ms presentation. In Experiment 2, observers rated the level of translucency of the material for the same 500 images on a 5-point scale. Ten observers completed both experiments in sequence. First, we find that observers correctly judge the vast majority of real photographs (73% of the real images were correctly judged by at least 9 observers), but they make substantial mistakes on the generated images (60% of the fake images were judged to be real by at least 2 observers, and 7% were judged to be real by more than 5 observers). Second, observers perceive a range of translucency in the generated images, from opaque to transparent, similar to that of the real photographs. Our results suggest that StyleGAN2-ADA has the potential to learn a representation of translucent appearance similar to that of humans, and that exploring its latent space may help disentangle material-related features.
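The per-image consensus statistics reported above could, for instance, be computed from the raw real/fake responses as in the following minimal sketch; the array names (`judgments`, `is_real`) and the randomly generated placeholder data are illustrative assumptions, not the study's actual analysis code.

```python
import numpy as np

# Hypothetical layout: judgments[i, o] == 1 if observer o responded "real"
# to image i, 0 if "fake"; is_real[i] marks the ground truth of image i.
# Shapes assume 500 images (250 real + 250 generated) and 10 observers.
rng = np.random.default_rng(0)
judgments = rng.integers(0, 2, size=(500, 10))    # placeholder responses
is_real = np.array([True] * 250 + [False] * 250)  # placeholder labels

# Number of observers who responded "real" for each image.
votes_real = judgments.sum(axis=1)

# Real photographs correctly judged "real" by at least 9 of 10 observers.
real_correct = np.mean(votes_real[is_real] >= 9)

# Generated images mistaken for real by at least 2 (or more than 5) observers.
fake_fooled_2 = np.mean(votes_real[~is_real] >= 2)
fake_fooled_5 = np.mean(votes_real[~is_real] > 5)

print(f"real judged real by >=9 observers: {real_correct:.0%}")
print(f"fake judged real by >=2 observers: {fake_fooled_2:.0%}")
print(f"fake judged real by >5 observers:  {fake_fooled_5:.0%}")
```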