Abstract
We encounter various materials every day (e.g., plastics, soaps, and stones) and often need to estimate the attributes of novel materials in new environments. How do humans judge material properties across different categories of materials, which share common but also distinctive visual characteristics? We hypothesize that the features humans use to perceive materials overlap across categories, even though each material has idiosyncratic visual characteristics tied to its physical properties. We previously demonstrated that the unsupervised image-synthesis model StyleGAN can generate perceptually convincing images of translucent objects (e.g., soaps). Here, using transfer learning, we test whether a model pretrained on a soap dataset provides critical information for learning a different material, crystals. Specifically, we transfer the pretrained StyleGAN from a large soap dataset to a relatively small crystal dataset via full-model fine-tuning. With little training time, we obtain a new generator that synthesizes realistic crystals. The layer-wise latent spaces of both material domains show similar scale-specific semantic representations: early layers represent coarse spatial-scale features (e.g., the object’s contour), middle layers represent mid-level spatial-scale features (e.g., material), and later layers represent fine spatial-scale features (e.g., color). Notably, by swapping weights between the soap and crystal generators, we find that the multiscale generative processes of the two materials (spanning 4-by-4 to 1024-by-1024 resolution) differ mainly in the coarse spatial-scale convolution layers. Convolution weights at the 32-by-32 generative stage determine the critical difference in the geometry of the two materials, whereas convolution weights at the 64-by-64 stage decode characteristics the materials share (e.g., translucency). Moreover, without additional training, we can create new material appearances that combine visual features of both training categories. Together, we show that distinct material categories share overlapping latent image features, and that learning features from one material benefits learning a new material.
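The weight-swapping analysis can be made concrete with a minimal PyTorch sketch. It assumes the two fine-tuned generators share an identical StyleGAN architecture (so their state dicts share keys) and that parameter names encode the resolution of each synthesis block, as in common StyleGAN2 ports where keys look like "synthesis.b32.conv0.weight". The checkpoint filenames and the `b{res}` key pattern below are illustrative assumptions, not the authors' released code.

```python
# Sketch: build a hybrid generator by swapping one resolution stage's
# convolution weights from the crystal generator into the soap generator.
# Assumes state-dict keys of the form "...b{res}.conv....weight" (hypothetical).
import copy
import torch

def swap_resolution_block(soap_state, crystal_state, resolution):
    """Return a copy of the soap generator's state dict in which every
    convolution parameter of the given synthesis stage (e.g. 32 for the
    32x32 block) is replaced by the crystal generator's parameter."""
    hybrid = copy.deepcopy(soap_state)
    tag = f".b{resolution}."  # e.g. ".b32." marks the 32x32 synthesis block
    for key in hybrid:
        if tag in key and "conv" in key:
            hybrid[key] = crystal_state[key].clone()
    return hybrid

# Hypothetical checkpoints of the two fine-tuned generators.
soap_state = torch.load("soap_generator.pt")
crystal_state = torch.load("crystal_generator.pt")

# Per the abstract's finding, swapping the 32x32 convolutions should transfer
# crystal-like geometry onto soap images, whereas swapping the 64x64 stage
# should change appearance little, since that stage encodes shared
# characteristics such as translucency.
hybrid_state = swap_resolution_block(soap_state, crystal_state, resolution=32)
```

Loading `hybrid_state` into a generator of the same architecture and sampling from it would then produce the mixed-material appearances described above, with no additional training.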