Abstract
Deep neural networks have recently revolutionized computer vision with their impressive performance on object recognition tasks. Their object representations have been found to match neural representations in the ventral visual pathway. But do deep neural networks see the way we do? This question is important because answering it will elucidate the conditions and computations under which perceptual phenomena arise in networks optimized for object classification. Here, we tested a state-of-the-art deep neural network (VGG-16) for four perceptual phenomena observed in humans: Weber's law, the Thatcher effect, mirror confusion, and the global advantage effect. According to Weber's law, humans are more sensitive to relative than to absolute changes in magnitude. Across VGG-16 layers, we measured pairwise distances between unit activations elicited by lines differing in length or intensity. Early layers were sensitive to absolute differences, but the later fully connected layers became sensitive to relative differences in length and intensity. In the Thatcher effect, humans are sensitive to part inversion in upright faces but not in inverted faces. Again, later layers of a face-trained network (VGG-face) showed greater sensitivity to changes in upright faces. In mirror confusion, humans find lateral mirror images more similar than horizontal mirror images. Here too, mirror confusion increased across the layers of the network. Finally, we tested the global advantage effect, in which humans are more sensitive to changes in global shape than in local shape. Here, deep networks showed the opposite trend: later layers were more sensitive to local changes in hierarchical stimuli. Thus, deep networks exhibit some but not all perceptual phenomena. We propose that closely comparing deep networks with human perception can yield interesting insights.
Meeting abstract presented at VSS 2018
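The Weber's law analysis described in the abstract can be sketched as follows. This is a minimal stand-in, not the authors' code: the activations below are synthetic feature vectors (constructed, for illustration, as a function of log-length so that distances track relative differences, as the abstract reports for the later fully connected layers), rather than features extracted from VGG-16.

```python
import numpy as np

# Stimulus set: lines of four lengths (arbitrary units).
lengths = np.array([10.0, 20.0, 40.0, 80.0])

# Hypothetical "layer activations": one feature vector per stimulus.
# Built from log-length so that pairwise distances scale with
# relative (not absolute) length differences.
rng = np.random.default_rng(0)
proj = rng.standard_normal((1, 50))
acts = np.log(lengths)[:, None] * proj  # shape (4, 50)

# All stimulus pairs, as in the abstract's pairwise-distance measure.
n = len(lengths)
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

dist = np.array([np.linalg.norm(acts[i] - acts[j]) for i, j in pairs])
abs_diff = np.array([abs(lengths[i] - lengths[j]) for i, j in pairs])
rel_diff = np.array([abs(np.log(lengths[i] / lengths[j])) for i, j in pairs])

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

# A Weber-like layer yields distances that correlate better with
# relative than with absolute length differences.
print(corr(dist, rel_diff), corr(dist, abs_diff))
```

For these log-scaled activations the distance-to-relative-difference correlation is perfect by construction, while the correlation with absolute differences is lower; applied to real layer activations, the same comparison would distinguish absolute-coding early layers from relative-coding later layers.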