Abstract
Visual representations learned by convolutional neural networks (CNNs) share some similarity in representational structure with neural representations in the primate ventral visual stream (e.g., Yamins et al., 2014). However, the organization of low-level feature representations in CNNs has not been extensively characterized. Understanding whether CNNs develop idiosyncrasies that mimic the properties of the primate visual system is important for developing models that can inform our understanding of the brain. Additionally, because many aspects of CNN representations are acquired through training, examining the feature representations of CNNs is a useful tool for determining which properties of the primate brain might be innate and which are likely to be acquired through experience. Here, we focus on orientation perception, a well-understood aspect of the primate visual system. We asked whether convolutional neural networks trained to perform object recognition on a natural image database would exhibit an “oblique effect” such that cardinal (vertical and horizontal) orientations are represented with higher precision than oblique (diagonal) orientations, as has been measured in the brain and behavior of primates. We obtained activation patterns from a pre-trained VGG-16 network (Simonyan & Zisserman, 2014) presented with oriented grating stimuli, and used a Euclidean distance metric to measure the discriminability between patterns corresponding to different pairs of orientations. In agreement with human perception, we found that orientation discriminability generally peaked around the cardinal orientations. This effect emerged at middle layers of the VGG-16 network. Its magnitude increased with stimulus spatial frequency, but decreased with stimulus uncertainty. We also trained networks from scratch using images from the ImageNet database (Deng et al., 2009) that had been rotated by varying increments.
Overall, our findings suggest that cardinality effects in human visual perception are not dependent on a hard-wired anatomical bias, but can instead emerge through experience with the statistics of natural images.
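
The discriminability measure described above (Euclidean distance between activation patterns evoked by gratings of neighboring orientations) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`make_grating`, `euclidean_distance`, `discriminability_curve`), the 32 x 32 image size, the spatial frequency, and the `features` callable are all assumptions made for illustration. In a real analysis, `features` would return the activations of a chosen VGG-16 layer; here it is any function that maps a flat list of pixel values to a flat list of numbers.

```python
import math


def make_grating(orientation_deg, size=32, cycles=4.0):
    """Return a size x size cosine grating as a flat list of values in [-1, 1].

    A cosine (even) profile is used so that orientations 0 and 180 degrees
    produce the same image, making orientation circular over 180 degrees.
    """
    theta = math.radians(orientation_deg)
    center = (size - 1) / 2.0
    pixels = []
    for y in range(size):
        for x in range(size):
            # Signed distance of the pixel along the grating's modulation axis.
            u = (x - center) * math.cos(theta) + (y - center) * math.sin(theta)
            pixels.append(math.cos(2.0 * math.pi * cycles * u / size))
    return pixels


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length activation patterns."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def discriminability_curve(features, orientations):
    """Distance between each orientation's pattern and its next neighbor.

    `features` is a hypothetical stand-in for a network layer: it maps a flat
    pixel list to a flat activation list. The last orientation wraps around
    to the first, assuming the list spans 0 to 180 degrees in equal steps.
    """
    patterns = [features(make_grating(o)) for o in orientations]
    n = len(orientations)
    return {
        o: euclidean_distance(patterns[i], patterns[(i + 1) % n])
        for i, o in enumerate(orientations)
    }
```

For example, `discriminability_curve(lambda px: px, list(range(0, 180, 15)))` computes the curve directly on pixel values, which only exercises the code and says nothing about the oblique effect. Testing for that effect would require passing a function that returns network activations and comparing the distances near 0 and 90 degrees with those near 45 and 135 degrees.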