Abstract
What key factors of deep neural networks (DNNs) account for their representational similarity to visual cortex? Many properties that neuroscientists have proposed as critical, such as architecture or training task, have turned out to have surprisingly little explanatory power. Instead, there appears to be a high degree of “degeneracy,” as many DNNs with distinct designs yield equally good models of visual cortex. Here, we suggest that a more global perspective is needed to understand the relationship between DNNs and the brain. We reasoned that the most essential visual representations are general-purpose and thus naturally emerge from systems with diverse architectures or neuroanatomies. This leads to a specific hypothesis: it should be possible to identify a set of canonical dimensions, extensively learned by many DNNs, that best explain cortical visual representations. To test this hypothesis, we developed a novel metric, called canonical strength, that quantifies the degree to which a representational feature in a DNN can be observed in the latent space of many other DNNs with varied construction. We computed this metric for every principal component (PC) from a large and diverse population of trained DNN layers. Our analysis showed a strong positive association between a dimension’s canonical strength and its representational similarity to both human and macaque visual cortices. Furthermore, we found that the representational similarity between visual cortex and the PCs of a DNN layer, or any set of orthogonal DNN dimensions, is well predicted by the simple summation of their canonical strengths. These results support our theory that canonical visual representations extensively emerge across brains and machines, suggesting that “degeneracy” is, in fact, a signature of broadly useful visual representations.
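To make the idea of canonical strength concrete, the sketch below shows one plausible way such a score could be computed; the abstract does not specify the exact estimator, so the function names (`layer_pcs`, `canonical_strength`), the choice of ridge regression, and the use of cross-validated R² as the measure of how well a PC is "observed" in other models' latent spaces are all illustrative assumptions, not the authors' implementation.

```python
"""
Illustrative sketch (not the paper's code) of a canonical-strength score,
assuming:
  * each model's layer activations over a shared stimulus set are given as a
    (n_stimuli x n_units) NumPy array,
  * a PC of a reference layer is scored by how well it can be linearly read
    out (cross-validated R^2) from the latent spaces of other models.
"""
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score


def layer_pcs(activations: np.ndarray, n_components: int = 10) -> np.ndarray:
    """Return the top principal-component scores of one layer's activations."""
    return PCA(n_components=n_components).fit_transform(activations)


def canonical_strength(pc_scores: np.ndarray,
                       other_model_activations: list[np.ndarray]) -> float:
    """
    Average cross-validated R^2 with which a single PC (pc_scores, shape
    (n_stimuli,)) can be predicted from each other model's latent space.
    Higher values indicate a dimension shared across many models.
    """
    r2_per_model = []
    for acts in other_model_activations:
        ridge = RidgeCV(alphas=np.logspace(-3, 3, 13))
        scores = cross_val_score(ridge, acts, pc_scores, cv=5, scoring="r2")
        r2_per_model.append(scores.mean())
    return float(np.mean(r2_per_model))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_stimuli = 200
    # Toy stand-ins for layer activations from a reference DNN and three
    # differently constructed DNNs; real usage would pass actual activations
    # over a shared stimulus set. With random data, scores will sit near zero.
    reference = rng.normal(size=(n_stimuli, 512))
    others = [rng.normal(size=(n_stimuli, d)) for d in (256, 384, 640)]

    pcs = layer_pcs(reference, n_components=5)
    for i in range(pcs.shape[1]):
        cs = canonical_strength(pcs[:, i], others)
        print(f"PC {i + 1}: canonical strength ≈ {cs:.3f}")
```

Under this reading, the abstract's final result corresponds to predicting a layer's overall similarity to visual cortex by summing these per-PC scores over any orthogonal set of dimensions; the precise metric used in the paper may differ.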