Abstract
Over the last two decades, rodents have emerged as a dominant model for visual neuroscience. This is particularly true for earlier levels of information processing, but high-profile papers using advanced behavioral paradigms have suggested that higher levels of processing, such as invariant object recognition, also occur in rodents. Nevertheless, it remains unknown whether, and to what extent, an abstraction beyond primary visual cortex was required to perform these object recognition tasks. Here we provide a quantitative and comprehensive assessment of the claims of higher levels of processing by comparing a wide range of rodent behavioral and neural data with convolutional deep neural networks trained on object recognition.
We find that data from earlier studies meant to probe high-level vision in rodents can be explained by low- to mid-level convolutional representations that fall short of the complexity of the representations underlying object recognition in primates. For example, successful generalization to novel sizes and viewpoints of two rendered objects (Zoccolan et al., 2009) could already be captured by the first convolutional layer, although later layers explained more variance in stimulus-level performance patterns. In contrast, later convolutional layers were required to capture generalization to novel natural video category exemplars (Vinken et al., 2014). Consistent with this finding, later convolutional layers matched the representational geometry of extrastriate areas in rat visual cortex increasingly well. Our approach also yields surprising insights into previous assumptions, for example the assumption that the best-performing animals are the ones using the most complex representations (Djurdjevic et al., 2018), which we show is likely incorrect.
Overall, our findings support a mid-level complexity of rodent object vision and suggest a road ahead for further studies aiming to quantify and establish the richness of the representations underlying information processing in animal models at large.