Abstract
Core object recognition refers to the ability to rapidly recognize objects in natural scenes across identity-preserving transformations, such as variation in perspective, size or lighting. In laboratory object recognition tasks using 2D images, adults and Convolutional Neural Networks (CNNs) perform close to ceiling. However, while current CNNs perform poorly on distorted images, adults' performance is robust against a wide range of distortions. It remains an open question whether this robustness results from superior information representation and processing in the human brain, or from extensive experience (training) with distorted visual input during childhood. In the latter case, we would expect robustness to be low in early childhood and to increase with age. Here we investigated the developmental trajectory of core object recognition robustness. We first evaluated children's and adults' object classification performance on undistorted images and then systematically tested how recognition accuracy degrades when images are distorted by salt-and-pepper noise, eidolons, or texture-shape conflicts. Based on 22,000 psychophysical trials collected in children aged 4–15 years, our results show three things. First, although overall performance improves with age, even the youngest children showed remarkable robustness and outperformed standard CNNs on moderately distorted images. Second, weaker overall performance in younger children is due to weak performance on a small subset of image categories, not to reduced performance across all categories. Third, when recognizing objects, children, like adults but unlike standard CNNs, rely heavily on shape cues rather than texture cues. Our results suggest that robustness emerges early in the developmental trajectory of human object recognition and is already in place by the age of four. The robustness gap between humans and standard CNNs thus cannot be explained by a mere accumulation of experience with distorted visual input, and is more likely explained by a difference in visual information representation and processing.