Abstract
Unlike artificial neural networks (ANNs), human object recognition is robust in degraded conditions. Accumulating evidence suggests that recurrent connections within the ventral visual stream are necessary in such conditions. Indeed, incorporating recurrence within ANNs significantly improves their performance. Nevertheless, despite the success of recurrent ANNs over purely feedforward ones, the recognition abilities of these models on degraded objects lag far behind those of human adults. Why is the human visual system impervious to such conditions? In a novel approach to answering this question, we compared the recognition abilities of state-of-the-art ANNs to those of 4- and 5-year-old children. Although children show impressive object recognition abilities, it remains unknown how robust these abilities are when objects are degraded or presented under speeded conditions. Children (N = 84) were tested on a challenging object recognition task that required them to identify rapidly presented object outlines (100 to 300 ms; forward and backward masked) that had perturbed or illusory contours. We found that even the youngest children successfully identified both perturbed and illusory outlines at the fastest speeds, despite the forward and backward masking. By contrast, neither a feedforward model (VGG19) nor a model that approximates recurrence (ResNet101) showed performance comparable to that of children. Thus, despite receiving exponentially more supervised object training than children (Zador, 2019), ANNs fall short of children's recognition abilities. We suggest that, from early in development, robust object recognition in humans may be supported by parallel feedforward processes in the dorsal stream, in addition to recurrent processes in the ventral stream.