Abstract
A recent study claimed that the initially poor visual acuity of infants might be critical for the visual system to learn to integrate information over larger spatial scales. Specifically, when convolutional neural networks (CNNs) were initially trained on blurry face images, followed by progressively clearer images, the networks acquired robustness to variations in spatial resolution (Vogelsang et al., 2018). Here, we asked whether initial training with blurry images would confer a similar benefit to general object recognition, possibly offering a clue to the robust nature of human vision. To address this question, we trained AlexNet on 1,000 ImageNet object categories while gradually increasing the resolution of the inputs over training epochs. Although initial training with blurry objects led to good performance on blurry test images, this robustness to blur disappeared soon after training with clear images began. These findings deviated from the robustness to blur that occurred for a CNN trained with face images. Curiously, however, we found that a CNN trained concurrently on faces and objects, progressing from blurry to clear, lost the ability to recognize both blurry faces and blurry objects. This problem of catastrophic forgetting can be attributed to the fact that proportionally greater discriminating information resides in the higher spatial frequencies of objects as compared to faces. As the CNN learns to leverage this finer-scale information, it tends to lose the ability to leverage information at coarser spatial scales for both objects and faces. While our findings do not rule out the possibility that poor initial acuity might lead to important developmental benefits in face recognition ability, they do call into question whether people’s ability to recognize blurry faces and objects in adulthood can be explained by these early experiences alone.
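The blurry-to-clear training regime described above can be illustrated with a minimal sketch of a per-epoch blur schedule. The abstract does not specify how resolution was increased, so the linear ramp, the starting blur level, and the function name `blur_sigma_schedule` are all assumptions made for illustration, not the procedure used in the study.

```python
def blur_sigma_schedule(num_epochs, start_sigma=8.0, end_sigma=0.0):
    """Return one Gaussian-blur sigma (in pixels) per training epoch.

    Assumption: blur decreases linearly from start_sigma to end_sigma.
    The default values are hypothetical, not taken from the study.
    """
    if num_epochs < 1:
        raise ValueError("num_epochs must be at least 1")
    if num_epochs == 1:
        # A single epoch has no ramp; use the final (clearest) setting.
        return [end_sigma]
    step = (start_sigma - end_sigma) / (num_epochs - 1)
    return [start_sigma - step * epoch for epoch in range(num_epochs)]


if __name__ == "__main__":
    # Print the blur level that would be applied to inputs at each epoch.
    for epoch, sigma in enumerate(blur_sigma_schedule(5)):
        print(f"epoch {epoch}: blur sigma = {sigma:.1f}")
```

Under this sketch, each epoch's training images would be blurred with the corresponding sigma before being passed to the network, with a sigma of 0 meaning the images are left unblurred.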