Abstract
Deep neural networks (DNNs) trained for object recognition exhibit similarities to neural responses in the monkey visual cortex and are currently considered the best models of the primate visual system. It remains unclear, however, whether psychophysical effects such as the illusory contours perceived by humans also emerge in these models. Exploiting the invertibility properties of robustly trained feedforward neural networks, we demonstrate that illusory contours and shapes emerge when the network integrates its learned implicit priors. Our visual system is believed to store perceptual priors, with visual information learned and embedded in neural connections across all visual areas. This stored information is harnessed when needed, for instance during occlusion resolution or the generation of visual imagery. While the importance of feedback connections in these processes is well recognized, the precise neural mechanism that aggregates information dispersed throughout the visual cortex remains elusive. In this study, we leverage a ResNet50, a network conventionally used in image recognition, to shed light on the neural basis of illusory contour perception through the feedback inherent in error backpropagation. By iteratively accumulating the gradients of the loss with respect to an input image (a Kanizsa square) in an adversarially trained network, we observed the emergence of edge-like patterns in the region of the perceived 'white square'. This process, which unfolds over multiple iterations, echoes the time-dependent emergence of illusory contours in the visual cortices of rodents and primates reported in experimental studies. Notably, the ResNet50 employed in this study was neither specifically endowed with feedback capabilities nor optimized to detect or decode illusory contours; it was trained only for object recognition robust to adversarial examples. These findings highlight a compelling parallel, suggesting that the perception of illusory contours may be an incidental consequence of training the network to withstand adversarial noise.
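As a rough illustration of the procedure summarized above, the following minimal sketch iteratively accumulates input gradients of a classification loss for a ResNet50 in PyTorch. It is not the authors' exact method: the standard torchvision ImageNet weights stand in for the adversarially trained model (robust weights would be loaded in their place), and the target class, step size, gradient normalization, and iteration count are illustrative assumptions.

import torch
import torch.nn.functional as F
from torchvision import models

# Stand-in for the adversarially trained ResNet50 (assumption: robust
# weights would be loaded here instead of the standard ImageNet weights).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.eval()

image = torch.rand(1, 3, 224, 224)   # placeholder for a Kanizsa-square stimulus
target = torch.tensor([0])           # hypothetical target class (assumption)
step_size = 0.01                     # illustrative step size
num_iters = 50                       # illustrative iteration count

x = image.clone()
for _ in range(num_iters):
    x.requires_grad_(True)
    loss = F.cross_entropy(model(x), target)
    grad, = torch.autograd.grad(loss, x)        # gradient of the loss w.r.t. the input
    with torch.no_grad():
        grad = grad / (grad.norm() + 1e-8)      # normalize so each step has comparable size
        x = (x - step_size * grad).clamp(0, 1)  # descend the loss in input space

perturbation = x - image   # accumulated input change; inspected for edge-like structure

In this kind of setup, the accumulated perturbation is what would be examined for edge-like patterns inside the illusory 'white square' region of the stimulus.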