Abstract
Many visual illusions are contextual by nature. In the orientation-tilt illusion, a central grating’s orientation is perceived as being repulsed from or attracted to the orientation of a surrounding grating. An open question for vision science is whether such illusions reflect basic limitations of the visual system, or whether they correspond to corner cases of neural computations that are efficient in everyday settings. Our starting point for investigating the computational role of the visual tilt illusion is the neural circuit model of classical (CRF) and extra-classical receptive fields (eCRFs) by Mély et al. (2018). That model was constrained by anatomical and physiological data and shown to be consistent with a host of contextual illusions spanning visual modalities. We developed a machine-learning approximation of the circuit, which we call the feedback gated recurrent unit (fGRU), resulting in a recurrent circuit that can be embedded in modern deep neural network architectures. Unlike the original circuit, the fGRU implements hierarchical contextual interactions through task-optimized horizontal (within a layer) and/or top-down (between layers) connections. We trained fGRU networks to detect object contours in natural scenes. These networks were more sample-efficient than state-of-the-art deep neural network models, while also exhibiting an orientation-tilt illusion consistent with human perception. Correcting this illusion significantly reduced model performance, driving a preference towards low-level edges over high-level object boundaries. Overall, the present work provides direct evidence that the tilt illusion is a feature, not a bug, of neural computations optimized for contour detection.
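To make the abstract's description concrete, the following is a minimal sketch, not the authors' implementation, of a GRU-style recurrent update whose hidden state is modulated by horizontal (within-layer) contextual interactions, in the spirit of the fGRU. The single-channel setup, the 3x3 averaging kernel, and the specific gating form are all illustrative assumptions.

```python
# Toy sketch of a gated recurrent unit with horizontal (within-layer)
# contextual suppression. Assumptions: single feature channel, scalar
# gate weights, hand-set local-averaging kernel.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def horizontal(h, kernel):
    """Aggregate activity from spatial neighbors (a stand-in for
    horizontal connections), via 'same'-padded 2D cross-correlation."""
    k = kernel.shape[0] // 2
    padded = np.pad(h, k)
    out = np.zeros_like(h)
    for i in range(h.shape[0]):
        for j in range(h.shape[1]):
            out[i, j] = np.sum(padded[i:i + kernel.shape[0],
                                      j:j + kernel.shape[1]] * kernel)
    return out

def fgru_step(x, h, w_h, w_gain, w_mix):
    """One recurrent step: suppress the feedforward drive with
    neighborhood context, then gate the result into the hidden state."""
    context = horizontal(h, w_h)                     # contextual drive
    gain = sigmoid(w_gain * context)                 # suppressive gate
    candidate = np.maximum(x - gain * context, 0.0)  # context-suppressed drive
    mix = sigmoid(w_mix * candidate)                 # update gate
    return (1.0 - mix) * h + mix * candidate

rng = np.random.default_rng(0)
x = rng.random((8, 8))        # feedforward drive (e.g., local edge responses)
h = np.zeros_like(x)
w_h = np.ones((3, 3)) / 9.0   # toy horizontal kernel: local averaging
for _ in range(5):            # unroll the recurrence for a few timesteps
    h = fgru_step(x, h, w_h, w_gain=1.0, w_mix=1.0)
```

Unrolling the recurrence for several timesteps is what lets contextual influences such as surround suppression accumulate, which is where illusion-like biases can emerge in this family of models.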