Abstract
It has been proposed that the human brain learns to exploit the statistics of the environment and extract task-relevant features for successful recognition performance. As a result, early visual processing appears to be optimized to encode and represent the statistical properties of sensory inputs. Blur is an important stimulus dimension and one of the most common perceptual consequences of visual impairment. Here we ask how blur alters task-relevant features for face and letter recognition and how the visual system adapts to degraded sensory inputs. To this end, we used deep unsupervised generative models, variational autoencoders (VAEs), trained to reconstruct input images of letters or faces from their latent representations under blur levels ranging from normal vision (20/20 acuity) to severe blur (20/700 acuity). Comparing the early layers of networks trained under different blur levels showed that, on average, receptive field sizes under severe blur increased by up to 60% relative to no blur. Moreover, whereas under no blur the representational dissimilarity matrices showed increasing dissimilarity among different classes of letters or faces (e.g., gender or age) from lower to higher layers, under blurred viewing this increase did not emerge until later stages, suggesting that a larger integration zone is required for recognition under blur. Additionally, even at the latest stage, the relative distances among classes in the latent feature space were reduced for blurred images, which may explain the lower recognition accuracy under blur. Using activation maps, we also identified critical features corresponding to different classes. Our findings help pinpoint the extent to which human object recognition is limited by the information content of the stimulus and identify which task/stimulus attributes are likely to remain invariant under degraded viewing.
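For concreteness, the sketch below illustrates the general training scheme described above: a convolutional VAE reconstructing blurred grayscale inputs from its latent representation, with one network per blur condition. It is a minimal illustration, not the authors' code; the architecture, latent dimensionality, image size, and blur parameters are assumptions chosen only to make the example self-contained.

# Minimal sketch (assumed architecture and blur settings, not the paper's):
# a convolutional VAE trained to reconstruct blurred inputs from its latent code.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as T

class ConvVAE(nn.Module):
    def __init__(self, latent_dim=64):
        super().__init__()
        # Encoder: 1x64x64 grayscale image -> latent mean and log-variance
        self.enc = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # 32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 16x16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(), # 8x8
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 8 * 8, latent_dim)
        self.fc_logvar = nn.Linear(128 * 8 * 8, latent_dim)
        # Decoder: latent vector -> reconstructed image
        self.fc_dec = nn.Linear(latent_dim, 128 * 8 * 8)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        x_hat = self.dec(self.fc_dec(z).view(-1, 128, 8, 8))
        return x_hat, mu, logvar

def vae_loss(x_hat, x, mu, logvar):
    # Reconstruction term plus KL divergence to the standard-normal prior
    recon = F.binary_cross_entropy(x_hat, x, reduction='sum')
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kld

# One blur condition per network: a Gaussian blur whose sigma stands in for
# a given acuity level (kernel size and sigma here are placeholder values).
blur = T.GaussianBlur(kernel_size=21, sigma=4.0)

model = ConvVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(16, 1, 64, 64)          # stand-in for a batch of letter/face images
x_blur = blur(x)                       # degrade the input to the chosen blur level
x_hat, mu, logvar = model(x_blur)      # reconstruct from the latent representation
loss = vae_loss(x_hat, x_blur, mu, logvar)
loss.backward()
opt.step()

Layer activations from networks trained this way could then be compared across blur conditions, for example by estimating receptive field sizes in the early layers or by computing representational dissimilarity matrices over class labels at each stage.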