Abstract
One way to investigate the contribution of cognition to activity in the visual cortex is to hold the retinal input fixed or remove it altogether. There are many such non-optic visual experiences to draw from (e.g., mental imagery, synesthesia, hallucinations), all of which produce brain activity patterns consistent with the visual content of the experience. But how does the visual system manage both to accurately represent the external world and to synthesize visual experiences? We approach this question by expanding on a theory that the human visual system embodies a probabilistic generative model of the visual world. We propose that retinal vision is just one form of inference that this internal model can support, and that activity in visual cortex observed in the absence of retinal stimulation can be interpreted as the most probable consequence unpacked from imagined, remembered, or otherwise assumed causes. When applied to mental imagery, this theory predicts that the encoding of imagined stimuli in low-level visual areas will resemble the encoding of seen stimuli in higher-level areas. We confirmed this prediction by estimating imagery encoding models from brain activity measured while subjects imagined complex visual stimuli in the presence of unchanging retinal input. In a separate fMRI study, we investigated another, far rarer form of non-optic vision: a case subject who, after losing their sight to retinal degeneration, now “sees” objects they touch or hear. The existence of this phenomenon further supports the view that visual perception is a generative process that depends as much on top-down inference as on retinal input.