Abstract
To make efficient use of information and to prevent bottlenecks during visual processing, bottom-up information is triaged by selectively gating image features as they are observed. Here we demonstrate, for the first time, a biologically plausible, information-theoretic model of the visual gating mechanism that works efficiently with natural images. From this model, we give a neurophysiological preview of what image information passes to higher levels of processing. We do this by processing the information in a natural-image Rapid Serial Visual Presentation (RSVP) task according to its spatio-temporal statistical surprise (Einhäuser, Mundhenk, Baldi, Koch & Itti, 2007). We then obtain an attention-gate mask over each RSVP image frame, derived from the map of attentional capture provided by the surprise system. The mask accounts for the degree to which distracter images that precede or follow a target image are able to draw attention away from it, and vice versa. Attention is also accounted for within an image, so that a target must be salient both across frames and within the target image in order to be detected. Additionally, stronger target capture leads to better masking of rival information, which decreases later visual competition. The surprise-based attention gate is validated against the performance of eight observers. We find that 29 unique RSVP targets from 29 different sequences that are easy to detect precisely overlap to a far greater extent with open regions in the attention gate than do 29 unique targets that are difficult to detect (P < .001). This means that when a target is easy to detect, more target regions pass through the attention gate, increasing the availability of relevant features to visual recognition facilities. Additionally, this allows us to surmise which parts of any given image in an RSVP task can plausibly be detected, since regions that are gated out at this stage cannot be processed any further.
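
As a rough illustration of the validation idea, the overlap between a target and the open regions of the attention gate could be scored as the mean gate openness over target pixels. This is only a minimal sketch under stated assumptions, not the measure used in the study: the function name `gate_overlap`, the convention that gate values lie in [0, 1] with 1 meaning fully open, and the binary target mask are all hypothetical choices made for the example.

```python
def gate_overlap(gate, target):
    """Mean gate openness over target pixels (hypothetical overlap score).

    gate   -- 2D list of floats in [0, 1]; 1.0 means fully open (assumed convention)
    target -- 2D list of bools with the same shape; True marks target pixels
    """
    total = 0.0
    count = 0
    for gate_row, target_row in zip(gate, target):
        for openness, is_target in zip(gate_row, target_row):
            if is_target:
                total += openness
                count += 1
    if count == 0:
        raise ValueError("target mask contains no target pixels")
    return total / count


# Toy 2x3 frame: the target occupies the two right-hand columns.
gate = [[0.0, 1.0, 1.0],
        [0.0, 0.5, 0.0]]
target = [[False, True, True],
          [False, True, True]]
overlap = gate_overlap(gate, target)
```

Under this assumed score, an easy-to-detect target would tend toward values near 1 (most target regions pass the gate), while a hard-to-detect target would tend toward values near 0.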