Abstract
One of the primary goals of the visual system is to extract statistical regularities from the environment to build a robust representation of the world. Recent research on visual statistical learning (VSL) has demonstrated that observers can implicitly extract joint probabilities between objects during streams of visual stimuli (Fiser & Aslin, 2002). Typically, these VSL studies use the same stimuli throughout training and test. In the real world, though, temporal predictability exists at both the exemplar and the categorical level: whatever office you are in, the probability that you will step out into a zoo is much lower than the probability that you will enter a corridor. Here, we tested the extent to which people can learn categorical temporal regularities based on the gist of natural scenes.
Observers performed a one-back task while viewing a 7-minute familiarization sequence of pictures drawn from 12 scene categories (mountain, kitchen, street, etc.). Other than the one-back repeats, no individual picture was ever repeated. During the stream, the joint probability between triplets of scene categories was manipulated (for instance, a bathroom would always precede a mountain, which would always precede a forest). After familiarization, observers completed a series of 2AFC familiarity judgments between triplets of novel pictures that either maintained or violated the temporal regularities present during the learning phase. Results showed that observers more often chose the sets of pictures whose categories had predictably followed one another. Importantly, nothing about scene category was mentioned to the observers, and none of the tasks required extracting the semantic category of the scenes. The results suggest that the gist of a scene is automatically extracted even when it is not task-relevant, and that implicit statistical learning can occur at a level as abstract as the conceptual gist representation.
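
To make the structure of the familiarization stream concrete, the Python sketch below shows one way such a sequence could be generated: the 12 categories are grouped into fixed triplets (e.g., bathroom -> mountain -> forest, as in the example above), the triplets are concatenated in random order, each category slot is filled with a never-repeated exemplar, and occasional immediate repeats provide the one-back cover task. The parameter values, file names, and helper functions are illustrative assumptions, not the exact design used in the experiment.

    import random

    # Illustrative parameters (assumptions, not the experiment's exact values).
    CATEGORIES = ["bathroom", "mountain", "forest", "kitchen", "street", "beach",
                  "office", "corridor", "desert", "bedroom", "field", "harbor"]
    N_EXEMPLARS = 60            # novel pictures available per category
    N_TRIPLET_REPETITIONS = 20  # how many times each triplet appears in the stream
    ONE_BACK_PROB = 0.1         # chance of an immediate repeat (one-back cover task)

    def build_triplets(categories):
        """Group the 12 categories into 4 fixed triplets, e.g. bathroom -> mountain -> forest."""
        cats = categories[:]
        random.shuffle(cats)
        return [tuple(cats[i:i + 3]) for i in range(0, len(cats), 3)]

    def build_stream(triplets):
        """Concatenate triplets in random order; fill each slot with a never-repeated exemplar."""
        exemplar_pools = {c: [f"{c}_{i:03d}.jpg" for i in range(N_EXEMPLARS)]
                          for c in CATEGORIES}
        for pool in exemplar_pools.values():
            random.shuffle(pool)

        order = triplets * N_TRIPLET_REPETITIONS
        random.shuffle(order)

        stream = []
        for triplet in order:
            for category in triplet:
                picture = exemplar_pools[category].pop()  # each picture used only once
                stream.append(picture)
                # Occasionally repeat the picture immediately: the one-back task target.
                if random.random() < ONE_BACK_PROB:
                    stream.append(picture)
        return stream

    if __name__ == "__main__":
        triplets = build_triplets(CATEGORIES)
        stream = build_stream(triplets)
        print("Category triplets:", triplets)
        print("First 12 items:", stream[:12])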