Abstract
What drives image memorability? Previous work has focused almost exclusively either on specific dimensions of the images themselves or on their local categories. This work has typically involved semantic properties (e.g., whether an image contains a human, or whether it is an instance of a forest), but we have also demonstrated visual memorability: even when semantic content is eliminated (e.g., via phase-scrambling), some images are still consistently more likely to be retained in short-term memory. Beyond individual feature dimensions and image categories, here we ask whether the memorability of an image is also influenced by its distinctiveness in a much broader multidimensional feature space. We first measured distinctiveness behaviorally, by calculating the conceptual and perceptual distinctiveness of images based on pairwise similarity judgments of each type. We then also measured distinctiveness computationally, by calculating the average distance of each target image to all other images in a ~10,000-image database, at different layers of a CNN trained to recognize scenes and objects (VGG16-Hybrid1365). For intact vs. scrambled images, we observed opposite patterns of correlations between distinctiveness and short-term memorability. For intact images, short-term memorability was primarily a function of how conceptually distinct an image was. Strikingly, this was mirrored in the CNN analysis: distinctiveness at later (but not earlier) layers predicted memorability. For scrambled images, in contrast, the reverse was true: short-term memorability was primarily a function of perceptual distinctiveness, and distinctiveness at earlier (but not later) layers predicted memorability. Collectively, these results suggest that memorability is a function of distinctiveness in a multidimensional image space, with some images being memorable because of their conceptual features, and others because of their perceptual features.
Moreover, because distinctiveness in the CNN was computed over all images in the much larger database, the relevant measure of distinctiveness may reflect not just the local statistics of an experimental image set, but also the broader statistics of our natural environment as a whole.
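The computational distinctiveness measure described above (the average distance from each image to all other images, computed separately at each CNN layer) can be illustrated with a minimal sketch. This is an illustration under stated assumptions, not the analysis code used in this work: the abstract does not specify the distance metric, so Euclidean distance is assumed here, and the function name `mean_distance_to_others` and the toy feature matrix are hypothetical. In practice, `features` would hold one row of layer activations per database image.

```python
import numpy as np


def mean_distance_to_others(features: np.ndarray) -> np.ndarray:
    """Distinctiveness of each image at one layer.

    features: array of shape (n_images, n_dims), one row of activations
    per image. Returns an array of shape (n_images,) holding each image's
    mean Euclidean distance to every OTHER image (assumed metric).
    """
    x = np.asarray(features, dtype=np.float64)
    n = x.shape[0]
    if n < 2:
        raise ValueError("need at least two images to compute distinctiveness")

    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(x * x, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)

    # Clip tiny negative values caused by floating-point round-off,
    # and force the self-distances on the diagonal to exactly zero.
    np.maximum(d2, 0.0, out=d2)
    np.fill_diagonal(d2, 0.0)

    # Average over the n - 1 other images (self-distance contributes 0).
    return np.sqrt(d2).sum(axis=1) / (n - 1)


if __name__ == "__main__":
    # Toy example: three collinear "images" in a 2-D feature space.
    # The middle point is closest to the others, so it is least distinctive.
    toy = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
    print(mean_distance_to_others(toy))
```

Running the same function on activations from an early layer and from a late layer would yield two distinctiveness scores per image, each of which could then be correlated with measured memorability. Note that the full pairwise matrix for roughly 10,000 images occupies about 800 MB in double precision, so a chunked computation may be preferable on memory-limited machines.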
Acknowledgement: SRY was supported by an NSF Graduate Research Fellowship. BJS was supported by ONR MURI #N00014-16-1-2007.