Journal of Vision
September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2021
Embracing New Techniques in Deep Learning for Predicting Image Memorability
Author Affiliations
  • Coen Needell
    University of Chicago
  • Wilma Bainbridge
    University of Chicago
Journal of Vision September 2021, Vol.21, 1921. doi:https://doi.org/10.1167/jov.21.9.1921
© ARVO (1962-2015); The Authors (2016-present)
Abstract

Prior work has suggested that the memorability of an image is consistent across people and can therefore be treated as an intrinsic property of the image. Using computer vision models, we can make specific predictions about what people will remember or forget. While older work used a now-outdated deep learning architecture to predict image memorability, innovations in the field have since given us new techniques to apply to this problem. Here, we propose and evaluate five deep learning models as alternatives to MemNet that exploit developments in the field from the last five years, chiefly the introduction of residual neural networks. We also evaluate the pre-existing implementation of MemNet on a broader set of images. The five new models, which differ in architecture, were implemented and tested on a mixture of MemNet’s original training set, LaMem, and a recent dataset, MemCat. LaMem is a large database of objects and scenes, many of which are designed to have high memorability. MemCat complements this with a large number of exemplars in object categories. The new models all use residual neural networks in their feature extraction stages. These networks, whose skip connections are intended to mimic the structure of pyramidal cells, allow the model to use semantic information in the memorability estimation process. The most complex model also uses semantic segmentation, which assigns a semantic category to each pixel. Our findings suggest that the original paper overstated MemNet’s generalizability and that MemNet was likely overfitting on LaMem. Our new models outperform MemNet and achieve similar scores to one another, but when the models are allowed to retrain, the semantic-segmentation-based model outperforms the rest. We conclude that residual networks outperform simpler convolutional neural networks in memorability regression, which will in turn improve memory researchers’ ability to make predictions about memorability on a wider range of images.
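To make the skip-connection idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a toy residual block built from two dense layers rather than the convolutional layers a real residual network would use, and it assumes a sigmoid regression head that maps features to a score between 0 and 1. The function names (`linear`, `relu`, `residual_block`, `memorability_head`) are hypothetical and chosen only for illustration.

```python
import math


def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def residual_block(x, w1, b1, w2, b2):
    """Toy residual block: y = relu(x + F(x)).

    F is two dense layers with a ReLU in between (an assumption made
    for brevity; real residual networks use convolutions). The skip
    connection adds the unchanged input x back onto F(x).
    """
    fx = linear(relu(linear(x, w1, b1)), w2, b2)
    return relu([xi + fi for xi, fi in zip(x, fx)])


def memorability_head(features, w, b):
    """Hypothetical regression head: weighted sum squashed by a sigmoid."""
    z = sum(wi * fi for wi, fi in zip(w, features)) + b
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    x = [0.5, 1.0, 2.0]
    zeros_w = [[0.0] * 3 for _ in range(3)]
    zeros_b = [0.0] * 3

    # With F(x) = 0 the block passes a non-negative input through unchanged.
    features = residual_block(x, zeros_w, zeros_b, zeros_w, zeros_b)
    print(features)

    # Untrained head (all-zero weights) gives the sigmoid midpoint.
    print(memorability_head(features, [0.0, 0.0, 0.0], 0.0))
```

The design point this illustrates is that a residual block only has to learn a correction on top of its input: when the learned branch contributes nothing, the input passes through the skip connection unchanged.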
