Abstract
Visual complexity is important for many aspects of perception, yet there is no consensus on how best to measure it objectively. Here we propose two computational methods for measuring the perceived complexity of an image. The first method is based on Multiscale Entropy (MSE), which has mainly been applied to time series. We decomposed each image into layers at various spatial scales and computed the Shannon entropy at each scale. The second method is based on PNG file size with rotations. We computed the PNG file size at each 90° rotation of an image and took the minimum, since the original PNG encoding does not account for the spatial dimensions of the image. We collected subjective complexity ratings on Amazon Mechanical Turk for 6929 images from BOLD5000 and SUN, and also included the clutter image set from Talia Konkle’s lab. Both the MSE method (Pearson’s r: BOLD5000 = 0.17; Scenes = 0.25; Clutter = 0.42) and the PNG method (Pearson’s r: BOLD5000 = 0.24; Scenes = 0.37; Clutter = 0.63) are positively correlated with human ratings. The PNG method with image rotation correlates more strongly with human ratings than PNG file size without rotation (Pearson’s r: BOLD5000 = 0.20; Scenes = 0.35; Clutter = 0.47); this improvement is significant for the clutter image set but not for BOLD5000 or Scenes. Furthermore, we performed a model-based regression analysis of the BOLD5000 fMRI data with human ratings, the MSE method, and the PNG method as regressors. The computational models are associated with activation in V1 to V4, whereas human ratings are associated with activation in PPA and RSC. This finding supports the view that our algorithms capture only low-level image features, and illustrates that higher-level information should be included in models to better match human ratings of perceived image complexity.
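As a rough illustration of the rotation-based PNG measure described above, the following Python sketch encodes a grayscale image as a PNG at each of its four 90° rotations and returns the smallest file size. It is not the implementation used in this work. The function names (`png_bytes`, `rotate90`, `min_rotated_png_size`) are hypothetical, and the minimal encoder is an assumption made to keep the example dependency-free: it uses only the Python standard library, writes 8-bit grayscale with filter type 0 on every scanline, and compresses at zlib level 9. Production PNG encoders usually choose filters adaptively, so absolute sizes will differ from those of a standard image library.

```python
import struct
import zlib


def png_bytes(pixels):
    """Encode an 8-bit grayscale image (list of rows of ints 0..255) as PNG bytes."""
    height, width = len(pixels), len(pixels[0])

    def chunk(tag, data):
        # A PNG chunk is: 4-byte length, 4-byte type, data, CRC-32 of type + data.
        crc = zlib.crc32(tag + data) & 0xFFFFFFFF
        return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

    # IHDR: width, height, bit depth 8, colour type 0 (grayscale),
    # compression method 0, filter method 0, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)

    # Assumption: every scanline uses filter type 0 (None).
    raw = b"".join(b"\x00" + bytes(row) for row in pixels)

    return (
        b"\x89PNG\r\n\x1a\n"
        + chunk(b"IHDR", ihdr)
        + chunk(b"IDAT", zlib.compress(raw, 9))
        + chunk(b"IEND", b"")
    )


def rotate90(pixels):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def min_rotated_png_size(pixels):
    """Return (minimum size, list of sizes) over the 0, 90, 180, 270 degree rotations."""
    sizes = []
    current = pixels
    for _ in range(4):
        sizes.append(len(png_bytes(current)))
        current = rotate90(current)
    return min(sizes), sizes
```

Because the scanlines are compressed in row order, the same image can compress to a different size when its rows become columns, which is why the measure takes the minimum over rotations rather than the size of a single orientation.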