Abstract
Recent studies propose that mid-level features could underlie animacy perception. Here we tested whether the ability to estimate ensemble summary statistics for animacy is based on high-level or mid-level features. We used four types of animate and inanimate images (Long & Konkle, 2018): color images, greyscale images, silhouettes (containing only recognizable shapes), and texforms (unrecognizable images that preserve mid-level texture and shape information). In Exp. 1, we asked participants to rate the animacy of single images and of sets of eight images of one type on a 10-point scale. In Exp. 2, two sets of eight images of the same type were shown on the left and right sides of the screen, and participants had to choose the more animate set (2AFC task). We manipulated animacy by varying the number of animate images in the set (Exp. 1) or the number of animate images by which the two sets differed (Exp. 2). In Exp. 1, we found strong correlations between animacy ratings for sets and the number of animate objects in the set for all image types. In Exp. 2, we found that even when the two sets differed by only one image, the percentage of correct answers was higher than the guess rate. Despite weaker correlations for texforms in Exp. 1 and a lower percentage of correct answers for texforms in Exp. 2, participants were still able to report the mean animacy of the set. Participants also categorized individual texforms as animate less confidently. We assume that the generally poorer results for texforms in all tasks reflect the noisier nature of texforms compared with the other image types. Thus, we suggest that the ensemble representation of animacy can be explained not only by high-level features but also by mid-level features, such as shape and texture.