Abstract
We study how the subjective perception of symmetry can be computationally explained by features at different levels. We select 149 images with varying degrees of symmetry from photographs and movie frames and collect responses from 200 subjects. Each subject is shown 50 random images and asked to rate each with one of four options: Not symmetric, Somewhat symmetric, Symmetric, and Highly symmetric. We measure the bilateral symmetry of an image by comparing CNN features at multiple levels between its two vertical halves. We use the AlexNet model pre-trained on the ImageNet dataset to extract feature maps at all five convolutional layers. The extracted feature maps of the two bilateral halves are then compared to one another at different layers and spatial scales. The degree of similarity across these feature maps serves to model the degree of symmetry an image is perceived to have. We train a multiclass SVM classifier to predict one of the four symmetry judgements from these multi-level CNN symmetry scores. Our symmetry classifier has very low accuracy when a single model must predict all observers' responses to individual images. However, classification accuracy increases dramatically when each observer is modeled separately. Our results suggest that symmetry is in fact in the eye of the beholder: while some observers focus on high-level object semantics, others rely on low- or mid-level features in their symmetry assessment.
Meeting abstract presented at VSS 2017
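The pipeline of multi-level symmetry scores followed by a per-observer classifier could be sketched as follows. This is a minimal illustration under our own assumptions, not the study's actual code: it assumes torchvision's pre-trained AlexNet and scikit-learn's SVC, and the layer indices, cosine-similarity measure, and function names are illustrative choices.

    # Sketch of multi-level CNN symmetry scores (not the authors' code).
    import torch
    import torch.nn.functional as F
    from torchvision import models
    from sklearn.svm import SVC

    alexnet = models.alexnet(
        weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()
    CONV_IDX = [0, 3, 6, 8, 10]  # the 5 conv layers in alexnet.features

    def symmetry_scores(image):
        """image: (1, 3, H, W) tensor; returns one score per conv layer."""
        scores = []
        x = image
        with torch.no_grad():
            for i, layer in enumerate(alexnet.features):
                x = layer(x)
                if i in CONV_IDX:
                    w = x.shape[-1]
                    left = x[..., : w // 2]
                    # Mirror the right half so spatially corresponding
                    # locations of the two bilateral halves line up.
                    right = torch.flip(x[..., -(w // 2):], dims=[-1])
                    scores.append(F.cosine_similarity(
                        left.flatten(), right.flatten(), dim=0).item())
        return scores  # one multi-level symmetry descriptor per image

    def fit_observer_model(X, y):
        """X: score vectors for the ~50 images one observer rated;
        y: that observer's four-level ratings (0..3). SVC handles the
        multiclass case natively, giving one model per observer."""
        return SVC(kernel="rbf").fit(X, y)

Fitting one such model per observer, rather than a single model over all responses, mirrors the per-observer analysis described in the abstract; the per-layer scores let the classifier weight low-, mid-, and high-level similarity differently for each observer.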