Abstract
Machine learning algorithms can now rival human categorization performance. Such performance typically relies on training with millions of images, with little to no effort to optimize the exemplars. Yet humans consider some images more representative of their category than others. Are images that humans rate as highly representative better training examples than those they rate as less representative? To test this hypothesis, we adapted the method of Nartker et al. (2020), in which each pair of exemplars from two of three categories (beaches, cities, highways) was used to train one support vector machine (SVM), and this SVM was tested on the remaining (left-out) exemplars from the same two categories. The images had been rated by humans for how representative they were of their category (Torralbo et al., 2013). Each category contained the 60 most highly rated (good exemplars) and the 60 least highly rated (bad exemplars) images. There were modest but significant differences in overall classification of the test images depending on whether the one-shot SVMs were trained on two good exemplars (54%) or two bad exemplars (51%), t(21597) = 41.9, p < .001. However, the differences were more dramatic for the best-performing (66%) and worst-performing (36%) pairs. Across all category pairings, 98% of the 60 best-performing pairs included at least one good exemplar, whereas 78% of the 60 worst-performing pairs included at least one bad exemplar. The best-performing pairs correctly classified not only more good exemplars (36%) but also more bad exemplars (30%) than the worst-performing pairs did (15%, t(314) = 79.3, p < .001, and 21%, t(355) = 64.1, p < .001, respectively). This suggests that training on good exemplars yields better classification across the board, rather than merely reflecting similarities among the good exemplars. Overall, these data suggest that human-curated, appropriately chosen exemplars can improve even the simplest machine learning protocols.
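The pairwise one-shot SVM protocol described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the feature vectors here are synthetic stand-ins (Gaussians with class-dependent means), the counts are reduced from the 120 images per category used in the study, and the image features, ratings, and category labels are assumed.

```python
# Sketch of a pairwise one-shot SVM protocol: train a linear SVM on
# exactly one exemplar from each of two categories, then test it on
# all left-out exemplars from those same two categories.
# NOTE: data here is synthetic; the study used rated scene images.
from itertools import product

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_per_class, n_features = 10, 20  # reduced for illustration

# Two hypothetical categories, e.g. "beaches" (0) vs "cities" (1).
X0 = rng.normal(0.0, 1.0, (n_per_class, n_features))
X1 = rng.normal(1.0, 1.0, (n_per_class, n_features))

accuracies = []
for i, j in product(range(n_per_class), range(n_per_class)):
    # Train on a single exemplar pair (one image per category)...
    X_train = np.vstack([X0[i], X1[j]])
    y_train = np.array([0, 1])
    clf = SVC(kernel="linear").fit(X_train, y_train)
    # ...and test on the remaining (left-out) exemplars.
    X_test = np.vstack([np.delete(X0, i, axis=0),
                        np.delete(X1, j, axis=0)])
    y_test = np.array([0] * (n_per_class - 1) + [1] * (n_per_class - 1))
    accuracies.append(clf.score(X_test, y_test))

# Per-pair accuracies allow ranking pairs from best- to worst-performing,
# as in the comparison of top and bottom pairs reported above.
mean_acc = float(np.mean(accuracies))
```

Sorting `accuracies` then identifies which training pairs generalize best, which is the quantity compared against the human representativeness ratings.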