Abstract
Inspired by recent computer vision models for object recognition in clutter, we are developing a model of human object recognition based on local, distinctive fragments. The first stage of such models typically involves the selection of a large pool of potential image fragments using an interest point detector. In subsequent stages, this large pool is reduced to a smaller set of distinctive fragments. In developing a model for humans, our first step has been to determine whether the pool of fragments selected by the most common interest point detector, the Harris Detector (HD), includes the fragments humans find distinctive. Our test images were randomly rotated photographs of 12 common tools. We applied the HD to these images and collected fragments with a wide range of interest ratings. The scale of the HD determined the size of the fragments (8-pixel radius, 1–2% of the whole object). These fragments were then used as the stimuli in a recognition experiment. After a brief training period with whole tools, observers identified the tool fragments. Overall, observers were remarkably good at recognizing these tiny fragments. We then compared the recognition results with the interest ratings of the HD. Many fragments that were recognizable to observers were not given high interest ratings by the HD, which responds best to locations with large luminance gradients in multiple directions (e.g., corners). Observers recognized such corner-like fragments, but they also recognized fragments with subtle or one-dimensional gradients.
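To illustrate why a Harris-style detector favors corners over fragments with one-dimensional gradients, the following is a minimal sketch of the standard Harris response, R = det(M) - k * trace(M)^2, where M is the structure tensor of luminance gradients summed over a local window. It is not the implementation used in the study: the function name `harris_response`, the box-shaped window, the central-difference gradients, the window `radius`, and the constant `k = 0.05` are all assumptions chosen for illustration.

```python
def harris_response(img, radius=1, k=0.05):
    """Harris-style interest map for a 2-D list of luminance values.

    Assumed for illustration: box window of the given radius,
    central-difference gradients, and k = 0.05.
    """
    h, w = len(img), len(img[0])
    # Luminance gradients by central differences (left at zero on the border).
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            # Structure tensor entries summed over the local window.
            a = b = c = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    a += gx * gx
                    b += gx * gy
                    c += gy * gy
            # Large only when gradients are strong in more than one direction.
            resp[y][x] = (a * c - b * b) - k * (a + c) ** 2
    return resp


# Toy image: a bright square occupying the lower-right quadrant.
img = [[1.0 if (y >= 6 and x >= 6) else 0.0 for x in range(12)]
       for y in range(12)]
resp = harris_response(img)

corner = resp[6][6]  # corner of the square: gradients in two directions
edge = resp[6][9]    # straight edge: gradient in one direction only
flat = resp[2][2]    # uniform region: no gradient
print(corner > 0, edge < 0, flat == 0.0)
```

On this toy image the response is positive at the corner, negative along the straight edge, and zero in the uniform region. A detector that ranks fragments by this score would therefore rate an edge-only fragment as uninteresting, regardless of whether a human observer could recognize it.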