Gregory Zelinsky, Yifan Peng, Alexander Berg, Dimitris Samaras; Modeling Guidance and Recognition in Categorical Search: Bridging Human and Computer Object Detection. Journal of Vision 2012;12(9):957. doi: https://doi.org/10.1167/12.9.957.
© ARVO (1962-2015); The Authors (2016-present)
Search consists of a guidance process that compares a target representation to blurred patterns in the visual periphery to generate a guidance signal, and a recognition process that verifies targets or rejects distractors, usually after fixation by gaze. Do these component search tasks use the same visual features? We addressed this question by training several SVM-based classifiers to describe both behaviors. Observers performed a present/absent categorical search for a teddy bear target in four-object arrays. Target-absent trials consisted of random category objects ranked as visually target-similar, target-dissimilar, or medium-similarity, as described in Alexander & Zelinsky (2011, JOV). Accuracy was high on target-present (95.3%) and target-absent (97.9%) trials, and guidance was quantified as the object first fixated during search. First fixations were most common on targets (79.4%), followed by target-similar (65.5%), medium (12.4%), and target-dissimilar (5.7%) distractors. Bear/non-bear classifiers were trained using features ranging in biological plausibility (V1, C2, SIFT-BOW, SIFT-SPM), with each feature tested separately and in combination with color. Training and testing were done on non-blurred and blurred versions of objects in separate conditions. Objects were blurred using TAM (Zelinsky, 2008, Psychological Review), which approximated an object's appearance in peripheral vision before the initial eye movement. Accuracy and guidance patterns were modeled almost perfectly by an SVM using C2 and color features; a simple and biologically plausible feature outperformed state-of-the-art computer vision features in a head-to-head comparison. These results were obtained by training on non-blurred objects and testing on blurred objects (pre-fixation viewing conditions existing at guidance) and non-blurred objects (post-fixation viewing conditions existing at recognition); training on blurred objects resulted in worse guidance fits.
Moreover, adding color to the C2 feature improved fits to both guidance and recognition, although the SVM used color more for guidance. These findings suggest that, despite different viewing conditions, search guidance and recognition may use the same features, albeit with different weightings, and may therefore be more related than previously believed.
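The train/test protocol described above (train on non-blurred objects, then test on both blurred and non-blurred versions) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic feature vectors stand in for real C2 and color features, the 3-tap moving average in `blur` is a crude substitute for TAM, and the Pegasos-style linear SVM in `train_linear_svm` is not the classifier used in the study. All function and variable names are hypothetical.

```python
import random

def make_object(label, rng, n_features=12, noise=0.3):
    """Toy feature vector: a +1/-1 step pattern for targets, mirrored for non-targets."""
    half = n_features // 2
    prototype = [1.0] * half + [-1.0] * (n_features - half)
    return [label * p + rng.gauss(0.0, noise) for p in prototype]

def blur(features):
    """3-tap moving average with clamped edges (assumed stand-in for peripheral blur)."""
    last = len(features) - 1
    return [
        (features[max(i - 1, 0)] + features[i] + features[min(i + 1, last)]) / 3.0
        for i in range(len(features))
    ]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(samples, labels, rng, lam=0.01, steps=2000):
    """Pegasos-style stochastic subgradient descent on the hinge loss.

    No bias term is used because the toy classes are symmetric about the origin.
    """
    w = [0.0] * len(samples[0])
    for t in range(1, steps + 1):
        i = rng.randrange(len(samples))
        x, y = samples[i], labels[i]
        eta = 1.0 / (lam * t)
        violated = y * dot(w, x) < 1.0  # margin check uses the weights before the update
        w = [(1.0 - eta * lam) * wi for wi in w]  # regularization shrink
        if violated:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def accuracy(w, samples, labels):
    correct = sum(1 for x, y in zip(samples, labels) if (dot(w, x) >= 0) == (y > 0))
    return correct / len(samples)

rng = random.Random(0)

# Train on non-blurred ("post-fixation") objects only.
train_labels = [1 if i % 2 == 0 else -1 for i in range(200)]
train_sharp = [make_object(y, rng) for y in train_labels]

# Test on non-blurred objects (recognition) and on blurred copies (guidance).
test_labels = [1 if i % 2 == 0 else -1 for i in range(100)]
test_sharp = [make_object(y, rng) for y in test_labels]
test_blurred = [blur(x) for x in test_sharp]

w = train_linear_svm(train_sharp, train_labels, rng)
acc_recognition = accuracy(w, test_sharp, test_labels)
acc_guidance = accuracy(w, test_blurred, test_labels)
print(f"non-blurred test accuracy: {acc_recognition:.2f}")
print(f"blurred test accuracy:     {acc_guidance:.2f}")
```

In this toy setup the class difference is low-frequency, so it survives the blur and a single set of weights serves both test conditions. The sketch shows only the structure of the comparison and says nothing about which real features would behave this way.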
Meeting abstract presented at VSS 2012