Aditya Jonnalagadda, Arturo Deza, Miguel Eckstein; A foveated object detector that misses giant and misplaced targets in scenes. Journal of Vision 2018;18(10):3. doi: https://doi.org/10.1167/18.10.3.
© ARVO (1962-2015); The Authors (2016-present)
Introduction: Scene context influences human eye movements and search performance (Chen & Zelinsky, 2006). Models have used contextual information to predict human eye movements in real scenes (Torralba et al., 2006; Eckstein et al., 2006) or to improve computer vision (Choi et al., 2012), but such models are not foveated and do not explore the scene with eye movements to generate perceptual decisions. Here, we propose a foveated object detector (Akbas & Eckstein, 2017) that uses object relationships to search for targets in real scenes, and we show that it exhibits a number of classic and newly reported effects of context on human eye movements and perceptual decisions.
Methods: Humans and the object detector searched for a computer mouse placed at different locations on desks with distracting objects (50% target presence). The object detector used a foveated visual field (Freeman & Simoncelli, 2011) and retino-specific classifiers, and it executed eye movements to the most likely target location (maximum a posteriori probability with inhibition of return). The model used context by estimating, from a separate training data set, conditional probabilities of the size and location of the mouse relative to other objects in the scene, and it incorporated that information to guide eye movements and reach decisions.
Results: Both humans and the foveated object detector showed similar effects: (a) targets with an inconsistent spatial scale or at atypical locations within the scene were missed more often and foveated later; (b) distractor objects (e.g., a cell phone) were foveated and misclassified as the target more often when placed at the expected location of the computer mouse. A model that did not use context showed no human-like context effects.
Conclusions: A foveated object detector with a probabilistic model of object relationships can capture contextual effects on human search in real scenes without invoking a limited-resource covert attentional mechanism.
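The fixation rule described in the Methods (maximum a posteriori target location with inhibition of return, informed by a context prior) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the grid representation, the function names (`posterior_map`, `next_fixation`, `scanpath`), and the `ior_radius` parameter are hypothetical, and the sketch assumes detector evidence stays fixed across fixations, whereas a foveated model would update it after each eye movement.

```python
# Illustrative sketch only: all names and parameters are hypothetical,
# not taken from the authors' model.

def normalize(grid):
    """Scale a 2D grid of non-negative weights so that it sums to 1."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        raise ValueError("grid must contain positive mass")
    return [[value / total for value in row] for row in grid]


def posterior_map(likelihood, context_prior):
    """Combine detector evidence with a context prior, cell by cell."""
    product = [
        [lik * pri for lik, pri in zip(lik_row, pri_row)]
        for lik_row, pri_row in zip(likelihood, context_prior)
    ]
    return normalize(product)


def next_fixation(posterior, visited, ior_radius=1):
    """Return the MAP cell, skipping cells close to earlier fixations."""
    best, best_value = None, -1.0
    for r, row in enumerate(posterior):
        for c, value in enumerate(row):
            # Inhibition of return: ignore a square window around each
            # previously fixated cell (an assumed, simplified form).
            inhibited = any(
                abs(r - vr) <= ior_radius and abs(c - vc) <= ior_radius
                for vr, vc in visited
            )
            if not inhibited and value > best_value:
                best, best_value = (r, c), value
    return best


def scanpath(likelihood, context_prior, n_fixations=3, ior_radius=1):
    """Generate a sequence of fixations from fixed evidence and prior."""
    posterior = posterior_map(likelihood, context_prior)
    visited = []
    for _ in range(n_fixations):
        fixation = next_fixation(posterior, visited, ior_radius)
        if fixation is None:  # every cell is inhibited
            break
        visited.append(fixation)
    return visited
```

In this toy setting, a prior concentrated on the expected mouse location pulls the first fixation toward that location even when detector evidence is stronger elsewhere, which is the kind of context effect the abstract reports for misplaced targets and well-placed distractors.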
Meeting abstract presented at VSS 2018