Abstract
Introduction: Clutter models (Feature Congestion, FC, Rosenholtz et al., 2005; Edge Density, ED, Mack & Oliva, 2004; ProtoObject Segmentation, PS, Yu et al., 2014) aim to generate a score from an image that correlates with performance in a visual task such as search. However, previous metrics do not account for the interaction between the influence of clutter and the foveated nature of the human visual system. Here we incorporate foveated architectures into standard clutter models (Deza & Eckstein, 2016) and assess their ability, relative to unfoveated clutter metrics, to predict search performance involving multiple eye movements across images with varying clutter. Methods: Observers (n = 8) freely searched for a small target (a person; yes/no task) in scenes with varying levels of clutter and a 50% probability of target presence. Data Analysis: We correlated each image's clutter score with the time to foveate the target (within a 2 degree radius of the target center). Results: We find that Feature Congestion (r = 0.45 vs. r_Fov = 0.72, p < 0.05) and Edge Density (r = 0.38 vs. r_Fov = 0.87, p < 0.05) benefit from the inclusion of a foveated (Fov) architecture. ProtoObject Segmentation does not show such an improvement; its foveated version performs worse (r = 0.76 vs. r_Fov = 0.38), yet the unfoveated ProtoObject Segmentation model correlates with human foveation time as highly as the foveated versions of the other models. The dissociation in results across FC, ED, and PS can be explained by differences across models in the spatial density of their representations: ProtoObject Segmentation has spatially coarse intermediate representations, so the spatial pooling associated with a foveated architecture has little effect. Conclusion: Models with spatially dense representation pipelines can benefit from a foveated architecture when computing clutter metrics to predict the time to foveate a target during search in complex scenes.
Meeting abstract presented at VSS 2017
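
The analysis described above reduces to two steps: applying eccentricity-dependent pooling to a dense clutter map around a fixation point to obtain a "foveated" clutter score, and correlating the per-image scores with times to foveate the target. The Python sketch below illustrates that idea only; it is not the authors' implementation, and the function names, parameter values (px_per_deg, pooling_slope), and the synthetic clutter maps and foveation times are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): pool a precomputed clutter map with a
# pooling width that grows with eccentricity from fixation, then correlate the
# resulting per-image scores with times to foveate the target.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.stats import pearsonr

def foveated_clutter_score(clutter_map, fixation, px_per_deg=20.0, pooling_slope=0.5):
    """Return a scalar clutter score after eccentricity-dependent pooling."""
    h, w = clutter_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    ecc_deg = np.hypot(ys - fixation[0], xs - fixation[1]) / px_per_deg

    # Precompute the map at a few blur levels (sigmas in degrees, assumed values).
    sigmas_deg = np.array([0.25, 0.5, 1.0, 2.0, 4.0])
    blurred = np.stack([gaussian_filter(clutter_map, s * px_per_deg) for s in sigmas_deg])

    # Each pixel gets the blur level closest to (pooling_slope * its eccentricity).
    target_sigma = pooling_slope * ecc_deg
    idx = np.abs(sigmas_deg[:, None, None] - target_sigma[None]).argmin(axis=0)
    foveated_map = np.take_along_axis(blurred, idx[None], axis=0)[0]
    return foveated_map.mean()

# Example with synthetic stand-ins for the clutter maps (FC/ED/PS output) and
# the measured times to foveate the target.
rng = np.random.default_rng(0)
clutter_maps = [rng.random((240, 320)) for _ in range(10)]
times_to_foveate = rng.random(10) * 2.0  # seconds (synthetic)
scores = [foveated_clutter_score(m, fixation=(120, 160)) for m in clutter_maps]
r, p = pearsonr(scores, times_to_foveate)
print(f"r = {r:.2f}, p = {p:.3f}")
```

In this sketch the unfoveated score would simply be `clutter_map.mean()`, so the comparison between r and r_Fov amounts to running the same correlation with and without the eccentricity-dependent pooling step.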