Abstract
Perceptual grouping and selection are fundamental properties of visual perception, but their mechanisms remain poorly understood. Francis et al. (2017) and Kon and Francis (2020) proposed a neural network model that implements these properties. According to the model, a subject uses a particular grouping strategy that promotes performance on a given task and stimulus set. A grouping strategy consists of a connection strategy for connecting elements in a scene and a selection strategy that specifies the timing, location, and size of attentional spotlights. Building on this work, we apply the model to a visual enumeration task from Trick and Enns (1997). On each trial, the task was to report the number of diamonds in an array that contained 1 to 8 diamonds and 0, 4, or 8 square distractors. The shapes were either drawn with lines or indicated by dots positioned at the shape corners. In the dot-shape condition, participants could subitize (i.e., enumerate rapidly and preattentively) only when there were 1 to 3 targets and no distractors. Trick and Enns (1997) concluded that element clustering, i.e., the process of linking some elements (and not others) into units, is distinct from shape formation, i.e., the process of determining cluster shape. We modeled this task and its results by identifying grouping strategies that closely match human performance in all distractor and shape conditions. Interestingly, the relatively flat slopes of mean response time for conditions with 1 to 3 targets and no distractors, which are regarded as indicative of subitizing, can be produced by this model even though it lacks a formal subitizing process. Additionally, the identified strategies indicate that the dot shapes require something like element clustering, i.e., connections, to be efficiently selected and counted. Thus, the model supports the claim that a distinct element clustering process is involved in this task.