Abstract
Introduction: Humans search less efficiently for a target that differs from distractors only by the joint presence of two features (conjunction search) than for a target defined by a single feature (feature search). The lower search efficiency for conjunctions is classically explained by serial attentional bottlenecks (Treisman and Gelade 1980; Wolfe 1989) or by integration across two independent feature dimensions perturbed by noise (Eckstein 1998). Here, we evaluate whether a feed-forward convolutional neural network (CNN) shows a feature/conjunction dissociation in search efficiency. Methods: We separately trained multiple CNNs on two feature yes/no search tasks, in which the target (present on 50% of trials) differed from the distractors in orientation or in contrast, and on a conjunction task (orientation and contrast). Each image contained an independent sample of additive white noise. We manipulated set size in three ways: with peripheral box cues indicating the possible target locations, with central cues indicating the possible target locations, and by varying the physical presence of distractors. The network consisted of three convolutional layers followed by a fully connected layer, ending with an output layer of two neurons. Each network was trained on one task with one set-size manipulation, yielding nine networks. Each network was trained five separate times and tested on 24,000 images at each set size. Results: CNN yes/no accuracy decreased with increasing set size for every task and set-size manipulation. Accuracy was lower for the conjunction task than for the feature tasks when the physical target-distractor feature differences and the image noise amplitude were matched. The set-size effect was significantly larger for conjunction search than for the two feature searches with peripheral box cues (set-size differences: 3.3%, p<10e-5), central cues (2.9%, p<10e-7), and the physical distractor manipulation (2.9%, p<10e-6). Conclusion: CNNs trained to optimize accuracy show lower search efficiency for conjunction search than for feature search, revealing that the dissociation can be explained without any assumptions about serial bottlenecks or independent processing of features.
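
The design described in the Methods can be made concrete with a minimal Python sketch. It enumerates the training runs (three tasks, three set-size manipulations, five repetitions) and computes one possible set-size effect. All names here are hypothetical. Defining the set-size effect as the accuracy drop from the smallest to the largest set size is an assumption, since the abstract does not state the exact measure, as is comparing the conjunction effect against the mean of the two feature effects.

```python
from itertools import product

# Labels are illustrative; the abstract names the tasks and manipulations
# but not any identifiers.
TASKS = ("orientation", "contrast", "conjunction")
MANIPULATIONS = ("peripheral_box_cue", "central_cue", "physical_distractors")
N_REPEATS = 5  # each network was trained five separate times


def training_runs():
    """List every (task, manipulation, repeat) combination in the design."""
    return [
        (task, manipulation, repeat)
        for task, manipulation in product(TASKS, MANIPULATIONS)
        for repeat in range(N_REPEATS)
    ]


def set_size_effect(accuracy_by_set_size):
    """Accuracy at the smallest set size minus accuracy at the largest.

    ASSUMPTION: this definition is a stand-in; the abstract does not say
    how the set-size effect was quantified.
    """
    sizes = sorted(accuracy_by_set_size)
    return accuracy_by_set_size[sizes[0]] - accuracy_by_set_size[sizes[-1]]


def conjunction_minus_feature(conjunction, orientation, contrast):
    """Conjunction set-size effect minus the mean of the two feature effects.

    ASSUMPTION: averaging the two feature tasks is one possible comparison.
    """
    feature_mean = (set_size_effect(orientation) + set_size_effect(contrast)) / 2
    return set_size_effect(conjunction) - feature_mean


if __name__ == "__main__":
    runs = training_runs()
    networks = {(task, manipulation) for task, manipulation, _ in runs}
    print(len(networks), "networks,", len(runs), "training runs")
```

Each argument to `conjunction_minus_feature` is a dictionary mapping set size to proportion correct for one task under one manipulation. A positive return value means the set-size effect is larger for conjunction search.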