Abstract
The adage “a picture is worth a thousand words” underscores the notion that visual information conveys rich meaning. However, not all scenes contain equal semantic depth. This study quantified the semantic complexity of images and assessed its implications for early visual processing. We asked 100 online observers to write image descriptions for a previously used set of 1000 images (Bainbridge & Baker, 2020; Greene & Trivedi, 2023). A composite semantic complexity score was computed from the median word count, the variability among descriptions (entropy in a bag-of-words model), and the average pairwise distance between concepts within a description in a word vector model (Word2Vec). We selected the 100 images with the highest and the 100 images with the lowest semantic complexity scores for a rapid detection experiment; the two image groups did not differ significantly on several measures of visual complexity. We predicted that images with lower semantic complexity convey less information and would therefore be detected more quickly and accurately. Observers (N = 38) distinguished between scene images and 1/f noise (SOA ≈ 60 ms) with a dynamic pattern mask. Contrary to our expectations, observers had higher detection sensitivity for images with greater semantic complexity (d′ = 3.91 vs. 3.58, p < .005). This finding challenges the common expectation of capacity limitations in the face of stimulus complexity and instead suggests that semantic richness may enhance rapid perception. One interpretation is that a more extensive set of contextual associations increases both semantic complexity and visual detectability. Alternatively, richer semantic content may engage top-down processing more effectively, aiding rapid visual detection. These results run counter to typical views of cognitive load and point to highly semantic aspects of scene gist that drive early visual detection.
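The three components of the composite score can be sketched as follows. This is a minimal illustration only: the abstract does not specify how the components are weighted or combined, so the sketch returns them separately, and it substitutes toy two-dimensional vectors for trained Word2Vec embeddings. All function and variable names here are hypothetical, not the authors'.

```python
import math
from itertools import combinations

# Toy stand-ins for Word2Vec embeddings; illustrative values only.
VECTORS = {
    "dog": [0.9, 0.1], "cat": [0.8, 0.2],
    "car": [0.1, 0.9], "road": [0.2, 0.8],
}

def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def bow_entropy(descriptions):
    """Shannon entropy (bits) of the pooled bag-of-words distribution
    across all descriptions of one image."""
    counts = {}
    for d in descriptions:
        for w in d.split():
            counts[w] = counts.get(w, 0) + 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return 1 - dot / (nu * nv)

def mean_pairwise_distance(description):
    """Average cosine distance between all concept pairs in one description."""
    words = [w for w in description.split() if w in VECTORS]
    pairs = list(combinations(words, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(VECTORS[a], VECTORS[b])
               for a, b in pairs) / len(pairs)

def complexity_components(descriptions):
    """Return (median word count, bag-of-words entropy,
    mean within-description pairwise distance) for one image's descriptions.
    Combining these into a single composite (e.g., by z-scoring each
    across the image set and averaging) is an assumption, not stated here."""
    word_counts = [len(d.split()) for d in descriptions]
    return (median(word_counts),
            bow_entropy(descriptions),
            sum(mean_pairwise_distance(d) for d in descriptions)
            / len(descriptions))
```

On this scheme, an image whose descriptions are short, repetitive, and built from closely related concepts scores low on all three components, while varied descriptions mixing distant concepts score high.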