Abstract
The analysis of visual stimuli is widely believed to follow a coarse-to-fine time course. This theory predicts that when humans identify groups of letters, the identification error rate and the pattern of errors (e.g. errors consistent with the global shape or with the fine details of letters) will differ between short and long stimulus presentation durations. We tested these predictions by comparing the error rates and the patterns of errors (based on confusion matrices corrected for observer bias) made by three observers who identified all letters presented in trigrams (sequences of three random lowercase letters, x-height=1.2°, letter separation=2.4°), for stimulus exposure durations of 50 and 200 ms. The center of each trigram was presented at 10° in the lower visual field. To examine the effect of spatial scale, letters were filtered using 1-octave raised-cosine log filters centered at 1.35 or 5 c/letter. Testing was also performed for unfiltered letters. For each observer and each condition, 1300 trials were collected. Contrary to our prediction, error rates (overall, or for each letter position) were highly similar between the two spatial-frequency conditions, and were 3–4x higher than the corresponding error rates for the unfiltered condition. Error rates were also higher for the 50-ms than for the 200-ms condition. Interestingly, the proportion of mislocation errors was ~2x higher for the unfiltered than for the two filtered conditions. When trials with mislocation errors were excluded, the identification errors were consistent with observers relying on the global shape of letters to make their judgments (e.g. confusions among the round letters {aceos}), regardless of the spatial-frequency content or the duration. The similar error rates and the similar patterns of identification errors for the two spatial-frequency-filtered conditions suggest that humans rely on a set of cues to identify letters that are invariant to different scales of analysis.
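As an illustration of the filtering step described above, the gain profile of a raised-cosine log filter can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `raised_cosine_log_gain` is hypothetical, and it assumes that "1-octave" refers to the full width at half height, so that the gain reaches zero one octave on either side of the center frequency. The abstract does not state which bandwidth convention was used.

```python
import math


def raised_cosine_log_gain(freq, center, bandwidth_octaves=1.0):
    """Gain of a raised-cosine filter defined on a log2 frequency axis.

    Assumption (not stated in the abstract): bandwidth_octaves is the full
    width at half height, so the gain falls to zero bandwidth_octaves away
    from the center frequency on each side.
    """
    if freq <= 0:
        return 0.0  # zero and negative frequencies are blocked
    distance = math.log2(freq / center)  # signed distance from center, in octaves
    if abs(distance) >= bandwidth_octaves:
        return 0.0  # outside the passband
    return 0.5 * (1.0 + math.cos(math.pi * distance / bandwidth_octaves))


if __name__ == "__main__":
    # Center frequencies from the abstract, in cycles per letter.
    for center in (1.35, 5.0):
        for freq in (center / 2, center, center * 2):
            gain = raised_cosine_log_gain(freq, center)
            print(f"center={center} c/letter, f={freq:.3f}: gain={gain:.3f}")
```

Under this assumption the two filters do not overlap: the 1.35 c/letter filter passes nothing at or above 2.7 c/letter, and the 5 c/letter filter passes nothing at or below 2.5 c/letter. In practice the gain would be applied to the radial frequency of each component of the image's 2-D Fourier transform.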
Meeting abstract presented at VSS 2013