Letter recognition is thought to rely on the shape and arrangement of individual features of a letter. A great deal of research has focused on identifying a set of pattern features (such as line segments and curves) mediating letter recognition (Blommaert, 1988; Bouma, 1971; Dunn-Rankin, 1968; Fiset et al., 2008; Geyer, 1970; Geyer & DeWald, 1973; Gibson, 1969; Laughery, 1969; Luce, 1963; Rumelhart & Siple, 1974; Townsend, 1971). Insight into the nature of these features comes from the observation that high levels of letter recognition accuracy are possible even when letters are severely blurred by low-pass spatial-frequency filtering (Kwon & Legge, 2011; Loomis, 1990). These studies showed that people can achieve 80% correct in identifying 1 of 26 letters when the letters are low-pass filtered with a cutoff frequency of 0.9 cycles per letter (CPL). This low resolution has an equivalent sampling rate (<2 × 2) that would allow discrimination among fewer than 16 patterns if the samples were binary (Shannon, 1948). Furthermore, the cutoff frequency of 0.9 CPL is considerably lower than the known optimal spatial frequency for letter recognition (e.g., Chung, Legge, & Tjan, 2002; Gervais, Harvey, & Roberts, 1984; Ginsburg, 1980; Majaj, Pelli, Kurshan, & Palomares, 2002; Oruc & Landy, 2009; Parish & Sperling, 1991). It has been reported that the peak of the spatial-frequency band most useful for letter recognition ranges from about 1.7 cycles/letter for tiny characters (0.16°) to 7.7 cycles/letter for huge characters (16°; Majaj et al., 2002).
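The sampling argument above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming Nyquist sampling (two samples per cycle) along each dimension of a square letter and one bit per sample; the function name `max_binary_patterns` is hypothetical and introduced only for illustration.

```python
import math


def max_binary_patterns(cutoff_cpl: float) -> int:
    """Upper bound on distinguishable binary patterns for a given cutoff.

    Assumes Nyquist sampling (2 samples per cycle) along each dimension
    and one bit per sample. Both are simplifying assumptions.
    """
    samples_per_side = 2.0 * cutoff_cpl      # Nyquist rate across one letter
    total_samples = samples_per_side ** 2    # square sampling grid
    return math.floor(2 ** total_samples)    # N binary samples -> 2^N patterns


print(2.0 * 0.9)                  # 1.8 samples per side, i.e. fewer than 2 x 2
print(max_binary_patterns(1.0))   # 16: the bound for a full 2 x 2 grid
print(max_binary_patterns(0.9))   # 9: fewer than 16, and fewer than 26 letters
```

Under these assumptions, a cutoff of 0.9 CPL supports fewer distinguishable binary patterns than there are letters in the alphabet, which is why 80% recognition accuracy at this cutoff is notable.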
How is letter recognition possible with such a severe reduction in the spatial resolution of stimulus letters? Clues may come from understanding the possible interaction between contrast and spatial resolution in letter recognition. Legge, Rubin, and Luebker (1987) studied the role of contrast in reading speed and found that as letter size approached the acuity limit, more contrast was needed to achieve a criterion reading speed. A similar interaction is found in peripheral vision: as acuity declines, images of letters (Melmoth & Rovamo, 2003; Rovamo & Melmoth, 2000) and faces (Mäkelä, Näsänen, Rovamo, & Melmoth, 2000; Melmoth, Kukkonen, Mäkelä, & Rovamo, 2000) need to be scaled not only in size but also in contrast to match recognition performance in central vision. The interaction between contrast and spatial resolution of letters has been recognized in clinical practice, and there has been interest in measuring visual acuity with low-contrast charts. For example, the Regan Letter Chart, a low-contrast letter-acuity chart, has been used to assess the effects of reduced contrast on visual acuity (Regan, 1988; Regan & Neima, 1983).
Our interest in understanding letter recognition under conditions of low-resolution viewing is motivated by real-world applications. Examples include reading near the acuity limit (highway signs at a great distance) or coping with fog, low-resolution display rendering, refractive error, or low vision. To our knowledge, no study has addressed contrast requirements for recognizing letters with low resolution. Our primary goal is to examine the impact on letter recognition of the interaction between the contrast of letters and the spatial resolution with which they are rendered. More specifically, as the spatial-frequency cutoff of letters is reduced (in cycles per letter), what is the effect on the contrast threshold for detecting and recognizing letters?
A second goal is to determine whether this interaction helps us to understand differences in letter recognition between central and peripheral vision or between uppercase and lowercase letters. Differences in the shape of the human contrast sensitivity function (CSF) mean that letter stimuli in peripheral vision have reduced spatial-frequency representations for neural processing. Uppercase letters may be thought to be rendered with lower resolution than lowercase letters because they possess fewer distinguishable spatial features (e.g., no ascenders or descenders). The idea that some stimulus conditions are more vulnerable to low spatial resolution than others is indeed substantiated by our related study (Kwon & Legge, 2011). In that study, we found that to achieve reliable letter recognition (80% accuracy), the spatial resolution of letters had to be higher (manifested as a larger minimum spatial-frequency requirement) in peripheral vision (1.06 cycles per letter) than in central vision (0.9 cycles per letter) and for uppercase letters (1.14 cycles per letter) than for lowercase letters (0.9 cycles per letter). In the current experiments, we measured contrast thresholds for detecting and recognizing single letters in central and peripheral vision, with letters drawn at random from the 26 lowercase or uppercase letters of the English alphabet. The letters were low-pass filtered (blurred) with various cutoff frequencies. We used the size of the gap between detection and recognition contrast thresholds for letters as a measure of the contrast requirement in the letter recognition task. We did so because this gap should reflect the true contrast requirement for recognition after factoring out any differences in detection threshold induced by the different stimulus conditions. Comparing recognition and detection thresholds in this way also provided a useful way of understanding the performance of the ideal-observer model and the similarity model to be discussed later.
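One way to quantify the gap between detection and recognition thresholds is as a log ratio, which is unchanged when both thresholds shift by a common factor. The sketch below assumes that convention (the text above does not specify how the gap is computed), and the function name and threshold values are hypothetical, chosen only to illustrate the calculation.

```python
import math


def threshold_gap_log_units(detection_threshold: float,
                            recognition_threshold: float) -> float:
    """Gap between recognition and detection contrast thresholds, in log10 units.

    Assumption: the gap is expressed as a log ratio, so a common scaling of
    both thresholds (e.g., a less visible stimulus) leaves it unchanged.
    """
    if detection_threshold <= 0 or recognition_threshold <= 0:
        raise ValueError("contrast thresholds must be positive")
    return math.log10(recognition_threshold / detection_threshold)


# Hypothetical thresholds (Weber contrast), not data from any study.
gap_a = threshold_gap_log_units(0.02, 0.08)   # recognition needs 4x the contrast
gap_b = threshold_gap_log_units(0.04, 0.16)   # both thresholds doubled
print(round(gap_a, 3), round(gap_b, 3))       # the two gaps are equal
```

In this toy case the second condition has a higher detection threshold, yet the gap is the same, which is the sense in which the measure factors out differences in detectability.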
A third goal of the present study is to examine whether the greater contrast requirement for recognition of letters at low spatial resolution, if any, is inherent in the stimulus or an intrinsic property of human visual processing. An ideal-observer model, a theoretical device that yields the best possible performance for a given task by choosing the response with the maximum posterior probability (Green & Swets, 1966; Tanner & Birdsall, 1958), is a quantitative method for demonstrating the stimulus constraints on performance. For modeling purposes, some limitations on early visual processing, such as visual acuity and contrast sensitivity, can be treated as transformations of the stimulus input that produce an “equivalent” visual input. For instance, the human CSF puts a lower bound on the contrast that can be detected. Previous studies of letter recognition have demonstrated that incorporating the human CSF in an ideal-observer model can explain aspects of the spatial-frequency characteristics of human letter recognition (Chung et al., 2002; Kwon & Legge, 2011; Watson & Ahumada, 2008). We therefore incorporated a human CSF into the ideal-observer model (hereafter, the CSF-noise-ideal-observer model) and tested this model on the tasks of detecting and recognizing low-pass–filtered letters for comparison with human performance.
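The maximum-posterior decision rule can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the CSF-noise-ideal-observer model itself: it assumes known templates, a known contrast, independent Gaussian pixel noise, and equal priors by default, and it omits any CSF filtering. The names `map_identify`, `simulate_accuracy`, and `TOY_TEMPLATES` are hypothetical.

```python
import math
import random


def map_identify(stimulus, templates, contrast, noise_sd, priors=None):
    """Return the label with the maximum posterior probability.

    Assumes stimulus = contrast * template + independent Gaussian noise
    with standard deviation noise_sd (must be > 0).
    """
    labels = list(templates)
    if priors is None:
        priors = {label: 1.0 / len(labels) for label in labels}

    def log_posterior(label):
        # Log prior plus Gaussian log likelihood (constant terms dropped).
        sq_err = sum((s - contrast * t) ** 2
                     for s, t in zip(stimulus, templates[label]))
        return math.log(priors[label]) - sq_err / (2.0 * noise_sd ** 2)

    return max(labels, key=log_posterior)


def simulate_accuracy(templates, contrast, noise_sd, n_trials=2000, seed=0):
    """Proportion correct of the MAP observer over simulated noisy trials."""
    rng = random.Random(seed)
    labels = list(templates)
    correct = 0
    for _ in range(n_trials):
        target = rng.choice(labels)
        stimulus = [contrast * t + rng.gauss(0.0, noise_sd)
                    for t in templates[target]]
        if map_identify(stimulus, templates, contrast, noise_sd) == target:
            correct += 1
    return correct / n_trials


# Toy 4-pixel "letters", purely illustrative.
TOY_TEMPLATES = {"a": [1, 0, 0, 1], "b": [0, 1, 1, 0], "c": [1, 1, 0, 0]}

for c in (0.05, 0.5, 2.0):
    print(c, simulate_accuracy(TOY_TEMPLATES, contrast=c, noise_sd=0.2))
```

Sweeping the contrast and reading off the value at which accuracy reaches a criterion (for example, 80%) gives a recognition threshold for this toy observer. A fuller model along the lines described above would first pass both stimulus and templates through a CSF-like filter.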