When the size of a letter stimulus is near the visual acuity limit of a human subject, details of the stimulus become unavailable due to ocular optical and neural filtering. In this study we tested the hypothesis that letter recognition near the acuity limit is dependent on more global features, which could be parsimoniously described by a few easy-to-visualize and perceptually meaningful low-order geometric moments (i.e., the ink area, variance, skewness, and kurtosis). We constructed confusion matrices from a large set of data (approximately 110,000 trials) for recognition of English letters and Chinese characters of various spatial complexities near their acuity limits. We found that a major portion of letter confusions reported by human subjects could be accounted for by a geometric moment model, in which letter confusions were quantified in a space defined by low-order geometric moments. This geometric moment model is universally applicable to recognition of visual patterns of various complexities near their acuity limits.

For an *N*_{x}-by-*N*_{y} pixel spatial pattern *f*(*x, y*), the double-sequence quantities defined in Equation 1 are the order (*p* + *q*) GMs of the pattern. The set {*M*_{p,q}} (*p, q* = 0, 1, 2, …) is sufficient to uniquely specify a finite spatial function *f*(*x, y*) (Hu, 1962).
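The raw moment definition above can be sketched directly. This is a minimal illustration, not the paper's own code; it assumes the pattern is given as a 2-D list of pixel values indexed as `f[y][x]`.

```python
# Sketch of the raw geometric moment of Equation 1:
# M_{p,q} = sum over x and y of x**p * y**q * f(x, y).
def geometric_moment(f, p, q):
    """Return the order (p + q) geometric moment of a 2-D pattern."""
    return sum(
        (x ** p) * (y ** q) * f[y][x]
        for y in range(len(f))
        for x in range(len(f[0]))
    )

# For a binary pattern, M_{0,0} is simply the number of ink pixels.
bar = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
print(geometric_moment(bar, 0, 0))  # 3 ink pixels
```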

*M*_{p,q} is the inner product of the geometric moment basis function *x*^{p}*y*^{q} and the spatial pattern *f*(*x, y*).

In GM analysis, a 2-D image is decomposed into basis functions *x*^{p}*y*^{q}, which are luminance distributions confined within the boundary of the image, and the *M*_{p,q} are scalars that represent the weights of the basis functions (the contributions of the basis functions to the 2-D image). Figure 1b shows basis functions of some low-order GMs. Notice that low-order basis functions are simple luminance distributions, or features, and the GMs are the weights or contributions of these features. Also notice that the basis functions of low-order GMs are low-spatial-frequency features, in the sense that the luminance changes in these features are gradual. Figure 1b also shows a letter "E" and the result of reconstruction using GMs up to the 5th order. While this 5th-order reconstruction is missing many details, it contains adequate information for many purposes, for example, for identifying the opening of tumbling E's in an acuity test. When more moments are used in the reconstruction, details such as the center bar of the "E" become evident. It is also worth mentioning that while pure GMs, such as *M*_{0,q} and *M*_{p,0}, have 1-D basis functions, comparable to Gabors, mixed GMs, such as *M*_{p,q} (*p* ≠ 0 and *q* ≠ 0), have true 2-D basis functions, comparable to plaids.

Low-order GMs describe the *x*- and *y*-distributions of ink on a rectangular domain that is too small to engage receptive fields with a large variety of envelopes. They are appropriate features to approximate global features of barely resolvable Sloan letters and Chinese characters, because these stimuli have a fixed orientation and a predominantly orthogonal structure. For example, the observation that a human observer can determine the orientation of a Landolt C or a Snellen E by judging which side of the stimulus has less ink can be readily explained by the difference in the 3rd-order GMs (skewness) of the ink distribution. The contribution of the height-to-width quotient to letter recognition demonstrated by Bouma (1971) can be associated with the ratio of the 2nd-order GMs in the *y*- and *x*-directions. Specifically, we propose a feature analysis model for letter recognition near the acuity limit, in which the feature set consists of global characteristics of stimulus patterns quantified by low-order GMs, and recognition is achieved by comparing the GM compositions of letters. Because recognition confusions occur when crucial features are shared, we used the model to analyze the patterns of letter confusions obtained from identifying letters that were slightly above the acuity limit. To demonstrate that this model can be applied to a wide range of over-learned patterns, we analyzed English letters and 6 groups of Chinese characters that spanned a wide range of spatial complexities. We were able to demonstrate that Euclidean distances in a low-order GM space could explain a large portion of the errors human subjects made in recognizing letters near the acuity limit.

The (*i, j*) cell of a confusion matrix (CM) contained the probability of the *i*th stimulus letter being reported as the *j*th letter, *c*_{i,j}. The diagonal entries *c*_{i,i} were probabilities of correct responses, and the off-diagonal entries (*i* ≠ *j*) were errors, or confusions. Because the *i*th column of a CM contained all the responses to the *i*th stimulus, the sum of each column was equal to 1.0.
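The bookkeeping described above can be sketched as follows. This is an illustrative helper, not the authors' code; it assumes each trial is recorded as a (stimulus, response) pair, and it stores the matrix so that each stimulus column sums to 1.0.

```python
# Build an empirical confusion matrix from (stimulus, response) trial records.
from collections import Counter

def confusion_matrix(trials, letters):
    counts = Counter(trials)                    # (stimulus, response) -> count
    n_per_stim = Counter(s for s, _ in trials)  # number of trials per stimulus
    # cm[response][stimulus]: probability of the stimulus being reported
    # as that response, so each stimulus "column" sums to 1.0.
    return {
        r: {s: counts[(s, r)] / n_per_stim[s] for s in letters}
        for r in letters
    }

trials = [("C", "C"), ("C", "O"), ("C", "C"), ("O", "O")]
cm = confusion_matrix(trials, ["C", "O"])
# Column for stimulus "C": 2/3 correct, 1/3 confused with "O".
```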

We normalized *x* and *y* so that the moments are location and size invariant (Alt, 1962). The coordinates of the center of gravity of the pattern are computed from the 1st-order moments (Equation 2), where *L*, the mean of *f*(*x, y*) over the image, is the mean luminance. The variances of the pattern in the *x*- and *y*-directions are defined in Equation 3.

The 0th-order moment *m*_{0,0} defined in Equation 5 is equal to 1.0 for all patterns. Because the mean luminance of the pattern might provide important information for pattern recognition, we set *m*_{0,0} equal to *L*. The 1st-order moments, *m*_{1,0} and *m*_{0,1}, are 0 for all patterns because the coordinates were shifted to the centroid of the pattern. They were not used in the following simulations because the task was to recognize a single letter; information about the centroids of multiple letters is important in tasks such as word recognition or reading. The 2nd-order moments *m*_{2,0} and *m*_{0,2} defined in Equation 5 are equal to 1.0. Because the width/height ratio might be informative in letter recognition, we set *m*_{2,0} and *m*_{0,2} to *σ*_{x} and *σ*_{y} defined in Equation 3. We subtracted 3 from the kurtosis in the *x*- and *y*-directions (*m*_{4,0} and *m*_{0,4}) defined in Equation 5, to comply with the common practice that when the distribution along the *x*- or *y*-axis is a Gaussian, the kurtosis is zero.
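The normalization conventions just listed can be collected into one sketch. This is a plausible implementation of Equations 2–5 under the substitutions stated in the text (it is not the authors' code): coordinates are shifted to the centroid and scaled by *σ*_{x} and *σ*_{y}, *m*_{0,0} is replaced by *L*, *m*_{2,0}/*m*_{0,2} by *σ*_{x}/*σ*_{y}, and 3 is subtracted from the kurtosis terms. The helper works on a binary pattern given as a list of (x, y) ink-pixel coordinates and assumes the pattern has nonzero extent in both directions.

```python
# Normalized low-order moments of a binary pattern (illustrative names).
import math

def normalized_moments(ink, width, height):
    n = len(ink)
    L = n / (width * height)                    # mean luminance
    cx = sum(x for x, _ in ink) / n             # centroid (Equation 2)
    cy = sum(y for _, y in ink) / n
    sx = math.sqrt(sum((x - cx) ** 2 for x, _ in ink) / n)  # Equation 3
    sy = math.sqrt(sum((y - cy) ** 2 for _, y in ink) / n)
    def std(p, q):                              # standardized moment
        return sum(((x - cx) / sx) ** p * ((y - cy) / sy) ** q
                   for x, y in ink) / n
    return {
        (0, 0): L,                              # set to mean luminance
        (2, 0): sx, (0, 2): sy,                 # set to sigma_x, sigma_y
        (3, 0): std(3, 0), (0, 3): std(0, 3),   # skewness
        (4, 0): std(4, 0) - 3.0,                # excess kurtosis
        (0, 4): std(0, 4) - 3.0,
    }

# A fully inked 3x3 square is symmetric: zero skewness in both directions.
square = [(x, y) for x in range(3) for y in range(3)]
m = normalized_moments(square, 3, 3)
```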

For the binary patterns (*f*(*x, y*) = 0 or 1) in Figure 1a, the 0th-order GM is the number of ink pixels, which can be perceived as the general lightness or darkness of a letter. In Figure 1a, patterns with more strokes have a larger *m*_{0,0} than those with fewer strokes and appear darker if individual strokes cannot be distinguished. Pure 2nd-order moments (moments that are of order 0 in one direction), *m*_{2,0} and *m*_{0,2}, are the dispersions of the ink distribution. For the vertical bar in Figure 1a (height/width = 5/1), *m*_{2,0} < *m*_{0,2} (2.872 vs. 14.431). For the horizontal bar, *m*_{2,0} > *m*_{0,2} (14.431 vs. 2.872). In fact, when only GMs up to the 2nd order are considered, the original image is completely equivalent to a constant-irradiance ellipse whose size, orientation, aspect ratio, and center are completely specified by the GMs (Teague, 1980). Pure 3rd-order moments *m*_{3,0} and *m*_{0,3} represent the skewness of the ink distributions in the *x*- and *y*-directions. For the letter "**E**," *m*_{3,0} has a positive value (0.18), but *m*_{0,3} is zero. Perceptually, the letter appears darker on the left side (skewed to the left) but appears symmetric in the vertical direction. The distribution of the letter "**L**" is heavily skewed to the left in the horizontal direction and to the bottom in the vertical direction (*m*_{3,0} = 0.85, *m*_{0,3} = −0.85). Notice that the skewness is both direction specific (*m*_{3,0} vs. *m*_{0,3}) and position specific (positive vs. negative). Pure 4th-order moments *m*_{4,0} and *m*_{0,4} specify whether the ink distributions in the *x*- and *y*-directions are more peaked or more flat-topped than a Gaussian. In the letter "**H**," the distribution along the *x*-axis is lowest in the middle, due to the two vertical strokes, and thus *m*_{4,0} is a large negative number (−1.72). The distribution of the letter "**T**," on the other hand, has a strong peak in the middle, and *m*_{4,0} is a positive number (0.046). Mixed moments indicate the clustering of ink pixels around oblique axes. For example, *m*_{2,2} is large for the letter "**X**" but small for the Chinese character □ (1.60 vs. 0.81).

In a GM model, the moments were denoted *μ*_{1}, *μ*_{2}, …, *μ*_{n}. For example, if a model used ink area, *x*- and *y*-direction skewness, and *x*- and *y*-direction kurtosis, then these moments were denoted *μ*_{1}, *μ*_{2}, *μ*_{3}, *μ*_{4}, and *μ*_{5}. In a model involving *n* moments, each stimulus was represented by a vector {*μ*_{1}, *μ*_{2}, …, *μ*_{n}} in an *n*-dimensional moment space. The difference between the *i*th and *j*th stimuli, *d*_{i,j}, was measured by the distance between these stimuli in the *n*-dimensional moment space. To reflect the different contributions of moments to the recognition of a set of stimuli, a weighting, *w*_{k}, was given to each moment dimension (Equation 6).
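The weighted distance of Equation 6 can be sketched as follows; this is a minimal illustration assuming each stimulus is simply a list of its *n* moment values.

```python
# Weighted Euclidean distance between two stimuli in moment space
# (a sketch of Equation 6).
import math

def weighted_distance(mu_i, mu_j, w):
    return math.sqrt(sum(wk * (a - b) ** 2
                         for wk, a, b in zip(w, mu_i, mu_j)))

# For k stimuli, {d_{i,j}} is a symmetric k-by-k matrix, zeros on the diagonal.
def distance_matrix(stimuli, w):
    return [[weighted_distance(a, b, w) for b in stimuli] for a in stimuli]

dm = distance_matrix([[0.0, 0.0], [3.0, 4.0]], w=[1.0, 1.0])
# dm[0][1] == dm[1][0] == 5.0, with zeros on the diagonal.
```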

For a set of *k* letters, {*d*_{i,j}} forms a *k*-by-*k* symmetric matrix with zeros on the diagonal. To simulate human recognition performance, *d*_{i,j} needed to be converted to a measure of perceptual similarity. There are both empirical evidence and theoretical justification (Shepard, 1987) that the conversion should be monotonic and should take an exponential shape (Equation 7).

The quantity *s*_{i,j} has been called a "measure of stimulus generalization" (Shepard, 1987) or a "similarity scale" (Luce, 1963b) and was used in several previous studies of human letter recognition (Getty et al., 1979; Keren & Baggen, 1981; Loomis, 1990). For a set of *k* letters, {*s*_{i,j}} was a *k*-by-*k* symmetric similarity matrix. Because *d*_{i,i} = 0, the similarity between a stimulus and itself was *s*_{i,i} = 1.0, which set the scale for similarity. The fact that all diagonal entries of the similarity matrix {*s*_{i,j}} equal 1.0 simply indicates that all stimuli had the same degree of similarity to themselves. The free parameter in Equation 7, *τ*, determined how fast similarity fell from perfection. Intuitively, the probability of confusion between the *i*th and *j*th letters should be proportional to the perceptual similarity *s*_{i,j}, but *s*_{i,j} was not the empirical probability of the *i*th letter being reported as the *j*th letter, because the column sums of the matrix {*s*_{i,j}} were not 1.0. A similarity matrix {*s*_{i,j}} was therefore converted into a theoretical CM through a column-wise normalization (Luce, 1963b), given in Equation 8.
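The two steps just described, Shepard's exponential conversion and Luce's column-wise normalization, can be sketched together. This is an illustrative implementation of the form of Equations 7 and 8 described in the text, not the authors' code.

```python
# Convert a distance matrix to similarities (Equation 7) and then to a
# theoretical confusion matrix by column-wise normalization (Equation 8).
import math

def similarity_matrix(d, tau):
    return [[math.exp(-dij / tau) for dij in row] for row in d]

def confusion_from_similarity(s):
    k = len(s)
    col_sums = [sum(s[i][j] for i in range(k)) for j in range(k)]
    return [[s[i][j] / col_sums[j] for j in range(k)] for i in range(k)]

d = [[0.0, 1.0], [1.0, 0.0]]
c = confusion_from_similarity(similarity_matrix(d, tau=0.5))
# Each column sums to 1.0; the diagonal now holds correct-response rates.
```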

The *k*-by-*k* CM {*c*_{i,j}} differed from {*s*_{i,j}} in several ways. First, each column of {*c*_{i,j}} summed to 1.0, indicating that it summarized all responses to a stimulus letter. Second, the diagonal entries were no longer equal, due to the column-wise scaling; they were now the rates of correct recognition. Finally, {*c*_{i,j}} was no longer symmetric, again due to the scaling of Equation 8. This asymmetry, however, was not caused by the response bias human subjects produce in letter recognition experiments, as we will discuss later. The outcome of a GM model was thus a theoretical CM {*c*_{i,j}}. For a model with *n* GMs, there were *n* free parameters, including the *τ* for the similarity matrix and the *n* − 1 independent weightings of the moment space.

The 0th-order GM *m*_{0,0} was used in all GM models, and the 1st-order GMs, *m*_{1,0} and *m*_{0,1}, were never used, because when the center of the coordinates was moved to the center of gravity of the stimulus image (Equation 4), *m*_{1,0} and *m*_{0,1} became 0 for all stimuli. Model GM1(lu) used *m*_{0,0} (lu) only. Models GM3(lu,va), GM3(lu,sk), and GM3(lu,ku) used *m*_{0,0} (lu) and one pair of pure GMs: *m*_{2,0} and *m*_{0,2} (va), *m*_{3,0} and *m*_{0,3} (sk), or *m*_{4,0} and *m*_{0,4} (ku). Models GM5(lu,va,sk), GM5(lu,va,ku), and GM5(lu,sk,ku) used *m*_{0,0} and two pairs of pure GMs. We did not use the 2nd-order GM *m*_{1,1} to construct the GM3 and GM5 models because it only has a non-zero value when there is a gross difference in ink distribution between opposite corners of a letter, as in the letter "L" (Figure 1a). Because most of the stimulus characters shown in Figure 2 fill the corners of a square area rather symmetrically, values of *m*_{1,1} are very close to zero. The average values of *m*_{2,0} and *m*_{0,2} of the 70 characters were 26.2 ± 0.51 and 25.7 ± 0.50, respectively, but the average of the absolute values of *m*_{1,1} was only 0.023 ± 0.003. GM7 had all pure GMs up to the 4th order. GM9 added *m*_{1,1} and *m*_{2,2} to GM7. GM13 used all GMs up to the 4th order, excluding *m*_{1,0} and *m*_{0,1}. GMs higher than the 4th order are not discussed because we found that adding 5th-order GMs resulted in little change in fitting the empirical CMs.

Table 1. Correlations between theoretical and empirical CMs; the columns Sloan–CC6 give the correlation for each stimulus group.

| Model | Key features / free parameters | Simulation | Sloan | CC1 | CC2 | CC3 | CC4 | CC5 | CC6 |
|---|---|---|---|---|---|---|---|---|---|
| CHOICE | Similarity matrix and response bias vector; 54 parameters | Interpreting empirical CM; not involving stimulus | 0.921 | 0.949 | 0.960 | 0.958 | 0.918 | 0.954 | 0.910 |
| GM13/LM13 | All GMs or LMs up to 4th order, excluding *m*_{1,0} and *m*_{0,1}; 14 parameters | GM, no bias correction | 0.768 | 0.896 | 0.642 | 0.779 | 0.628 | 0.747 | 0.617 |
| | | GM, bias correction | 0.829 | 0.895 | 0.685 | 0.868 | 0.684 | 0.789 | 0.633 |
| | | LM, no bias correction | 0.792 | 0.895 | 0.665 | 0.800 | 0.611 | 0.799 | 0.701 |
| GM9/LM9 | *m*_{0,0}, *m*_{2,0}, *m*_{1,1}, *m*_{0,2}, *m*_{3,0}, *m*_{0,3}, *m*_{4,0}, *m*_{2,2}, *m*_{0,4}; 10 parameters | GM, no bias correction | 0.698 | 0.861 | 0.547 | 0.723 | 0.477 | 0.672 | 0.517 |
| | | GM, bias correction | 0.713 | 0.858 | 0.578 | 0.793 | 0.554 | 0.665 | 0.528 |
| | | LM, no bias correction | 0.680 | 0.779 | 0.514 | 0.739 | 0.408 | 0.711 | 0.528 |
| GM7/LM7 | *m*_{0,0}, *m*_{2,0}, *m*_{0,2}, *m*_{3,0}, *m*_{0,3}, *m*_{4,0}, *m*_{0,4}; 8 parameters | GM, no bias correction | 0.694 | 0.770 | 0.392 | 0.708 | 0.366 | 0.673 | 0.392 |
| | | GM, bias correction | 0.706 | 0.768 | 0.483 | 0.765 | 0.410 | 0.671 | 0.395 |
| | | LM, no bias correction | 0.680 | 0.779 | 0.317 | 0.739 | 0.408 | 0.711 | 0.528 |
| GM5_{1}/LM5_{1} | *m*_{0,0}, *m*_{2,0}, *m*_{0,2}, *m*_{3,0}, *m*_{0,3}; 6 parameters | GM, no bias correction | 0.690 | 0.698 | 0.327 | 0.558 | 0.212 | 0.673 | 0.335 |
| | | GM, bias correction | 0.699 | 0.697 | 0.376 | 0.642 | 0.241 | 0.671 | 0.339 |
| | | LM, no bias correction | 0.670 | 0.697 | 0.374 | 0.609 | 0.175 | 0.711 | 0.402 |
| GM5_{2}/LM5_{2} | *m*_{0,0}, *m*_{2,0}, *m*_{0,2}, *m*_{4,0}, *m*_{0,4}; 6 parameters | GM, no bias correction | 0.578 | 0.676 | 0.290 | 0.603 | 0.247 | 0.568 | 0.280 |
| | | GM, bias correction | 0.565 | 0.668 | 0.188 | 0.687 | 0.247 | 0.571 | 0.277 |
| | | LM, no bias correction | 0.594 | 0.760 | 0.519 | 0.687 | 0.373 | 0.571 | 0.463 |
| GM5_{3}/LM5_{3} | *m*_{0,0}, *m*_{3,0}, *m*_{0,3}, *m*_{4,0}, *m*_{0,4}; 6 parameters | GM, no bias correction | 0.692 | 0.746 | 0.392 | 0.708 | 0.366 | 0.576 | 0.396 |
| | | GM, bias correction | 0.703 | 0.741 | 0.483 | 0.765 | 0.410 | 0.568 | 0.396 |
| | | LM, no bias correction | 0.680 | 0.734 | 0.519 | 0.686 | 0.370 | 0.568 | 0.402 |
| GM3_{1}/LM3_{1} | *m*_{0,0}, *m*_{2,0}, *m*_{0,2}; 4 parameters | GM, no bias correction | 0.577 | 0.556 | 0.142 | 0.427 | 0.160 | 0.578 | 0.109 |
| | | GM, bias correction | 0.565 | 0.508 | 0.111 | 0.489 | 0.221 | 0.574 | 0.084 |
| | | LM, no bias correction | 0.595 | 0.563 | 0.147 | 0.431 | 0.119 | 0.596 | 0.321 |
| GM3_{2}/LM3_{2} | *m*_{0,0}, *m*_{3,0}, *m*_{0,3}; 4 parameters | GM, no bias correction | 0.395 | 0.461 | 0.278 | 0.446 | 0.118 | 0.575 | 0.225 |
| | | GM, bias correction | 0.414 | 0.472 | 0.319 | 0.489 | 0.142 | 0.578 | 0.254 |
| | | LM, no bias correction | 0.351 | 0.423 | 0.238 | 0.492 | 0.068 | 0.592 | 0.200 |
| GM3_{3}/LM3_{3} | *m*_{0,0}, *m*_{4,0}, *m*_{0,4}; 4 parameters | GM, no bias correction | 0.600 | 0.633 | 0.290 | 0.603 | 0.292 | 0.531 | 0.172 |
| | | GM, bias correction | 0.571 | 0.614 | 0.283 | 0.687 | 0.187 | 0.534 | 0.089 |
| | | LM, no bias correction | 0.594 | 0.552 | 0.374 | 0.609 | 0.384 | 0.509 | 0.220 |
| GM1/LM1 | *m*_{0,0}; 2 parameters | GM, no bias correction | 0.476 | 0.311 | 0.213 | 0.354 | 0.037 | 0.595 | 0.072 |
| | | GM, bias correction | 0.319 | 0.178 | 0.136 | 0.347 | −0.197 | 0.575 | 0.040 |
| | | LM, no bias correction | 0.476 | 0.311 | 0.213 | 0.354 | 0.037 | 0.595 | 0.072 |
| CSFTM | Template matching; 1 parameter | No bias correction | 0.729 | 0.649 | 0.408 | 0.447 | 0.155 | 0.640 | 0.292 |
| | | Bias correction | 0.761 | 0.664 | 0.453 | 0.549 | 0.205 | 0.656 | 0.337 |
| RANDOM | Random off-diagonal elements | Average of 500 CMs | 0.202 | 0.136 | 0.154 | 0.195 | 0.299 | 0.151 | 0.120 |

In CM_{Sloan}, the well-known confusions in English letters, such as "C" vs. "O," "D" vs. "O," and "N" vs. "H," were reproduced. These prominent confusions were most likely associated with perceptual similarities between characters and thus could not be the result of random guessing. We constructed several versions of "random confusion" matrices, including Townsend's (1971) "equiprobable" CM (every cell had the same value), an "equal legibility" CM (all diagonal entries contained the mean legibility of the empirical CM, and all off-diagonal entries contained the mean confusion), and a CM that retained the empirical relative legibility while all incorrect reports were evenly distributed among the nine off-diagonal cells of a column. *χ*^{2}-tests between the empirical CMs and all these random CMs were conducted, and the results showed that the probability that any of the empirical CMs was produced by random reporting was <0.0005.

The *R*^{2} of the linear regression model had values of 0.163, 0.484, 0.636, and 0.896 for the GM1(lu), GM7, GM13, and CHOICE models, respectively.

With only the 0th-order GM (*m*_{0,0}), the correlations between GM1(lu) model CMs and empirical CMs were higher than the lower boundary defined by the RANDOM CMs in 5 out of 7 stimulus groups. For example, the theoretical confusions obtained using the mean luminance values of Sloan letters and CC5 characters correlated with the corresponding empirical confusions at 0.476 and 0.595, respectively, suggesting that even within these groups of stimuli of relatively uniform spatial complexity, letters of similar mean luminance were more likely to be confused when their sizes were close to the acuity limit. With 7 GMs, the correlation coefficients were 0.362–0.770. When all 13 GMs up to the 4th order (excluding *m*_{1,0} and *m*_{0,1}) were used, the correlation coefficients were at least 0.617. For the Chinese characters in CC1, the correlation was about 0.896, approaching the level of performance of the 54-parameter CHOICE model.

The CHOICE model provided a response bias vector {*β*_{j}} (Equation A2) for each empirical CM. Because GM models produced symmetric similarity matrices, we could incorporate the bias vectors estimated by the CHOICE model into these models. Specifically, a stimulus-driven similarity matrix {*s*_{i,j}} obtained from a GM model was combined with the bias vector {*β*_{j}} obtained from the CHOICE model to produce a biased theoretical CM. The optimization procedure was the same, because it was related only to the stimulus.

The biases in the biased theoretical CMs correlated significantly with those in the empirical CMs (*p* < 0.005), indicating a faithful restoration of the bias observed in the empirical CMs. Correlation coefficients between empirical CMs and biased theoretical CMs are shown as the second number in each cell of Table 1. Biased theoretical CMs generally correlated with empirical CMs better than unbiased ones did, but the improvement was moderate at best.

Several of the letter features identified by Bouma (1971), such as the height-to-width classes *H*/*W* < 1.16, *H*/*W* > 1.16, and *H*/*W* > 1.22, could be differentiated by *m*_{2,0}/*m*_{0,2}. The left, upper, right, and lower gaps can be perceived as skews of the pattern luminance in different positions and thus can be differentiated by the relative values and/or signs of *m*_{0,3} and *m*_{3,0}. The rectangular and circular envelopes can be differentiated by the 4th-order GM *m*_{2,2}, which has a significantly higher value for a square than for a circle (1.0 vs. 0.664, regardless of size). The fact that many perceptually important features identified by Bouma can be described by low-order GMs suggests that GMs provide a theoretical description of the basic features human subjects use to recognize letters near the acuity limit.

One might expect that a consistent pattern of *w*_{k} values (Equation 6) exists that reflects an intrinsic property of moment analysis; for example, the weights should be lower for higher order moments because those moments contribute less in low-pass-filtered stimuli. Our results, however, failed to reveal any consistent trend in the *w*_{k} values. Only in CC5 did the *w*_{k} values decrease with GM order. In CC1, CC4, and CC6, *w*_{k} values first increased from the 0th- to the 3rd-order GMs and then dropped at the 4th-order GMs. In Sloan and CC2, *w*_{k} values were lowest at the 2nd-order GMs, went up at the 3rd order, and then dropped at the 4th order. In CC3, *w*_{k} values increased steadily from the 2nd to the 4th order. Our current understanding is that the *w*_{k} values are very specific to the characteristics of the stimulus group. For example, if the characters in a stimulus group all have similar numbers of ink pixels, then the 0th-order GM would contribute very little to differentiating characters in the group, and the weight for *m*_{0,0} will be small. This happened in CC6, where the variation in the total number of ink pixels was reduced because a large number of strokes were packed into the same area. As a consequence, *m*_{0,0} did not contribute much to differentiating CC6 characters, and the weight for the 0th-order GM was very small. We have not yet found a satisfying way to explain the observed relationship between model weights and GM order.

The Akaike Information Criterion is defined as *AIC* = *N* ln(*SS*/*N*) + 2*K*, where *N* is the number of data points to be fitted (90, if all confusions of a 10 × 10 CM are considered), *SS* is the residual sum of squares, and *K* is the number of free parameters. If A is a simpler model and B is a more complex one, then Δ*AIC* = *AIC*_{B} − *AIC*_{A} can help determine the model that explains the data well with fewer free parameters. An evidence ratio, defined as 1/*e*^{−0.5Δ*AIC*}, is usually used to quantify how much more likely one model is to be correct than the other. An alternative criterion, the Bayesian Information Criterion (BIC), defined as *BIC* = *N* ln(*SS*/*N*) + *K* ln(*N*), has a stiffer penalty for parameter usage (Schwarz, 1978). These criteria obviously will pass different judgments on the relative merits of models with different numbers of parameters; we used both AIC and BIC to give a more comprehensive view. Another advantage of AIC and BIC analyses is that they allow ranking multiple models according to their merits without having to adjust the statistical criterion (*α*) as in multiple hypothesis tests.
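The criteria above translate directly into a small sketch (illustrative only; the variable names mirror the formulas in the text).

```python
# Model-comparison criteria from the text:
# AIC = N*ln(SS/N) + 2K, BIC = N*ln(SS/N) + K*ln(N),
# evidence ratio = 1 / exp(-0.5 * dAIC) = exp(0.5 * dAIC).
import math

def aic(n, ss, k):
    return n * math.log(ss / n) + 2 * k

def bic(n, ss, k):
    return n * math.log(ss / n) + k * math.log(n)

def evidence_ratio(aic_a, aic_b):
    """How strongly the data favor model A over model B."""
    delta = aic_b - aic_a
    return 1.0 / math.exp(-0.5 * delta)

# With N = 90 (the off-diagonal cells of a 10 x 10 CM), BIC's per-parameter
# penalty ln(90) is stiffer than AIC's penalty of 2.
a2, a3 = aic(90, 1.0, 2), aic(90, 1.0, 3)
b2, b3 = bic(90, 1.0, 2), bic(90, 1.0, 3)
```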


The differences were not statistically significant (*F*_{2,12} = 1.179, *p* = 0.341).

There was no significant main effect of filter (*F*_{2,120} = 0.182, *p* = 0.893) or filter × model interaction (*F*_{18,120} = 0.419, *p* = 0.982). These results indicate that the GM models are not very sensitive to the details of the front-end filtering. This should not be surprising, because the GM models used only the global features of the characters, which were attenuated to some degree, but not eliminated or distorted, by these linear filters. The insensitivity of GM model performance to the type of front-end filter may also be explained by the feature matching process used by the model. We used the feature list extracted from a filtered stimulus character to match a set of stored feature lists that were extracted from identically filtered characters. This approach was appropriate in our study, where subjects had done thousands of trials of recognizing characters close to their acuity thresholds. It would be interesting to investigate the effect of front-end filters on GM models when the stimulus feature set is extracted from a filtered character while the stored feature set is extracted from unfiltered characters. This situation may occur when a subject has learned the stimuli at very large sizes and is first exposed to the stimuli at sizes close to the acuity limit.

In CM_{Sloan}, for example, "H" was much more legible than "R" (75% correct vs. 38%). This difference contributed to the large pair-wise asymmetry in which the probability of an "H" being reported as an "R" was 0.012, while that of an "R" being reported as an "H" was 0.123. Indeed, when we generated random CMs, we could produce substantial asymmetries among confusion entries if large variations in the diagonal entries were introduced. Therefore, unless the stimuli are truly equally legible, the asymmetry of an empirical CM is not a pure measure of subjects' response bias.

In the definition of the Legendre moments (LMs), *P*_{p}() and *P*_{q}() are Legendre polynomials of the *p*th and *q*th orders, and *x*_{i} and *y*_{j} are normalized pixel coordinates in a unit square (Teague, 1980). Computational studies have demonstrated that orthogonal moments, such as *L*_{p,q}, are more efficient in reconstructing the details of images. Ghorbel, Derrode, Dhahbi, and Mezhoud (2005) demonstrated that a 32 × 32 pixel "E" reconstructed from GMs up to the 50th order produced 74 pixels that differed from the original; only LMs up to the 8th order were needed to reach the same level of reconstruction accuracy. Human pattern recognition, on the other hand, typically does not require that much detail. Do LMs explain human recognition of small patterns better than GMs? Correlation coefficients between CMs produced by LM-based feature analysis models and empirical CMs are shown as the third number in the cells of Table 1. LM models were slightly better than the corresponding GM models in some cases, but the differences were generally small. For the 10 corresponding GM and LM models, the differences in empirical–theoretical correlations averaged across the 7 stimulus groups ranged from +0.019 (GM better than LM) to −0.083 (LM better than GM). Compared to the dramatically increased efficiency of LMs in the reconstruction of images (8th order vs. 50th order), the improvement of LMs over GMs in interpreting human letter confusions near the acuity limit was mediocre. One possible reason is that the basis function of an LM of (*p* + *q*)th order is a linear combination of GM basis functions up to the (*p* + *q*)th order (Teague, 1980). It can be shown that at the low orders that are most relevant to recognition of letters near the acuity limit, the basis functions of LMs are similar to those of GMs, and an LM is either the same-order GM times a constant or a linear sum of two GMs of equal or lower order. For example, *L*_{0,0} = *m*_{0,0}, *L*_{2,0} = (5/4)[(3/2)*m*_{2,0} − (1/2)*m*_{0,0}], *L*_{1,1} = (9/4)*m*_{1,1}, *L*_{3,0} = (7/4)[(5/2)*m*_{3,0} − (3/2)*m*_{1,0}], and so on. Therefore, the difference between LM and GM compositions is relatively small where human recognition of small letters is concerned. It can be seen in Table 1 that the GM1(lu) and LM1(lu) models produced identical simulation results. In other models, some LM space dimensions were linear combinations of two GM dimensions, and adjusting the weightings on these dimensions might not produce the same results as in the GM space.

The best fitting parameters were *a* = 812.3, *b* = 1.071, and *c* = 0.636. The Chung, Legge, and Tjan (CLT) CSF data are replotted in Figure A1a with the model fit. A radially symmetric 2-D filter in the frequency domain was then created, as shown in Figure A1b. The original stimuli were black-and-white bitmaps of the Sloan letters and Chinese characters shown in Figure 2. All stimulus bitmaps were 50 × 50 pixels in size. Because the bitmaps were used as stimuli 0.1 log unit above the acuity sizes shown in Figure 2, the resolutions ranged from 700 pixels/deg for the Sloan letters to 398 pixels/deg for CC6. The stimulus bitmaps were pasted on a large black background before being filtered. For each stimulus letter, the Fourier transform of its bitmap was multiplied by the CLT CSF filter, and the inverse Fourier transform was then taken. The result was a highly blurred version of the letter, shown in Figure A1c. The filtered stimulus was cropped to 1.2× the original stimulus size to save time in simulation. Cropped sizes of up to 3× the original stimulus size were tested, and no significant differences in simulation results were noticed. The procedures defined in Equations 2–5 were used to extract a set of *n* GMs, and a theoretical CM {*c*_{i,j}} with *n* + 1 free parameters was generated according to Equations 6–8. An optimization routine was used to search for the best fitting parameters, such that the sum of squared differences between a theoretical CM and the corresponding empirical CM was minimal. One debatable issue in fitting CMs is how to treat the diagonal entries. In a typical letter recognition study, the average correct rate is set at 50%–75%, which means that the diagonal entries are much larger than the off-diagonal entries. If the whole CM is fitted, the diagonal entries are likely to dominate the fitting and may result in an over-optimistic goodness of fit, in which the fitting of the confusions can be quite poor (LeBlanc & Muise, 1985). In our study, we optimized against both the whole CM and the confusion entries alone.

In the choice model, the probability of a response *r* given a stimulus *s* was determined by two scales: a similarity between *r* and *s*, and a response bias. Townsend (1971) applied the choice model to the recognition of uppercase English letters. The probability of the *i*th letter being reported as the *j*th letter was given by Equation A2, where {*η*_{ij}} was a symmetric similarity matrix and {*β*_{k}} was the bias vector. The purpose of Townsend's choice model was to derive the unknown {*η*_{ij}} and {*β*_{k}} from a known empirical CM. Therefore, for an experiment involving *k* letters, the model estimated [*k*(*k* + 1)/2] − 1 parameters: *k*(*k* − 1)/2 pair-wise similarities and (*k* − 1) independent relative biases. For an experiment involving the recognition of 10 letters, the number of parameters was 54. Because of the enormous number of parameters used, the choice model CM defined in Equation A2 usually provides an excellent fit to the empirical CM and thus can serve as a practical upper limit for goodness of fit. Explicit formulas provided by Townsend (1971) were used to calculate the choice CM, similarity matrix, and response bias vector for each of our 7 empirical CMs.
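The forward direction of the choice model can be sketched as follows. This assumes the standard form of Luce's choice rule (the text does not reproduce Equation A2, so the exact normalization is an assumption): the probability of reporting letter *j* to stimulus *i* is *β*_{j}*η*_{ij} normalized over all candidate responses. Here stimuli index rows, so each row sums to 1.0; transposing recovers the column convention used earlier in the text.

```python
# Sketch of a choice-model CM: c_ij = beta_j * eta_ij / sum_m beta_m * eta_im.
def choice_cm(eta, beta):
    k = len(beta)
    cm = []
    for i in range(k):
        denom = sum(beta[m] * eta[i][m] for m in range(k))
        cm.append([beta[j] * eta[i][j] / denom for j in range(k)])
    return cm

eta = [[1.0, 0.2], [0.2, 1.0]]   # symmetric similarities, eta_ii = 1
beta = [0.7, 0.3]                # response bias vector (illustrative values)
cm = choice_cm(eta, beta)
# The bias makes the CM asymmetric even though eta is symmetric.
```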

For a filtered stimulus letter *S* and a filtered template letter *T*, the sum of squared differences between the two images was computed as the template-matching distance. For a *k*-letter recognition experiment, {*D*_{s,t}} is a *k*-by-*k* matrix with zeros on the diagonal, similar to {*d*_{i,j}} in Equation 6. We used procedures similar to Equations 7 and 8 to create a CSF template-matching similarity matrix and a confusion matrix. The CSFTM model has only one free parameter, *τ* in Equation 7.

The template was shifted relative to the stimulus over a 5 × 5 array of positions, and for each position the sum of squared differences was computed; the best (smallest) result among the 25 relative positions was taken as *D*_{s,t}. Relative shifts of 7 and 9 pixels were also tested, and no significant differences in simulation results were noticed. Loomis (1990) used a similar method to study CMs of 26 English letters; instead of a CSF, Loomis used a low-pass filter to simulate the effect of ocular optics.
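The shift-tolerant matching step can be sketched as below. This is an illustrative reimplementation, not the authors' code; patterns are represented sparsely as dicts mapping (x, y) to pixel values, and the minimum SSD over shifts of up to ±2 pixels (the 25 relative positions) is taken as *D*_{s,t}.

```python
# Template-matching distance with a 5 x 5 array of relative shifts.
def ssd_at_shift(s, t, dx, dy):
    """Sum of squared differences between s and t shifted by (dx, dy)."""
    keys = set(s) | {(x - dx, y - dy) for x, y in t}
    return sum((s.get((x, y), 0.0) - t.get((x + dx, y + dy), 0.0)) ** 2
               for x, y in keys)

def template_distance(s, t, max_shift=2):
    return min(ssd_at_shift(s, t, dx, dy)
               for dx in range(-max_shift, max_shift + 1)
               for dy in range(-max_shift, max_shift + 1))

# A copy of a pattern shifted by 1 pixel matches perfectly within +/-2 shifts.
s = {(0, 0): 1.0, (1, 0): 1.0}
t = {(1, 0): 1.0, (2, 0): 1.0}
```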

Because a finite spatial function *f*(*x, y*) can be completely specified by its geometric moments {*M*_{p,q}} (*p, q* = 0, 1, 2, …) (Hu, 1962), one should also be able to reconstruct *f*(*x, y*) from {*M*_{p,q}}. However, reconstruction with GMs is not as straightforward as an inverse Fourier transform (Ghorbel et al., 2005), because the basis functions of GMs are not orthogonal. Teague (1980) proposed a moment matching method to reconstruct any function *f*(*x, y*) from GMs up to a given order *N*_{max}. The idea is to obtain a continuous function *g*(*x, y*) = *g*_{00} + *g*_{10}*x* + *g*_{01}*y* + *g*_{20}*x*^{2} + *g*_{11}*xy* + *g*_{02}*y*^{2} + …, whose GMs exactly match those of *f*(*x, y*) up to the order *N*_{max}. The constant coefficients *g*_{pq} of *g*(*x, y*) are determined by a system of linear equations that, given the *M*_{p,q}, uniquely determines all the coefficients of *g*(*x, y*). In theory, exact reconstruction of an *N*_{x} × *N*_{y} image can be made by using the moments *M*_{p,q}, where *p* = 1, 2, …, *N*_{x} and *q* = 1, 2, …, *N*_{y}. We used our implementation of this method in Matlab to create the reconstruction shown in Figure 1b. One inconvenience of moment matching reconstruction is that the coefficients *g*_{pq} depend on the order of the moments used in the reconstruction. For example, *g*_{22} for the same image has different values for *N*_{max} = 4 and *N*_{max} = 5. This is because each coefficient *g*_{pq} is a linear combination of the *M*_{p,q} up to the *N*_{max} order.