August 2012
Volume 12, Issue 9
Vision Sciences Society Annual Meeting Abstract  |   August 2012
Evaluation of a biologically-inspired neural network for letter recognition
Author Affiliations
  • Daniel Coates
    Vision Science Graduate Program, University of California, Berkeley
  • Susana T. L. Chung
    Vision Science Graduate Program, University of California, Berkeley
Journal of Vision August 2012, Vol.12, 537. doi:

      Daniel Coates, Susana T. L. Chung; Evaluation of a biologically-inspired neural network for letter recognition. Journal of Vision 2012;12(9):537.


Seeking a better understanding of how we recognize letters, we compared the letter recognition performance of human subjects with that of a biologically plausible neural network (Fukushima’s Neocognitron). This type of neural network, inspired by the architecture of the visual system, has been successful in OCR and natural scene classification, but so far has not been compared directly to human letter recognition. We were particularly interested in the errors made when letters were degraded, such as in the presence of noise.

First, we confirmed that the network can recognize lower-case letters, which had not been shown before. Trained on just ten presentations of each of the 26 letters in Times font, it was robust to letter rotation (±45°: 62% correct), spatial warping (±50% of character size in both dimensions: 75% correct), and spatial translation.

Next, following the analyses of Solomon and Pelli (1994) and Chung et al. (2002), we evaluated the model's "letter channels" using stimulus filtering and filtered noise masking. Unlike the lowpass ideal observer described by Solomon and Pelli (1994), this model has a bandpass shape very similar to that of human observers, centered around 2-3 cycles/letter.
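To make the unit "cycles/letter" concrete, the sketch below converts an image-based spatial frequency into cycles per letter and evaluates the gain of a band-pass filter centered in that unit. The raised-cosine log filter shape, the one-octave bandwidth, and the function names are illustrative assumptions; the abstract does not specify which filters were applied to the stimuli.

```python
import math

def cycles_per_letter(cycles_per_image, letter_size_px, image_size_px):
    """Convert a frequency in cycles/image to cycles/letter.

    A letter that spans letter_size_px of an image_size_px-wide image
    contains that fraction of the cycles present across the whole image.
    """
    return cycles_per_image * letter_size_px / image_size_px

def raised_cosine_log_gain(freq, center, bandwidth_octaves=1.0):
    """Gain of a raised-cosine filter on a log2 frequency axis (assumed shape).

    Gain is 1 at `center`, 0.5 at half a bandwidth away on either side,
    and 0 at one full bandwidth (in octaves) or more from the center.
    """
    if freq <= 0 or center <= 0:
        return 0.0
    octaves = math.log2(freq / center)
    if abs(octaves) >= bandwidth_octaves:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * octaves / bandwidth_octaves))

if __name__ == "__main__":
    # Hypothetical geometry: a 32-pixel letter in a 256-pixel image.
    f_letter = cycles_per_letter(20, letter_size_px=32, image_size_px=256)
    # Gain of a filter centered at 2.5 cycles/letter for that component.
    print(f_letter, raised_cosine_log_gain(f_letter, center=2.5))
```

Sweeping `center` across a range of frequencies and measuring recognition accuracy at each setting is one way a tuning curve of this kind could be traced for a model or an observer.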

Finally, we compared confusion matrices from new experiments, classic published results, and model predictions. After removing response bias using the Luce choice model, we examined correlations between the remaining letter similarity matrices, which capture typical confusions between letters. Correlations between the simulation and new experimental data (subjects recognizing letters in noise) were r=0.62-0.7, slightly lower than the agreement between observers (r=0.8-0.9). When compared to Bouma’s (1971) matrix, which used a Courier font, the model trained on Times had a low correlation (r=0.32), while the Courier-trained model had a fit of r=0.64.
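The bias-removal and correlation steps can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes one common closed-form estimator for the Luce similarity choice model, in which the similarity between letters i and j is sqrt(p_ij * p_ji / (p_ii * p_jj)), so that response-bias terms cancel. The abstract does not state how the model was fitted, and all function names here are hypothetical.

```python
import math

def luce_similarity(confusions):
    """Estimate bias-free similarities from a confusion matrix.

    confusions[i][j] is the count (or proportion) of trials on which
    stimulus i was reported as j. Each row must have a nonzero total.
    Returns eta with eta[i][j] = sqrt(p_ij * p_ji / (p_ii * p_jj)).
    """
    n = len(confusions)
    # Normalize each row to P(response j | stimulus i).
    probs = [[c / sum(row) for c in row] for row in confusions]
    eta = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = probs[i][i] * probs[j][j]
            if denom > 0:
                eta[i][j] = math.sqrt(probs[i][j] * probs[j][i] / denom)
            else:
                # Undefined when a letter was never identified correctly.
                eta[i][j] = float("nan")
    return eta

def off_diagonal(matrix):
    """Upper-triangle entries (i < j), i.e. the distinct letter pairs."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def similarity_correlation(confusions_a, confusions_b):
    """Correlate the letter-pair similarities of two confusion matrices."""
    sim_a = off_diagonal(luce_similarity(confusions_a))
    sim_b = off_diagonal(luce_similarity(confusions_b))
    return pearson_r(sim_a, sim_b)
```

Only the off-diagonal entries are correlated, since the diagonal of the similarity matrix equals 1 by construction and would inflate the agreement. For 26 letters this leaves 325 letter pairs per comparison.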

We believe the ability of this model to capture the particular letter confusions of humans makes it a promising testbed for probing intermediate-level object recognition.

Meeting abstract presented at VSS 2012

