Abstract
Reading is a highly complex process that involves the integration of visual sensory information, oculomotor control, and linguistic knowledge. Because the human brain acquires text information via eye movements, understanding eye movement patterns during reading can help elucidate how the brain searches for and integrates low-level sensory and high-level cognitive information. Using Mr. Chips, an ideal-observer computational model of reading, Legge et al. (1997) demonstrated that visual, oculomotor, and lexical information can yield optimal eye movement behavior during reading. Here we extend the Mr. Chips model to capture a key aspect of the reading process: language context. To this end, we incorporate a transformer-based language model, GPT-2, into the Mr. Chips framework. The transformer uses contextual information to estimate the likelihood of upcoming words given the preceding words. Like Mr. Chips, our model, Mr. Chips Jr., reads through text with the minimum number of saccades by minimizing entropy while optimally exploiting the statistical properties of the information available in the text. The entropy calculation combines the text information available through the retina, word alternatives drawn from a lexicon, predictions of the next word using the previously identified words as context, and knowledge of the oculomotor noise distribution. The model's and human observers' reading behaviors were tested and compared on the same text excerpts, taken from books at various reading levels and presented as either regular or scrambled text. We evaluated saccade strategies in terms of saccade length, saccade landing position, and the number of forward and backward saccades. The simulation results show that the model's eye movement patterns are comparable to those of human observers across the experimental conditions. Our findings suggest that this transformer-based computational model may serve as a valuable tool for studying human eye movements during natural reading in both normal and clinical populations.
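To make the entropy-minimization step concrete, the following is a minimal, illustrative sketch rather than the authors' implementation. It assumes a toy lexicon and a stand-in prior (LM_PRIOR) in place of GPT-2 next-word probabilities, scores candidate words against the letters visible through a simulated retinal span, and chooses the saccade length whose noisy landing positions minimize the expected residual entropy about the word's identity. All names and parameters (visual_likelihood, expected_entropy, span, noise_sd) are hypothetical.

```python
import math
import random

# Illustrative sketch only; toy values stand in for the model's real inputs.
LEXICON = ["read", "reed", "real", "ride", "rode"]           # toy lexicon
LM_PRIOR = {"read": 0.6, "reed": 0.05, "real": 0.2,          # stand-in for
            "ride": 0.1, "rode": 0.05}                       # GPT-2 P(word | context)

def visual_likelihood(word, visible_slots, true_word):
    """1 if the candidate matches the letters seen through the 'retina', else 0."""
    if len(word) != len(true_word):
        return 0.0
    return float(all(word[i] == true_word[i] for i in visible_slots))

def posterior(visible_slots, true_word):
    """Combine lexicon/visual evidence with the language-model prior."""
    scores = {w: LM_PRIOR[w] * visual_likelihood(w, visible_slots, true_word)
              for w in LEXICON}
    z = sum(scores.values()) or 1.0
    return {w: s / z for w, s in scores.items()}

def entropy(p):
    """Shannon entropy (bits) over word candidates."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def expected_entropy(saccade_len, true_word, span=2, noise_sd=1.0, n_samples=200):
    """Average residual entropy over noisy landing positions (Gaussian oculomotor noise)."""
    total = 0.0
    for _ in range(n_samples):
        landing = round(saccade_len + random.gauss(0, noise_sd))
        # Letters within +/- span of the landing position are visible.
        visible = [i for i in range(len(true_word)) if abs(i - landing) <= span]
        total += entropy(posterior(visible, true_word))
    return total / n_samples

if __name__ == "__main__":
    true_word = "read"
    best = min(range(len(true_word)),
               key=lambda s: expected_entropy(s, true_word))
    print("chosen saccade length:", best)
```

In the full model described above, GPT-2 probabilities conditioned on the previously recognized words would take the place of the fixed LM_PRIOR, and the lexicon, retinal span, and oculomotor noise parameters would match those used in the simulations.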