Chris R. Sims, Brett R. Fajen; A reinforcement learning model of visually guided braking. Journal of Vision 2007;7(9):151. doi: 10.1167/7.9.151.
© ARVO (1962-2015); The Authors (2016-present)
Models of continuously controlled visually guided action consist of laws of control that describe how informational variables map onto action variables. These models suffer from at least three problems. First, they are far too rigid to capture the flexibility that humans exhibit when adapting to changes in the environment, the dynamics of the controlled system, and the costs associated with making different kinds of errors. Second, existing models tend to ignore the inherent limitations of human perceptual and motor systems. Third, there is no compelling account of how laws of control are learned through experience. Reinforcement learning (RL) provides a potentially powerful framework for developing models of visually guided action that address these weaknesses. We developed an RL model of visually guided braking that simulates how an agent might learn a behavioral policy that maximizes performance, defined as stopping within a small radius of a target. While RL is widely used to learn optimal behavior in discrete tasks, visually guided action poses a significant obstacle: its state and action spaces are continuous. Our model represents continuous perceptual input (distance-to-target and velocity) and motor output (brake pressure) using tile coding for function approximation. This representation enables the model to achieve near-optimal task performance while greatly speeding learning, which uses the Q-learning update rule. Further, our RL model is designed to explore biologically realistic limitations on performance (e.g., perceptual noise, stimulus discriminability thresholds, and motor variability), as well as variations in reward structure. In contrast to the potentially arbitrary constraints of control law models, reinforcement learning optimally adapts behavior only to the constraints of the model's physical embodiment and the reward structure of the task.
The model will be evaluated by comparing simulated data with data from experiments with human subjects performing a simulated braking task.
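The approach described above, tile coding over distance-to-target and velocity combined with the Q-learning update rule, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every constant (state ranges, maximum deceleration, time step, stopping radius, tiling sizes, learning parameters) and every function name is a hypothetical choice made for illustration, and brake pressure is discretized into five levels as a simplification, whereas the abstract states that the model also represents motor output with tile coding. Perceptual noise and motor variability are omitted.

```python
import random

# All constants below are illustrative assumptions, not values from the abstract.
D_MAX, V_MAX = 60.0, 20.0          # state ranges: distance (m), velocity (m/s)
MAX_DECEL, DT = 10.0, 0.1          # deceleration at full brake (m/s^2), time step (s)
STOP_RADIUS = 2.0                  # "small radius" around the target (m)
ACTIONS = [0.0, 0.25, 0.5, 0.75, 1.0]  # discretized brake pressure (simplification)
N_TILINGS, TILES = 8, 8
N_FEATURES = N_TILINGS * (TILES + 1) ** 2


def active_tiles(d, v):
    """Return one active tile index per tiling for the state (d, v)."""
    d = min(max(d, 0.0), D_MAX)
    v = min(max(v, 0.0), V_MAX)
    tiles = []
    for t in range(N_TILINGS):
        off = t / N_TILINGS  # each tiling is shifted by a fraction of a tile width
        i = min(int(d / D_MAX * TILES + off), TILES)
        j = min(int(v / V_MAX * TILES + off), TILES)
        tiles.append((t * (TILES + 1) + i) * (TILES + 1) + j)
    return tiles


def q_value(w, tiles, a):
    """Q(s, a) is the sum of the weights of the active tiles for action a."""
    return sum(w[a][k] for k in tiles)


def q_update(w, tiles, a, target, alpha):
    """Move Q(s, a) toward the target; the step size is split across tilings."""
    delta = target - q_value(w, tiles, a)
    for k in tiles:
        w[a][k] += alpha / N_TILINGS * delta


def step(d, v, brake):
    """Advance the assumed braking dynamics; return (d, v, reward, done)."""
    v_new = max(0.0, v - brake * MAX_DECEL * DT)
    d_new = d - v * DT
    if d_new <= 0.0:                       # reached or passed the target while moving
        return d_new, v_new, -1.0, True
    if v_new == 0.0:                       # stopped short of the target
        return d_new, v_new, (1.0 if d_new <= STOP_RADIUS else 0.0), True
    return d_new, v_new, 0.0, False


def train(episodes=2000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning over tile-coded states."""
    rng = random.Random(seed)
    w = [[0.0] * N_FEATURES for _ in ACTIONS]
    n_actions = len(ACTIONS)
    for _ in range(episodes):
        d, v = rng.uniform(30.0, D_MAX), rng.uniform(5.0, V_MAX)
        done = False
        while not done:
            tiles = active_tiles(d, v)
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda k: q_value(w, tiles, k))
            d2, v2, r, done = step(d, v, ACTIONS[a])
            target = r
            if not done:  # bootstrap from the best action in the next state
                nxt = active_tiles(d2, v2)
                target += gamma * max(q_value(w, nxt, k) for k in range(n_actions))
            q_update(w, tiles, a, target, alpha)
            d, v = d2, v2
    return w
```

Calling `train()` returns one weight vector per brake level. Because each state activates one tile in each of the offset tilings, an update at one state generalizes to nearby distances and velocities, which is the usual reason tile coding speeds learning relative to a fine lookup table. Changing the rewards returned by `step` is one way to explore the variations in reward structure the abstract mentions.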