Abstract
In reward-seeking multitasking behaviors, humans make decisions that maximize the collective reward across several ongoing behavioral goals. A complementary view is that such complex behaviors can be decomposed into multiple modules, each of which requires specific visual information. Deep neural networks have been shown to model human decisions accurately in multitasking gaming environments (Zhang et al., 2018). However, these models offer little explanation of why a particular decision is made. In contrast, non-deep models such as modular reinforcement learning (MRL; Rothkopf and Ballard, 2013) provide explicit, interpretable estimates of the task reward and discount rate for each behavioral goal.
For example, in the game Freeway, where players control a chicken crossing a busy highway, MRL assumes that each module (e.g., the vehicles) is associated with its own reward and discount rate. We develop a computer vision algorithm to extract the visual information relevant to human decision making, such as object positions and velocities. We feed the extracted information, together with human decision data (Zhang et al., 2019), into the MRL algorithm to estimate each module's reward and discount rate.
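As a rough sketch of the quantities being estimated (notation ours, following the general MRL formulation of Rothkopf and Ballard, 2013, rather than this paper's exact parameterization), each module m contributes a value driven by its own reward r_m and discount rate \gamma_m, and the agent acts on the summed value:

% Illustrative sketch only (our notation); assumes amsmath/amssymb.
% Each module m has its own reward r_m and discount rate \gamma_m;
% the decision is made from the sum of per-module values.
\[
  Q(s, a) \;=\; \sum_{m} Q_m(s_m, a),
  \qquad
  Q_m(s_m, a) \;=\; \mathbb{E}\Big[\sum_{t \ge 0} \gamma_m^{\,t}\, r_m \,\Big|\, s_m,\, a\Big].
\]

Fitting r_m and \gamma_m to the human decision data then yields the interpretable per-module quantities reported below.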
Our results show that the MRL model makes human-like decisions, achieving a game score of 32 (human average: 33), significantly better than previous non-deep heuristic-based and RL methods (0–22.5). Furthermore, the modeling results are interpretable. The estimated rewards indicate each module's importance relative to the others, and the discount rates determine how quickly a module's reward decays over temporal and spatial distance. We also find that separate discount rates for different spatial dimensions are necessary: a vehicle, for example, has a high discount rate only along its heading direction, so its reward affects only the space in front of or behind it (see the sketch below). We conclude that MRL could be a useful model of multitasking visuomotor behaviors.
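As a minimal illustration of such per-dimension discounting (the factorized form and the symbols \gamma_x, \gamma_y, \Delta x, \Delta y are our illustration, not the paper's exact parameterization), a module's reward could decay independently along each spatial axis:

% Illustrative only: anisotropic spatial discounting (our notation).
% A vehicle with \gamma_x near 1 along its heading direction and
% \gamma_y near 0 across it influences only positions ahead of or
% behind it, matching the effect described above.
\[
  r_{\mathrm{eff}}(\Delta x, \Delta y) \;=\; r_m\, \gamma_x^{\,|\Delta x|}\, \gamma_y^{\,|\Delta y|}.
\]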