Abstract
Visual knowledge is an inseparable part of working memory (WM) that affects how information is encoded (Baddeley & Hitch, 1974; Baddeley, 2000; Ericsson & Kintsch, 1995). For example, people can remember more familiar items than unfamiliar ones (Zimmer & Fischer, 2020), and activated long-term memory is known to be a key component of WM (Cowan, 2017; Oberauer, 2009). Yet no computational model has explained the mechanism linking visual knowledge to WM. In this study, we used a generative deep learning network to build a neurally plausible computational model of WM that we call TLDR. Visual knowledge in TLDR is represented by a modified Variational Autoencoder (VAE; Kingma & Welling, 2013) that compresses visual information across multiple layers. TLDR encodes visual information by flexibly allocating neural resources to create an actively stored representation in a binding pool (BP; Swan & Wyble, 2014). The information stored in the BP is then retrieved to reconstruct the latent representations that were generated when the stimuli were perceived. Consistent with human behavior in memory tasks, TLDR can explain the following aspects of WM: efficient storage of familiar shapes with the aid of visual knowledge; storage of novel configurations (Lake et al., 2011); encoding of relevant attributes (e.g., color, shape) with varying degrees of precision (Swan, Collins, & Wyble, 2016); storage of categorical information alongside visual details with minimal interference; interference when multiple items or attributes are encoded in a single memory trace; and rapid tuning of encoding parameters to accomplish unexpected memory tasks. Overall, TLDR provides new insight into the representations of WM in relation to visual knowledge.
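
For readers unfamiliar with the binding-pool mechanism referenced above, the following is a minimal sketch of token-based superposition storage and retrieval in the spirit of Swan & Wyble (2014); it is not the paper's implementation. The dimensions, the random-projection binding scheme, and the names make_token, encode, and retrieve are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8    # dimensionality of one VAE latent vector (assumed)
POOL_SIZE = 1024  # number of shared binding-pool units (assumed)

def make_token():
    # Each memory token is a fixed random binding matrix into the shared pool.
    return rng.standard_normal((POOL_SIZE, LATENT_DIM)) / np.sqrt(POOL_SIZE)

def encode(pool, token, latent):
    # Superimpose the token-bound latent onto the shared pool (additive storage).
    return pool + token @ latent

def retrieve(pool, token):
    # Project the pool back through the token's binding weights;
    # random tokens are nearly orthogonal, so retrieval is approximate.
    return token.T @ pool

pool = np.zeros(POOL_SIZE)
tok_a, tok_b = make_token(), make_token()
z_a = rng.standard_normal(LATENT_DIM)  # latent code for item A (e.g., from a VAE)
z_b = rng.standard_normal(LATENT_DIM)  # latent code for item B

pool = encode(pool, tok_a, z_a)
pool = encode(pool, tok_b, z_b)

# Retrieval noise grows with the number of stored items, mirroring the
# interference effects described in the abstract.
print(np.corrcoef(retrieve(pool, tok_a), z_a)[0, 1])
```

Because all items share one pool of units, adding more items (or more attributes per item) degrades each reconstruction, which is the superposition account of the interference effects the abstract lists.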