Visual knowledge obtained from our lifelong experience of the world plays a critical role in our ability to build short-term memories. We propose a mechanistic explanation of how working memory (WM) representations are built from the latent representations of visual knowledge and can then be reconstructed. The proposed model, Memory for Latent Representations (MLR), features a variational autoencoder with an architecture that corresponds broadly to the human visual system and an activation-based binding pool of neurons that binds items' attributes to tokenized representations. The simulation results revealed that the shapes of familiar items can be encoded and retrieved efficiently from latents in higher levels of the visual hierarchy. Novel patterns that fall completely outside the training set, in contrast, can be stored from a single exposure using only latents from early layers of the visual system. Moreover, a given stimulus in WM can have multiple codes, representing specific visual features such as shape or color, in addition to categorical information. Finally, we validated the model by testing a series of predictions against behavioral results obtained from WM tasks. The model provides a compelling demonstration of how visual knowledge yields compact visual representations for efficient memory encoding.
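To make the binding-pool mechanism concrete, the sketch below illustrates one way an item's latent vector could be superimposed onto a shared pool of neurons and later read back through a token-specific gating mask. The dimensions, the fixed random projection, and the encode/retrieve functions are illustrative assumptions for exposition, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 256   # size of the latent vector taken from one layer of the autoencoder
POOL_SIZE = 2500   # number of binding-pool neurons shared across items
N_TOKENS = 4       # maximum number of items (tokens) held at once

# Each token "owns" a random subset of the pool (a binary gating mask).
token_masks = rng.random((N_TOKENS, POOL_SIZE)) < 0.3

# Fixed random weights connecting the latent layer to the binding pool.
W = rng.standard_normal((POOL_SIZE, LATENT_DIM)) / np.sqrt(LATENT_DIM)

def encode(latent, token, pool):
    """Superimpose one item's latent vector onto the shared pool,
    gated by that item's token mask (activation-based storage)."""
    return pool + token_masks[token] * (W @ latent)

def retrieve(token, pool):
    """Read the pool back through the transposed weights, using only the
    neurons gated by the token; a decoder could then reconstruct the item
    from this noisy latent estimate."""
    gated = token_masks[token] * pool
    return W.T @ gated / token_masks[token].sum()

pool = np.zeros(POOL_SIZE)
z1, z2 = rng.standard_normal(LATENT_DIM), rng.standard_normal(LATENT_DIM)
pool = encode(z1, token=0, pool=pool)
pool = encode(z2, token=1, pool=pool)

z1_hat = retrieve(token=0, pool=pool)
print("correlation with stored item:", np.corrcoef(z1, z1_hat)[0, 1])
```

In this toy version, the readback becomes noisier as more items are superimposed on the same pool, which mirrors the qualitative set-size costs that binding-pool accounts aim to capture.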
To what extent does specific spatiotopic location accompany the remembered representation of a visual event? Feature integration theory suggests that identifying a multi-feature object requires focusing on its spatial location to integrate those features. Moreover, single-unit data from anterior ventral stream neurons that fire preferentially to complex objects indicate that these neurons have retinotopic receptive fields. It can therefore be predicted that the locations of a complex stimulus's features are inherent in the remembered representation of that stimulus. To evaluate this prediction, we presented participants with a brief array of characters and instructed them to identify and locate the solitary letter among a set of digits. Surprisingly, analysis of trials in which the target identity was accurately reported indicated that in more than 15% of trials (in Experiments 2b and 2c) participants were almost completely uninformed about the location of the letter they had just identified. Further analysis showed two main sources of these location errors: misbinding the target to a distractor's location, and spatial representations of the target's location so poor as to be indistinguishable from guessing. The latter finding indicates that consciously accessible representations of visual events can form despite being untethered to robust, spatially specific representations, implying that the specific location was either never encoded into working memory or was rapidly forgotten. However, when the target was marked by a single feature (color), there was no evidence of participants remembering the target identity without also remembering its location, even under strong masking.
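The decomposition of location errors into misbinding and guessing is typically estimated with a mixture model over report errors. The sketch below shows one such analysis on simulated data: angular location reports are attributed to a target-centered component, a distractor-centered (swap) component, or a uniform guessing component. The von Mises parameterization, the simulated trial structure, and all variable names are assumptions for illustration; this is not the authors' analysis code.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import i0  # Bessel function for the von Mises normalizer

def vonmises_pdf(x, mu, kappa):
    """Von Mises density over angular location-report errors (radians)."""
    return np.exp(kappa * np.cos(x - mu)) / (2 * np.pi * i0(kappa))

def neg_log_lik(params, report, target, distractors):
    """Mixture of three response sources: correct localization of the target,
    misbinding to one of the distractors' locations, or uniform guessing."""
    kappa = np.exp(np.clip(params[0], -5, 5))          # localization precision
    logits = np.array([params[1], params[2], 0.0])
    p_t, p_s, p_g = np.exp(logits) / np.exp(logits).sum()
    lik_t = vonmises_pdf(report - target, 0.0, kappa)
    lik_s = vonmises_pdf(report[:, None] - distractors, 0.0, kappa).mean(axis=1)
    lik_g = np.full_like(report, 1 / (2 * np.pi))
    return -np.sum(np.log(p_t * lik_t + p_s * lik_s + p_g * lik_g + 1e-12))

# Simulated data standing in for one participant's location reports (radians).
rng = np.random.default_rng(1)
n = 300
target = rng.uniform(-np.pi, np.pi, n)
distractors = rng.uniform(-np.pi, np.pi, (n, 5))
source = rng.choice(3, n, p=[0.7, 0.15, 0.15])          # true mixture for the simulation
report = np.where(source == 0, target,
                  np.where(source == 1, distractors[:, 0],
                           rng.uniform(-np.pi, np.pi, n)))
report = report + rng.vonmises(0.0, 8.0, n) * (source != 2)  # localization noise

fit = minimize(neg_log_lik, x0=[1.0, 0.0, 0.0],
               args=(report, target, distractors), method="Nelder-Mead")
p_hat = np.exp([fit.x[1], fit.x[2], 0.0]) / np.exp([fit.x[1], fit.x[2], 0.0]).sum()
print("estimated [target, swap, guess] proportions:", p_hat)
```

The guessing component is what distinguishes "identified but unlocalized" trials from ordinary swap errors: a large estimated guess proportion on correct-identification trials indicates reports untethered to any studied location.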