Sutton, Szepesvári and Maei (2009) recently introduced the first temporal-difference learning algorithm compatible with both linear function approximation and off-policy training, and whose complexity scales only linearly in the size of the function approximator. Although their gradient temporal difference (GTD) algorithm converges reliably, it can be very slow compared to conventional linear TD (on on-policy problems where TD is convergent), calling into question its practical utility. In this paper we introduce two new related algorithms with better convergence rates. The first algorithm, GTD2, is derived and proved convergent just as GTD was, but uses a different objective function and converges significantly faster (but still not as fast as conventional TD). The second new algorithm, linear TD with gradient correction, or TDC, uses the same update rule as conventional TD except for an additional term which is initially zero. In our experiments on small test problems and in a Computer Go application with a million features, the learning rate of this algorithm was comparable to that of conventional TD. This algorithm appears to extend linear TD to off-policy learning with no penalty in performance while only doubling computational requirements.
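The TDC description above (the conventional TD update plus a correction term that starts at zero) can be sketched for a single transition as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly stated form of the update with a secondary weight vector `w`. The function names, the step sizes `alpha` and `beta`, and the list-based feature vectors are hypothetical choices made for this sketch, and any off-policy weighting of transitions is omitted.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def tdc_update(theta, w, phi, phi_next, reward, gamma, alpha, beta):
    """Apply one TDC step to a single transition and return (theta, w).

    theta    : primary weights of the linear value estimate
    w        : secondary weights (assumed here to start at zero)
    phi      : feature vector of the current state
    phi_next : feature vector of the next state
    """
    # Conventional TD error for the linear value estimate.
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    # Scalar phi^T w; it is zero while w is zero, so the first step is plain TD.
    correction = dot(phi, w)
    new_theta = [
        t + alpha * (delta * p - gamma * correction * pn)
        for t, p, pn in zip(theta, phi, phi_next)
    ]
    # The secondary weights track the expected TD error given the features.
    new_w = [wi + beta * (delta - correction) * p for wi, p in zip(w, phi)]
    return new_theta, new_w


if __name__ == "__main__":
    theta, w = tdc_update(
        theta=[0.0, 0.0], w=[0.0, 0.0],
        phi=[1.0, 0.0], phi_next=[0.0, 1.0],
        reward=1.0, gamma=0.9, alpha=0.1, beta=0.05,
    )
    print(theta, w)
```

Each step costs a handful of inner products over the feature vector, roughly twice the work of conventional TD, which is in line with the linear complexity and doubled computational requirements mentioned in the abstract.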
Although the hippocampus plays a crucial role in the formation of spatial memories, as these memories mature they may become additionally (or even exclusively) dependent on extrahippocampal structures. However, the identity of these extrahippocampal structures that support remote spatial memory is currently not known. Using a Morris water-maze task, we show that the anterior cingulate cortex (ACC) plays a key role in the expression of remote spatial memories in mice. To first evaluate whether the ACC is activated after the recall of spatial memory, we examined the expression of the immediate early gene, c-fos, in the ACC. Fos expression was elevated after expression of a remote (1 month old), but not recent (1 d old), water-maze memory, suggesting that ACC plays an increasingly important role as a function of time. Consistent with the gene expression data, targeted pharmacological inactivation of the ACC with the sodium channel blocker lidocaine blocked expression of remote, but spared recent, spatial memory. In contrast, inactivation of the dorsal hippocampus disrupted expression of spatial memory, regardless of its age. We further showed that inactivation of the ACC blocked expression of remote spatial memory in two different mouse strains, after training with either a hidden or visible platform in a constant location, and using the AMPA receptor antagonist CNQX. Together, our data provide evidence that circuits supporting spatial memory are reorganized in a time-dependent manner, and establish that activity in neurons intrinsic to the ACC is critical for processing remote spatial memories.
The water maze is commonly used to assay spatial cognition, or, more generally, learning and memory in experimental rodent models. In the water maze, mice or rats are trained to navigate to a platform located below the water's surface. Spatial learning is then typically assessed in a probe test, where the platform is removed from the pool and the mouse or rat is allowed to search for it. Performance in the probe test may then be evaluated using either occupancy-based (percent time in a virtual quadrant [Q] or zone [Z] centered on former platform location), error-based (mean proximity to former platform location [P]) or counting-based (platform crossings [X]) measures. While these measures differ in their popularity, whether they differ in their ability to detect group differences is not known. To address this question we compiled five separate databases, containing more than 1600 mouse probe tests. Random selection of individual trials from respective databases then allowed us to simulate experiments with varying sample and effect sizes. Using this Monte Carlo-based method, we found that the P measure consistently outperformed the Q, Z and X measures in its ability to detect group differences. This was the case regardless of sample or effect size, and using both parametric and non-parametric statistical analyses. The relative superiority of P over other commonly used measures suggests that it is the most appropriate measure to employ in both low- and high-throughput water maze screens.
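The Monte Carlo procedure described above, drawing individual probe tests at random from two databases and asking how often a group difference is detected, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis code: the names `estimate_power` and `permutation_p_value` are hypothetical, trials are drawn without replacement, and a permutation test on the difference in means stands in for the parametric and non-parametric tests mentioned in the abstract. Each database is assumed to be a plain list of scores for one measure (for example, mean proximity values).

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_perm, rng):
    """Two-sided permutation p-value for a difference in group means."""
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


def estimate_power(control_db, treated_db, n_per_group,
                   n_sim=200, n_perm=200, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which the groups differ at level alpha."""
    rng = random.Random(seed)
    detections = 0
    for _ in range(n_sim):
        # One simulated experiment: sample trials from each database.
        a = rng.sample(control_db, n_per_group)
        b = rng.sample(treated_db, n_per_group)
        if permutation_p_value(a, b, n_perm, rng) < alpha:
            detections += 1
    return detections / n_sim


if __name__ == "__main__":
    # Synthetic placeholder scores, not data from the study.
    gen = random.Random(1)
    control = [gen.gauss(50.0, 10.0) for _ in range(300)]
    treated = [gen.gauss(42.0, 10.0) for _ in range(300)]
    for n in (5, 10, 20):
        print(n, estimate_power(control, treated, n, n_sim=100))
```

Running the same loop once per measure (Q, Z, P and X), with identical sample sizes and source databases, would give detection rates that can be compared directly across measures.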