“…An emerging framework describes this phenomenon as a simulation-driven estimation process, examining what might result from each available action by consulting memories of similar previous settings. This approach, generally referred to as memory sampling (Bordalo et al., 2020; Gershman & Daw, 2017; Kuwabara & Pillemer, 2010; Lengyel & Dayan, 2008; Lieder et al., 2018; Ritter et al., 2018; Shadlen & Shohamy, 2016; Zhao et al., 2019), can approximate the sorts of option value estimates that would be learned across repeated experience by, e.g., temporal-difference reinforcement learning (TDRL; Gershman & Daw, 2017; Lengyel & Dayan, 2008), while retaining the flexibility to diverge from long-run averages when doing so may be adaptive. At one extreme, drawing on individual memories in this way allows one to effectively tackle choice problems even in the low-data limit (e.g., in novel environments), where processes that rely on abstraction over multiple experiences are unreliable (Lengyel & Dayan, 2008).…”
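As a rough illustrative sketch of this idea (not a reimplementation of any model in the cited work), the snippet below estimates an action's value by sampling individual episodic memories in proportion to their similarity to the current situation and averaging the sampled rewards; with many samples this approaches the similarity-weighted mean that incremental learning such as TDRL would converge to, yet it remains usable after only a handful of experiences. All names here (memory_sampling_value, the memory record fields, the Gaussian similarity kernel) are hypothetical choices for illustration.

import numpy as np

rng = np.random.default_rng(0)

def memory_sampling_value(memories, situation, action, n_samples=10):
    # Estimate an action's value by sampling individual memories of
    # similar past situations, rather than reading out a cached
    # long-run average.
    candidates = [m for m in memories if m["action"] == action]
    if not candidates:
        return 0.0  # no relevant experience: fall back to a neutral prior
    # Similarity between the current situation and each stored one
    # (a simple Gaussian kernel over squared feature distance).
    sims = np.array([
        np.exp(-np.sum((situation - m["situation"]) ** 2))
        for m in candidates
    ])
    probs = sims / sims.sum()
    # Sample memory indices in proportion to similarity and average
    # the rewards experienced in those sampled episodes.
    idx = rng.choice(len(candidates), size=n_samples, p=probs)
    return float(np.mean([candidates[i]["reward"] for i in idx]))

# Example: three stored episodes, then a value estimate for action "A"
# in a situation resembling the first two memories.
memories = [
    {"situation": np.array([0.0, 1.0]), "action": "A", "reward": 1.0},
    {"situation": np.array([0.1, 0.9]), "action": "A", "reward": 0.0},
    {"situation": np.array([5.0, 5.0]), "action": "B", "reward": 1.0},
]
print(memory_sampling_value(memories, np.array([0.0, 1.0]), "A"))

Because each estimate is built from a small sample of individual episodes rather than a learned running average, the same mechanism that mimics TDRL in data-rich settings can also produce sensible (if noisy) estimates in the low-data limit the passage describes.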