The temporal-difference (TD) algorithm from reinforcement learning provides a simple method for incrementally learning predictions of upcoming events. Applied to classical conditioning, TD models suppose that animals learn a real-time prediction of the unconditioned stimulus (US) on the basis of all available conditioned stimuli (CSs). In the TD model, as in other error-correction models, learning is driven by prediction errors: the difference between the change in US prediction and the actual US. With the TD model, however, learning occurs continuously from moment to moment and is not artificially constrained to occur in trials. Accordingly, a key feature of any TD model is its assumption about how a CS is represented on a moment-to-moment basis. Here, we evaluate the performance of the TD model on a heretofore unexplored range of classical conditioning tasks. To do so, we consider three stimulus representations that vary in their degree of temporal generalization and evaluate how the representation influences the performance of the TD model on these conditioning tasks.

Keywords: Associative learning · Classical conditioning · Timing · Reinforcement learning

Classical conditioning is the process of learning to predict the future. The temporal-difference (TD) algorithm is an incremental method for learning predictions about impending outcomes that has been used widely, under the label of reinforcement learning, for real-time learning in artificial intelligence and robotics (Sutton & Barto, 1998). In this article, we evaluate a computational model of classical conditioning based on this TD algorithm. As applied to classical conditioning, the TD model supposes that animals use the conditioned stimulus (CS) to predict in real time the upcoming unconditioned stimulus (US) (Sutton & Barto, 1990). The TD model of conditioning has become the leading explanation for conditioning in neuroscience, due to the correspondence between the phasic firing of dopamine neurons and the reward-prediction error that drives learning in the model (Schultz, Dayan, & Montague, 1997; for reviews, see Ludvig, Bellemare, & Pearson, 2011; Maia, 2009; Niv, 2009; Schultz, 2006).

The TD model can be viewed as an extension of the Rescorla-Wagner (RW) learning model (Rescorla & Wagner, 1972), with two additional twists. First, the TD model makes real-time predictions at each moment in a trial, thereby allowing the model to potentially deal with intratrial effects, such as the effects of stimulus timing on learning and the timing of responses within a trial. Second, the TD algorithm uses a slightly different learning rule, with important implications. As will be detailed below, at each time step, the TD algorithm compares the current prediction about future US occurrences with the US prediction generated on the previous time step. This temporal difference in US prediction is compared with any actual US received; if these two quantities differ, a prediction error is generated. This prediction error is then used to alter the associative ...
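To make the learning rule described above concrete, the following is a minimal sketch of a moment-by-moment TD update of the kind just outlined. It assumes a linear US prediction over a vector of stimulus features, together with a discount factor, a learning rate, and an eligibility-trace decay; the function name, variable names, and parameter values are our own illustrative choices, not notation taken from the model's published description.

```python
import numpy as np

def td_update(w, z, x_prev, x_curr, us, alpha=0.1, gamma=0.97, lam=0.9):
    """One time step of a TD-style update for a real-time conditioning model.

    w      -- weight vector (one associative strength per stimulus feature)
    z      -- eligibility-trace vector, same shape as w
    x_prev -- feature vector for the CSs at the previous time step
    x_curr -- feature vector for the CSs at the current time step
    us     -- intensity of any US received at the current time step (0 if none)

    alpha, gamma, and lam are illustrative values, not values from the article.
    """
    v_prev = np.dot(w, x_prev)   # US prediction made on the previous time step
    v_curr = np.dot(w, x_curr)   # US prediction at the current time step

    # Prediction error: the (discounted) change in US prediction plus the
    # actual US received, compared against the previous prediction.
    delta = us + gamma * v_curr - v_prev

    # Eligibility trace: a decaying memory of recently active features, so
    # that stimuli presented earlier can share credit for the prediction error.
    z = gamma * lam * z + x_prev

    # Error-correction step: adjust associative strengths for eligible features.
    w = w + alpha * delta * z
    return w, z
```

Iterating an update of this kind at every time step, within and between trials, is what allows learning to proceed continuously rather than trial by trial; the feature vectors x_prev and x_curr would be supplied by whichever moment-to-moment stimulus representation is adopted, which is precisely the choice examined in this article.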