The difficulty of stopping drinking, smoking, or gambling, even with a strong intention to do so, is widely recognized. The reasons for this difficulty, and whether any of them are common to substance and non-substance rewards, remain elusive. We present a computational model of potential common mechanisms underlying the difficulty in resisting habitual reward-obtaining behavior. Consider a person who has long and regularly taken, without hesitation, a series of actions leading to the purchase of alcohol, cigarettes, or a betting ticket. Drawing on the recently suggested representation of states by their successors in human reinforcement learning, as well as on dimension reduction in state representations in the brain, we assumed that this person has acquired a rigid representation of the states along the series of habitual actions, in which each state is represented by the discounted future occupancy of the final successor state, namely the rewarded goal state, under the established non-resistant policy. We then show that if the person adopts a different policy to resist the temptation of the habitual behavior, a negative reward prediction error (RPE) is generated when they make a “No-Go” decision, no RPE occurs upon a “Go” decision, and a large positive RPE is generated upon eventually reaching the goal, provided that the state representation acquired under the non-resistant policy is so rigid that it does not easily change. By contrast, when the states are represented in a punctate manner or by the discounted future occupancies of all states (i.e., by the genuine successor representation), negative and positive RPEs are generated upon “No-Go” and “Go” decisions, respectively, whereas little or no RPE occurs at the goal.
We suggest that these RPEs, especially the large positive RPE generated upon reaching the goal in the case of the goal-based reduced successor representation, might underlie the difficulty in ceasing undesired habitual or addictive behavior to obtain substance and non-substance rewards.

Author Summary

Many people try to stop drinking, smoking, or gambling, but fail. Why? In the case of drinking or smoking, alcohol or nicotine could enter the brain and affect its neural circuits. But such substance-based explanations obviously do not hold for gambling or video gaming. A conceivable explanation, common to substance and non-substance rewards, is that the behavior has become a habit so rigidly established that it cannot be changed. However, it has been shown that even people suffering from severe drug addiction can behave in a goal-directed, rather than habitual, manner, in the sense that they can exhibit intact sensitivity to changes in the value of action outcomes. Meanwhile, recent work suggests that humans may develop a subtler “habit”, in which a particular type of internal representation of states (situations), rather than the action itself, becomes rigidly formed. Here we show, through computational modeling, that if a similar but dimension-reduced state representation is formed, then when people try to resist long-standing reward-obtaining behavior but eventually fail and reach the rewarded goal, a large positive “prediction error” of reward arises. We discuss how this might underlie the difficulty in ceasing undesired habitual behavior.
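
The pattern of RPEs described above for the rigid goal-based reduced representation can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than taken from the text: a linear chain of ten states, a "No-Go" decision that keeps the person in the same state, a value estimate equal to a single weight `w` times a fixed feature, a temporal-difference update of `w` only, and the parameter values `GAMMA`, `ALPHA`, and `REWARD`. The function names `feature` and `run_episode` are hypothetical.

```python
GAMMA = 0.97   # time discount factor (assumed value)
ALPHA = 0.5    # learning rate for the weight w (assumed value)
REWARD = 1.0   # reward obtained at the goal state (assumed value)
N_STATES = 10  # states 1..10; the last one is the rewarded goal (assumed)


def feature(i):
    """Rigid feature of state i (1-based): discounted future occupancy
    of the goal state under the always-Go (non-resistant) policy."""
    return GAMMA ** (N_STATES - i)


def run_episode(w, nogo_per_state):
    """Walk from state 1 to the goal, choosing No-Go `nogo_per_state`
    times at every non-goal state before choosing Go.

    The feature is kept fixed (rigid); only the weight w is updated.
    Returns the updated weight and a list of (decision, state, rpe).
    """
    log = []
    for i in range(1, N_STATES):
        x = feature(i)
        for _ in range(nogo_per_state):
            # No-Go: stay in the same state, no reward.
            rpe = GAMMA * w * x - w * x
            w += ALPHA * rpe * x
            log.append(("No-Go", i, rpe))
        # Go: move to the next state, no reward yet.
        rpe = GAMMA * w * feature(i + 1) - w * x
        w += ALPHA * rpe * x
        log.append(("Go", i, rpe))
    # Goal: reward is delivered and the episode ends (value afterwards is 0).
    x_goal = feature(N_STATES)
    rpe = REWARD - w * x_goal
    w += ALPHA * rpe * x_goal
    log.append(("Goal", N_STATES, rpe))
    return w, log


if __name__ == "__main__":
    # w = REWARD stands for the weight reached after long practice
    # of the non-resistant policy.
    _, log = run_episode(REWARD, nogo_per_state=2)
    for decision, state, rpe in log:
        print(f"{decision:5s} at state {state:2d}: RPE = {rpe:+.4f}")
```

Under these assumptions, each "No-Go" decision yields a negative RPE and lowers `w`, each "Go" decision yields an RPE of zero up to floating-point rounding because successive features differ by exactly one discount step, and the goal yields a positive RPE because the lowered `w` now underestimates the reward.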