In day-to-day life, we often must choose between pursuing familiar behaviors and adjusting those behaviors when new strategies might be more fruitful. The dorsomedial striatum (DMS) is indispensable for arbitrating between old and new action strategies. To uncover the underlying molecular mechanisms, we trained mice to generate nose-poke responses for food, then uncoupled the predictive relationship between one action and its outcome. We then bred the mice that failed to rapidly modify their responding. This breeding produced offspring with the same tendencies, failing to inhibit behaviors that were not reinforced. These mice had less post-synaptic density protein 95 in the DMS. In addition, densities of the melanocortin-4 receptor (MC4R), a high-affinity receptor for α-melanocyte-stimulating hormone, predicted individuals’ response strategies. Specifically, high MC4R levels were associated with poor response inhibition. We next found that reducing Mc4r in the DMS of otherwise typical mice expedited response inhibition, allowing mice to modify behavior when rewards were unavailable or had lost value. This process required inputs from the orbitofrontal cortex, a brain region canonically associated with response strategy switching. Thus, MC4R in the DMS appears to propel reward-seeking behavior, even when it is not fruitful, while moderating MC4R presence increases the capacity of mice to inhibit such behaviors.