a b s t r a c tAlready in the 1930s Skinner, Konorskiand colleagues debated the commonalities, differences and interactions among the processes underlying what was then known as "conditioned reflexes type I and II", but which is today more well-known as classical (Pavlovian) and operant (instrumental) conditioning. Subsequent decades of research have confirmed that the interactions between the various learning systems engaged during operant conditioning are complex and difficult to disentangle. Today, modern neurobiological tools allow us to dissect the biological processes underlying operant conditioning and study their interactions. These processes include initiating spontaneous behavioral variability, world-learning and self-learning. The data suggest that behavioral variability is generated actively by the brain, rather than as a by-product of a complex, noisy input-output system. The function of this variability, in part, is to detect how the environment responds to such actions. World-learning denotes the biological process by which value is assigned to environmental stimuli. Self-learning is the biological process which assigns value to a specific action or movement. In an operant learning situation using visual stimuli for flies, world-learning inhibits self-learning via a prominent neuropil region, the mushroom-bodies. Only extended training can overcome this inhibition and lead to habit formation by engaging the self-learning mechanism. Self-learning transforms spontaneous, flexible actions into stereotyped, habitual responses.