The flow of reward information through neuronal ensembles in the nucleus accumbens shell (NAcSh) and its influence on decision-making remain poorly understood. We investigated these questions by training rats in a self-guided probabilistic choice task while recording single-unit activity in the NAcSh. We found that rats dynamically adapted their choices based on an internal representation of reward likelihood. NAcSh neurons encoded multiple task variables, including choices, outcomes (reward/no reward), and licking behavior. These neurons also exhibited sequential activity patterns resembling waves that peaked and dissipated with outcome delivery, potentially reflecting a global brain wave passing through the NAcSh. Further analysis revealed distinct neuronal ensembles that processed different aspects of reward-guided behavior and were in turn organized into four functionally specialized meta-ensembles. A Markov random field graphical model revealed that NAcSh neurons form a small-world network with a heavy-tailed degree distribution, in which most neurons have few functional connections and rare hubs are highly connected. This network architecture allows for efficient and robust information transmission. Neuronal ensembles exhibited dynamic interactions that reorganized depending on reward outcomes. Reinforcement learning within the session led to neuronal ensemble merging and increased network synchronization during reward delivery compared to omission. These findings offer a novel perspective on the flow of pleasure through NAcSh neuronal ensembles, whose composition changes dynamically, with neurons dropping in and out, as the rat learns to obtain (energy) rewards in a changing environment. They also support the idea that NAcSh ensembles encode the outcome of actions to guide decision-making.
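
To make the small-world and hub description concrete, the following is a minimal sketch and not the authors' analysis pipeline. It assumes that the functional connections inferred by the graphical model have already been reduced to an undirected, unweighted adjacency structure. The toy graph and all function names (`degrees`, `average_clustering`, `mean_shortest_path`) are hypothetical and chosen only for illustration.

```python
from collections import deque


def degrees(adj):
    """Number of functional connections per neuron (node degree)."""
    return {node: len(neigh) for node, neigh in adj.items()}


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for node, neigh in adj.items():
        k = len(neigh)
        if k < 2:
            continue
        neigh_list = sorted(neigh)
        # Count edges that exist among the neighbors of this node.
        links = sum(
            1
            for i, u in enumerate(neigh_list)
            for v in neigh_list[i + 1:]
            if v in adj[u]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def mean_shortest_path(adj):
    """Mean breadth-first-search distance over all reachable ordered pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


# Hypothetical toy graph: neuron 0 is a hub, neurons 1-2 and 3-4 each
# form a triangle with it, and neuron 5 is connected to the hub only.
toy_edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2), (3, 4)]
toy_adj = {n: set() for n in range(6)}
for a, b in toy_edges:
    toy_adj[a].add(b)
    toy_adj[b].add(a)

print(degrees(toy_adj))
print(average_clustering(toy_adj))
print(mean_shortest_path(toy_adj))
```

In this kind of analysis, a network is usually called small-world when its clustering is much higher than that of a degree-matched random graph while its mean path length stays comparably short. A heavy-tailed degree distribution shows up as many low-degree nodes and a few high-degree hubs, as with neuron 0 in the toy graph.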