We consider a cellular network in which mobile transceiver devices owned by self-interested users are incentivized to cooperate with each other using tokens, which they exchange electronically to "buy" and "sell" downlink relay services. This cooperation increases the network's capacity compared to a network that supports only base station-to-device (B2D) communications. We investigate how an individual device in the network can learn online the optimal cooperation policy that it uses to decide whether to provide downlink relay services for other devices in exchange for tokens. We propose a supervised learning algorithm that devices can deploy to learn their optimal cooperation strategies online given the network environment they experience. We then systematically evaluate the learning algorithm in various deployment scenarios. Our simulation results suggest that devices have the greatest incentive to cooperate when the network contains (i) many devices with high energy budgets for relaying, (ii) many highly mobile users (e.g., users in motor vehicles), and (iii) neither too few nor too many tokens. Additionally, within the token system, self-interested devices can effectively learn to cooperate online and achieve up to 20% higher throughput on average than with B2D communications alone, all while selfishly maximizing their own utilities.