The massive scale of future wireless networks will create computational bottlenecks in performance optimization. In this paper, we study the problem of connecting mobile traffic to Cloud RAN (C-RAN) stations. To balance station load, we steer the traffic by designing device association rules. The baseline association rule connects each device to the station with the strongest signal, which does not account for interference or traffic hot spots, and leads to load imbalances and performance deterioration. Instead, we can formulate an optimization problem to decide centrally the best association rule at each time instance. However, in practice this optimization has such high dimensions, that even linear programming solvers fail to solve. To address the challenge of massive connectivity, we propose an approach based on the theory of optimal transport, which studies the economical transfer of probability between two distributions. Our proposed methodology can further inspire scalable algorithms for massive optimization problems in wireless networks.