Abstract: This paper focuses on a heterogeneous scenario in which cellular and wireless local area technologies coexist and in which mobile devices have device-to-device communication capabilities. In this context, the paper assumes a network architecture in which a given user equipment (UE) can receive mobile service either by connecting directly to a cellular base station or by connecting through another UE that acts as an access point and relays the traffic from a cellular base station. The paper investigates how to optimize the connectivity of the different UEs with the aim of minimizing the total transmission power. An optimization framework is presented, and a distributed strategy based on Q-learning and softmax decision making is proposed as a means of solving the considered problem with reduced complexity. The proposed strategy is evaluated under different conditions and is shown to achieve a performance very close to the optimum. Moreover, significant transmission power reductions of approximately 40% are obtained with respect to the classical approach, in which all UEs are connected directly to the cellular infrastructure. For multi-cell scenarios, in which the optimum solution cannot be easily known a priori, the proposed approach is compared against a centralized genetic algorithm. The proposed approach achieves similar performance in terms of total transmission power while exhibiting much lower computational requirements.
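To make the combination of Q-learning and softmax decision making named above more concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes a single UE with a stateless Q-table holding one entry per connectivity option (for example, a direct cellular link or a relay through a neighbouring UE), a fixed transmission power per option, and a hypothetical reward equal to one minus the normalized power. The function names, the learning rate `alpha`, the `temperature`, and the example power values are all illustrative assumptions.

```python
import math
import random


def softmax_probs(q_values, temperature):
    """Boltzmann (softmax) selection probabilities over a list of Q-values."""
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def choose_action(q_values, temperature, rng):
    """Sample one connectivity option according to the softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def update_q(q_values, action, reward, alpha):
    """Stateless Q-learning update: Q(a) <- (1 - alpha) * Q(a) + alpha * r."""
    q_values[action] = (1 - alpha) * q_values[action] + alpha * reward


def run_ue(power_per_option, steps=5000, alpha=0.1, temperature=0.2, seed=0):
    """Let one UE learn which option needs the least transmission power.

    power_per_option is an assumed, fixed power cost for each option.
    """
    rng = random.Random(seed)
    q = [0.0] * len(power_per_option)
    p_max = max(power_per_option)
    for _ in range(steps):
        a = choose_action(q, temperature, rng)
        # Hypothetical reward: higher when the chosen option needs less power.
        reward = 1.0 - power_per_option[a] / p_max
        update_q(q, a, reward, alpha)
    return q


if __name__ == "__main__":
    # Illustrative values only: option 0 is a direct cellular link,
    # options 1 and 2 are relays through other UEs.
    q = run_ue([1.0, 0.4, 0.7])
    print("learned Q-values:", q)
    print("preferred option:", q.index(max(q)))
```

The temperature controls the balance between exploration and exploitation: a high value makes the choice nearly uniform, while a low value makes the UE pick the option with the highest Q-value almost deterministically. In a distributed setting each UE would run such a loop on its own, which is one plausible reading of how the computational load stays low compared with a centralized search.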