Structureless communications such as Device-to-Device (D2D) relaying are undeniably of paramount importance to improving the performance of today's mobile networks. Such a communication paradigm requires implementing a certain level of intelligence at the device level, allowing devices to interact with their environment and make appropriate decisions. However, decentralizing decision making may sometimes induce paradoxical outcomes and, consequently, a performance drop, which motivates the design of self-organizing yet efficient systems. Here, each device decides either to connect directly to the eNodeB or to gain access via another device through a D2D link. Given the set of active devices and the channel model, we derive the outage probability for both the cellular link and the D2D link, and compute the system throughput. We capture the device behavior using a biform game perspective. In the first part of this article, we analyze the pure and mixed Nash equilibria of the induced game, in which each device seeks to maximize its own throughput. Our framework allows us to analyze and predict the system's performance. The second part of this article is devoted to implementing two Reinforcement Learning (RL) algorithms that enable devices to self-organize and learn their equilibrium pure/mixed strategies in a fully distributed fashion. Simulation results show that offloading the network by means of D2D relaying improves per-device throughput. Moreover, we provide a detailed analysis of how the network parameters affect the global performance.