We present the issue of monetarily incentivized forwarding in a multi-hop mesh network architecture from an economic perspective. It is anticipated that credit-incentivized forwarding and relaying will be a simple method of exchanging transmission power and spectrum for connectivity. However, gateways and forwarding nodes, like any other free market, may create an oligopolistic market for the users they serve. In this study, a coalition scheme between buyers aims to address price control by gateways or nodes closer to gateways. In a Stackelberg competition game, buyer agents (users) and sellers (gateways) make decisions using reinforcement learning (RL), with decentralized Deep Q-Networks to buy and sell forwarding resources. We allow communication links between the buyers with a limited messaging space, without defining a collusion mechanism. The idea is to demonstrate that through messaging, and RL tacit collusion can emerge between agents in a decentralized setup. The multi-agent reinforcement learning (MARL) system is presented and analyzed from a machine-learning perspective. Moreover, MARL dynamics are discussed via mean field analysis to better understand divergence causes and make implementation recommendations for such systems. Finally, the simulation results show the results of coordination among the users.