In the past decade, energy resources have become significantly more distributed worldwide. The growing number of prosumers creates the prospect of a more decentralized and accessible energy market, in which the peer-to-peer energy trading paradigm emerges. This paper proposes a methodology to optimize participation in peer-to-peer markets based on the double-auction trading mechanism. The methodology relies on two reinforcement learning algorithms, used separately, to optimize the amount of energy to be transacted and the price to pay or charge for the purchase or sale of energy. It takes a competitive approach: each agent seeks the best result for itself, which means minimizing the cost of purchased energy for buyers and maximizing profit for sellers. The methodology was integrated into an agent-based ecosystem with a direct connection to the agents, which makes application to real contexts more efficient. To test the methodology, a case study was carried out in an energy community of 50 players, in which each of the two proposed models was used by 20 different players and the remaining 10 were left without training. Over the course of a week, the trained players saved 44.65 EUR compared with a week of peer-to-peer trading without training, while the players left without training saw their costs increase by 17.07 EUR.
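To make the trading mechanism concrete, the following is a minimal sketch of how one round of a double auction could be cleared. The abstract does not specify the clearing rule, so the greedy matching of highest bids to lowest asks and the midpoint pricing used here are assumptions, as are the names `Order` and `clear_double_auction`. In the proposed methodology, the price and quantity of each order would be the values chosen by an agent's reinforcement learning model.

```python
from dataclasses import dataclass


@dataclass
class Order:
    agent: str       # hypothetical agent identifier
    price: float     # EUR/kWh offered (bid) or requested (ask)
    quantity: float  # kWh to buy or sell


def clear_double_auction(bids, asks):
    """Match bids and asks; return a list of (buyer, seller, kWh, EUR/kWh).

    Assumed rule: the highest bids are matched with the lowest asks while
    the bid price is at least the ask price, and each trade is settled at
    the midpoint of the two prices.
    """
    bids = sorted(bids, key=lambda o: o.price, reverse=True)
    asks = sorted(asks, key=lambda o: o.price)
    bid_left = [o.quantity for o in bids]  # unfilled energy per bid
    ask_left = [o.quantity for o in asks]  # unsold energy per ask
    trades = []
    i = j = 0
    while i < len(bids) and j < len(asks) and bids[i].price >= asks[j].price:
        qty = min(bid_left[i], ask_left[j])
        if qty > 0:
            price = (bids[i].price + asks[j].price) / 2
            trades.append((bids[i].agent, asks[j].agent, qty, price))
        bid_left[i] -= qty
        ask_left[j] -= qty
        # Move on once an order is fully filled (tolerance for float error).
        if bid_left[i] <= 1e-9:
            i += 1
        if ask_left[j] <= 1e-9:
            j += 1
    return trades


# Illustrative orders only; these numbers are not taken from the case study.
bids = [Order("buyer_1", 0.20, 3.0), Order("buyer_2", 0.12, 2.0)]
asks = [Order("seller_1", 0.10, 2.0), Order("seller_2", 0.15, 4.0)]
for buyer, seller, qty, price in clear_double_auction(bids, asks):
    print(f"{buyer} buys {qty:.1f} kWh from {seller} at {price:.3f} EUR/kWh")
```

Under this assumed rule, a buyer who bids too low or a seller who asks too high is left unmatched, which is the kind of trade-off a competitive agent has to learn when setting its price and quantity.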