Battery-powered sensors spend a large share of their energy on packet transmissions. This has motivated power-control-based multiple access with collision avoidance (MACA) protocols, which lower the packet transmission power to conserve energy. However, lowering the transmission power makes packets more susceptible to collisions. To reduce these collisions while maintaining high energy efficiency, we propose a power control protocol that uses reinforcement learning to choose the optimal transmission power. The total reward depends on whether a collision occurs, the amount of transmission power used, how often DATA packets are retransmitted, and whether the interference range is updated. A key feature of the proposed protocol is that it enables sensors to prevent collisions without any prior knowledge of interference, thus eliminating the need for additional signaling. Simulation results show that, compared with benchmark protocols such as collision avoidance power control medium access control (MAC), two-level power control MAC, and MACA-based power control MAC, the proposed protocol improves network throughput while reducing network energy consumption and collisions per packet. The efficacy of the proposed protocol is demonstrated through results obtained under varying average traffic loads.

INDEX TERMS Collisions, interference, medium access control, power control, Q-learning, underwater acoustic sensor networks.
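
The abstract's reward-driven choice of transmission power can be illustrated with a minimal tabular Q-learning sketch in Python. This is not the protocol's actual formulation, which the abstract does not specify. The power levels, the reward weights, the sign of each reward term, the state encoding, and the names `POWER_LEVELS`, `RewardWeights`, `total_reward`, `choose_action`, and `q_update` are all assumptions made for illustration. Only the four reward factors and the use of Q-learning come from the text.

```python
import random
from dataclasses import dataclass

# Hypothetical discrete transmission power levels (arbitrary units); not from the source.
POWER_LEVELS = [1.0, 2.0, 4.0, 8.0]


@dataclass(frozen=True)
class RewardWeights:
    # Every weight below is an illustrative assumption.
    collision: float = 1.0
    power: float = 0.5
    retransmission: float = 0.2
    range_update: float = 0.1


def total_reward(collided, power, retransmissions, range_updated, w=RewardWeights()):
    """Combine the four factors named in the abstract into one scalar reward.

    Assumed form: +/- a fixed amount for success/collision, minus penalties
    for normalized power, retransmission count, and an interference-range update.
    """
    r = -w.collision if collided else w.collision
    r -= w.power * power / max(POWER_LEVELS)
    r -= w.retransmission * retransmissions
    if range_updated:
        r -= w.range_update
    return r


def choose_action(q_row, epsilon, rng=random):
    """Epsilon-greedy choice of a power-level index from one row of the Q-table."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update; returns the new Q-value."""
    target = r + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Usage: one learning step. The state here is a hypothetical retransmission count (0 or 1).
q_table = [[0.0] * len(POWER_LEVELS) for _ in range(2)]
state = 0
action = choose_action(q_table[state], epsilon=0.1)
r = total_reward(collided=False, power=POWER_LEVELS[action],
                 retransmissions=0, range_updated=False)
q_update(q_table, state, action, r, next_state=0)
```

In this sketch a sensor learns only from the outcome of its own transmissions, which mirrors the abstract's point that no prior knowledge of interference or additional signaling is needed. How the actual protocol defines states and weights the reward terms is not stated here.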