Mobile devices and sensors are now in widespread use, and new wireless and networking technologies, such as wireless sensor networks, device-to-device (D2D) communication, and vehicular ad hoc networks, are emerging rapidly. These networks are expected to deliver considerable increases in data rates, coverage, and the number of connected devices, together with significant reductions in latency and energy consumption. Because user devices and sensors have limited energy resources, wireless network resource allocation becomes much more challenging. More advanced techniques are therefore needed to balance energy consumption against network performance. In this paper, we propose to use reinforcement learning, an efficient simulation-based optimization framework, to tackle this problem so that user experience is maximized. Our main contribution is a novel non-cooperative, real-time approach based on deep reinforcement learning that addresses the energy-efficient power allocation problem while still satisfying the quality-of-service constraints in D2D communication.

INDEX TERMS Energy-efficient wireless communication, power allocation, D2D communication, multiagent reinforcement learning, deep reinforcement learning.