Abstract-The smart grid communication network adopts a hierarchical structure consisting of three kinds of networks: Home Area Networks (HANs), Neighborhood Area Networks (NANs), and Wide Area Networks (WANs). The smart grid NANs comprise the communication infrastructure used to manage electricity distribution to end users. Cellular technology based on LTE standards is widely used and forward-looking, which makes it a promising candidate for meeting the requirements of different NAN applications. However, LTE is limited in its ability to cope with the data traffic characteristics of smart grid applications and therefore requires enhancement. Device-to-Device (D2D) communications enable direct data transmission between devices by reusing cellular resources, which can improve LTE performance. Delay is one of the important communication requirements for real-time smart grid applications. In this paper, the application of D2D communications to smart grid NANs is investigated to reduce the average end-to-end delay of the system. A relay selection algorithm that considers both the queue state and the channel state of nodes is proposed. The optimization problem is formulated as a constrained Markov decision process (CMDP), and a linear programming method is used to find the optimal policy for the CMDP problem. Simulation results are presented to demonstrate the effectiveness of the proposed scheme.
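
For orientation, the linear-programming route to a CMDP policy can be sketched in its generic textbook form, using occupancy measures for an average-cost, unichain model. This is not necessarily the exact formulation used in this paper, and all symbols below are assumptions introduced for illustration: $s$ is a system state (for example, joint queue and channel state), $a$ is an action (for example, a relay choice), $P(s' \mid s,a)$ is the transition probability, $c(s,a)$ is the per-step delay cost, $d(s,a)$ is a per-step constraint cost, $D_{\max}$ is its bound, and $\rho(s,a)$ is the stationary state-action occupancy measure.

```latex
\begin{align}
\min_{\rho}\quad & \sum_{s}\sum_{a} \rho(s,a)\, c(s,a) \\
\text{s.t.}\quad
& \sum_{a} \rho(s',a) \;=\; \sum_{s}\sum_{a} P(s' \mid s,a)\, \rho(s,a)
  && \forall\, s', \\
& \sum_{s}\sum_{a} \rho(s,a) \;=\; 1, \\
& \sum_{s}\sum_{a} \rho(s,a)\, d(s,a) \;\le\; D_{\max}, \\
& \rho(s,a) \;\ge\; 0 && \forall\, s,a .
\end{align}
% A (possibly randomized) stationary policy is recovered from an optimal \rho^{*}:
\begin{equation}
\pi^{*}(a \mid s) \;=\; \frac{\rho^{*}(s,a)}{\sum_{a'} \rho^{*}(s,a')}
\qquad \text{for every } s \text{ with } \sum_{a'} \rho^{*}(s,a') > 0 .
\end{equation}
```

In this generic form the objective and the constraint are both linear in $\rho$, which is why a standard LP solver suffices; the resulting optimal policy may randomize between actions in some states when the constraint is active.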