NOMA and MIMO are considered promising technologies for meeting the massive access demands and high data-rate requirements of 5G wireless networks. In this paper, the power allocation problem in a downlink MIMO-NOMA system is investigated, with the goal of maximizing energy efficiency while guaranteeing the quality of service of all users. To solve this non-convex and dynamic optimization problem, two deep reinforcement learning-based frameworks are proposed, referred to as the multi-agent DDPG-based and TD3-based power allocation frameworks. In both frameworks, each agent takes the current channel conditions as input and dynamically outputs the optimal power allocation policy for all users in its cluster via the DDPG or TD3 algorithm; an additional actor network is added to the conventional multi-agent model to adjust the power budgets allocated to the clusters and thereby improve overall system performance. Finally, both frameworks refine the entire power allocation policy by updating the weights of their neural networks according to feedback from the system. Simulation results show that the proposed multi-agent deep reinforcement learning-based power allocation frameworks significantly improve the energy efficiency of the MIMO-NOMA system under various transmit power limits and minimum data-rate requirements compared with other approaches, including MIMO-OMA.
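The two-level allocation structure described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the linear "actors" with random weights stand in for trained DDPG/TD3 actor networks, and the cluster count, users per cluster, and power budget `P_MAX` are illustrative assumptions. It only shows the shape of the policy: a top-level actor splits the total power across clusters, and one per-cluster agent splits each cluster's share across its users.

```python
import numpy as np

rng = np.random.default_rng(0)

N_CLUSTERS = 4        # one agent per cluster (illustrative)
USERS_PER_CLUSTER = 2
P_MAX = 10.0          # total transmit power budget (illustrative)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class LinearActor:
    """Stand-in for a trained DDPG/TD3 actor network: maps a channel-state
    vector to a softmax over power fractions (the real actors are deep nets
    updated from system feedback)."""
    def __init__(self, in_dim, out_dim):
        self.W = rng.normal(scale=0.1, size=(out_dim, in_dim))

    def act(self, state):
        return softmax(self.W @ state)

# Additional top-level actor: allocates the power budget across clusters.
cluster_actor = LinearActor(N_CLUSTERS * USERS_PER_CLUSTER, N_CLUSTERS)
# Per-cluster agents: allocate each cluster's share across its users.
agents = [LinearActor(USERS_PER_CLUSTER, USERS_PER_CLUSTER)
          for _ in range(N_CLUSTERS)]

# Current channel gains (random Rayleigh state, illustrative).
h = rng.rayleigh(size=(N_CLUSTERS, USERS_PER_CLUSTER))

cluster_power = P_MAX * cluster_actor.act(h.flatten())
power = np.stack([cluster_power[k] * agents[k].act(h[k])
                  for k in range(N_CLUSTERS)])

print(power.shape)  # per-user power matrix, one row per cluster
print(power.sum())  # softmax outputs make the total equal P_MAX
```

Because both levels output softmax fractions, the resulting per-user powers are non-negative and sum exactly to the budget, so the transmit power constraint holds by construction; in training, the network weights would be updated from the observed energy-efficiency reward.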