Millimeter-wave (mmWave) massive multiple-input multiple-output non-orthogonal multiple access (MIMO-NOMA) is proven to be a primary technique for sixth-generation (6G) wireless communication networks. However, the great increase in users and antennas brings challenges for interference suppression and resource allocation for mmWave massive MIMO-NOMA systems. This study proposes a spectrum-efficient and fast convergence deep reinforcement learning (DRL)-based resource allocation framework to optimize user grouping and allocation of subchannel and power. First, an enhanced K-means grouping algorithm is proposed to reduce the multi-user interference and accelerate the convergence. Then, a dueling deep Q-network (DQN) structure is proposed to perform subchannel allocation, which further improves the convergence speed. Moreover, a deep deterministic policy gradient (DDPG)-based power resource allocation algorithm is designed to avoid the performance loss caused by power quantization and improve the system’s achievable sum-rate. The simulation results demonstrate that our proposed scheme outperforms other neural network-based algorithms in terms of convergence performance, and can achieve higher system capacity compared with the greedy algorithm, the random algorithm, the RNN algorithm, and the DoubleDQN algorithm.