In this paper, we investigate performance enhancement of multiple-input multiple-output non-orthogonal multiple access (MIMO-NOMA) wireless communication systems using an artificial intelligence (AI) approach based on Q-learning, a reinforcement learning technique. The primary challenge addressed is power allocation in a MIMO-NOMA system, which is difficult because the underlying optimization problem is non-convex. Our proposed Q-learning approach adaptively adjusts the power allocation between near and far users, balancing conflicting performance metrics and improving overall system performance. Compared with traditional power allocation strategies, our approach shows superior performance on three principal metrics: spectral efficiency, achievable sum rate, and energy efficiency. Specifically, it achieves an increase of approximately 140% in achievable sum rate and approximately 93% in energy efficiency at a transmit power of 20 dB, and an improvement of approximately 88.6% in spectral efficiency at a transmit power of 30 dB. These results underscore the potential of reinforcement learning techniques, particularly Q-learning, as practical solutions to complex optimization problems in wireless communication systems. Future research may incorporate enhanced channel simulations and network constraints into the machine learning framework to assess the feasibility and resilience of such intelligent approaches.
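
To make the idea of Q-learning-based power allocation concrete, the following is a minimal illustrative sketch and not the implementation evaluated in this paper. It assumes a simplified two-user downlink with scalar effective channel gains (standing in for the MIMO channel after beamforming), unit noise power, a discretized far-user power share as the state, and a hypothetical reward equal to the sum rate with a penalty when the far user falls below a minimum rate. All function names, parameters, and numeric values are assumptions chosen for illustration.

```python
import math
import random


def noma_rates(alpha_far, p_total, g_near, g_far, noise=1.0):
    """Two-user downlink NOMA rates in bits/s/Hz (assumed scalar gains)."""
    p_far = alpha_far * p_total
    p_near = (1.0 - alpha_far) * p_total
    # Far user decodes its own signal, treating the near user's signal as noise.
    r_far = math.log2(1.0 + p_far * g_far / (p_near * g_far + noise))
    # Near user cancels the far user's signal (SIC), then decodes its own.
    r_near = math.log2(1.0 + p_near * g_near / noise)
    return r_near, r_far


def q_learn_power_split(p_total, g_near, g_far, r_far_min=1.0,
                        steps=20000, lr=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning over a discretized far-user power share."""
    rng = random.Random(seed)
    alphas = [i / 20 for i in range(11, 20)]   # candidate shares 0.55 .. 0.95
    moves = (-1, 0, 1)                         # actions: lower, keep, raise the share
    q = [[0.0] * len(moves) for _ in alphas]   # q[state][action]

    def reward(idx):
        r_near, r_far = noma_rates(alphas[idx], p_total, g_near, g_far)
        # Hypothetical reward: sum rate, penalized if the far user is starved.
        return r_near + r_far if r_far >= r_far_min else -1.0

    state = rng.randrange(len(alphas))
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < eps:
            a = rng.randrange(len(moves))
        else:
            a = max(range(len(moves)), key=lambda k: q[state][k])
        nxt = min(max(state + moves[a], 0), len(alphas) - 1)
        # Standard Q-learning (off-policy temporal-difference) update.
        target = reward(nxt) + gamma * max(q[nxt])
        q[state][a] += lr * (target - q[state][a])
        state = nxt

    # Follow the greedy policy for a few steps, starting from the largest share.
    state = len(alphas) - 1
    for _ in range(len(alphas)):
        a = max(range(len(moves)), key=lambda k: q[state][k])
        state = min(max(state + moves[a], 0), len(alphas) - 1)
    return alphas[state], q


# p_total = 100 corresponds to 20 dB relative to the assumed unit noise power.
alpha, q_table = q_learn_power_split(p_total=100.0, g_near=1.0, g_far=0.1)
print("learned far-user power share:", alpha)
print("rates (near, far):", noma_rates(alpha, 100.0, 1.0, 0.1))
```

In this toy setting the agent only nudges the power share up or down by one step per action, so the Q-table stays small (9 states by 3 actions). A fuller treatment would presumably include channel state in the state space and energy efficiency in the reward, which this sketch omits.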