Recent advances in optimal control of multi-agent systems (MAS) stand to benefit sectors such as robotics, communications, and power systems. This work studies the intelligent design of a large-scale pursuit-evasion game with multiple pursuers and multiple evaders. A distributed cooperative pursuit method with communication is developed, based on reinforcement learning. Because of the sheer number of agents, the well-known curse of dimensionality poses a serious challenge to multi-player pursuit-evasion game design, especially in hostile environments where few communication channels are available for players to exchange information. Mean field game (MFG) theory is therefore used to find optimal pursuit-evasion strategies from a novel type of probability density function (PDF) in place of exhaustive information from all other teams or agents. A new MAS optimal control scheme with a decentralised and computationally tractable decision method is urgently needed. To address these issues, mean field game theory is used to construct the Actor-critic-mass (ACM) algorithm, a decentralised optimal control scheme. Additionally, the homogeneous decentralised Actor-critic-mass (HDACM) algorithm, which improves on the ACM method, removes restrictions such as homogeneous agents and cost functions. Finally, the PAS algorithm is used in two applications.