Game intelligence has emerged as an active research topic in artificial intelligence in recent years, and multi-agent learning is a frontier topic within this field, with promising prospects across many application areas. This paper first traces the origins of reinforcement learning (RL) to the law of effect in experimental animal psychology and to the optimization theory of optimal control. It then describes the components of a multi-agent reinforcement learning (MARL) system and summarizes the classification of MARL research methods. The open problems of MARL are discussed from three aspects: non-stationarity of the environment, partial observability, and dimensional explosion. Finally, an outlook on future work is given, based on the current state of MARL development and the important and difficult issues in the field.