Multiple unmanned aerial vehicle (multi-UAV) cooperative air combat, an important form of future air combat, places high demands on the autonomy and cooperation of the UAVs involved. Conventional methods struggle with the high complexity and highly dynamic nature of cooperative air combat, so studying decision-making methods for this setting is of great significance. This paper proposes a multi-agent double-soft actor-critic (MADSAC) algorithm for the multi-UAV cooperative decision-making problem. MADSAC treats the problem as a fully cooperative game, modeled as a decentralized partially observable Markov decision process and solved within a centralized-training, decentralized-execution framework. The use of maximum entropy theory in the update process makes the method more exploratory. MADSAC also uses double centralized critics, target networks, and delayed policy updates to address the overestimation and error accumulation problems effectively. In addition, basing the double centralized critics on an attention mechanism improves the scalability and learning efficiency of MADSAC. Finally, multi-UAV cooperative air combat experiments validate the effectiveness of MADSAC.
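The abstract names the ingredients of the update (maximum entropy, double critics, target networks, delayed policy updates) but does not state the equations. As a rough illustration only, the sketch below shows the standard single-agent forms these ideas usually take in soft actor-critic and TD3-style methods; it is not the MADSAC update itself, it omits the centralized and attention-based parts, and every function name, parameter, and default value is a hypothetical choice made for this example.

```python
def soft_double_q_target(reward, done, q1_next, q2_next, log_prob_next,
                         alpha=0.2, gamma=0.99):
    """Entropy-regularized TD target from two target critics (assumed form).

    Taking the minimum of the two critic estimates counters overestimation;
    the -alpha * log_prob term is the maximum-entropy bonus that rewards
    exploratory (high-entropy) policies.
    """
    soft_value = min(q1_next, q2_next) - alpha * log_prob_next
    # No bootstrapping past a terminal transition (done == 1.0).
    return reward + gamma * (1.0 - done) * soft_value


def polyak_update(target_params, online_params, tau=0.005):
    """Move target-network parameters slowly toward the online parameters."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


def should_update_policy(step, policy_delay=2):
    """Delayed policy update: refresh the actor only every few critic steps."""
    return step % policy_delay == 0
```

In a training loop built on these helpers, both critics would regress toward the shared target on every step, while the actor and the target networks would be refreshed only when `should_update_policy` returns true, which limits how quickly critic errors propagate into the policy.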