Path integral policy improvement (PI²) is a data-driven method for solving stochastic optimal control problems. Both feedforward and feedback controls are computed from a sample of noisy open-loop trajectories of the system and their costs, which can be obtained in a highly parallelizable manner. The control strategy offers theoretical performance guarantees on the expected cost achieved by the resulting closed-loop system. This paper extends the method from the single-agent case to a multi-agent setting, where such theoretical guarantees have not previously been attained. We provide both a decentralized scheme and a leader-follower scheme for distributing the feedback calculations under different communication constraints. The theoretical results are verified numerically through simulations.
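To make the sampling-based update concrete, the following is a minimal sketch of a generic path-integral (PI²-style) feedforward update on a toy double-integrator problem. It is not the paper's exact formulation: the dynamics, cost, and all parameter names (horizon T, rollout count K, temperature lam, noise scale sigma) are illustrative assumptions, and the feedback and multi-agent components discussed in the paper are omitted. The sketch only shows the core mechanism: sample noisy open-loop rollouts, score them by cost, and update the nominal control with cost-weighted noise.

```python
import numpy as np

# Hypothetical toy example of a path-integral control update;
# dynamics, cost, and parameters are assumptions, not the paper's setup.
rng = np.random.default_rng(0)

T, K = 20, 256        # horizon length, number of sampled rollouts
lam, sigma = 1.0, 0.5 # softmax temperature, exploration noise std
dt = 0.05
u = np.zeros(T)       # nominal open-loop control sequence

def rollout_cost(u_seq):
    """Simulate double-integrator dynamics and return the trajectory cost."""
    x = np.array([1.0, 0.0])  # position, velocity
    cost = 0.0
    for t in range(T):
        x = x + dt * np.array([x[1], u_seq[t]])       # Euler step
        cost += dt * (x @ x + 0.1 * u_seq[t] ** 2)    # state + control penalty
    return cost

for it in range(50):
    eps = sigma * rng.standard_normal((K, T))          # per-rollout exploration noise
    # Rollouts are independent, hence highly parallelizable in practice.
    costs = np.array([rollout_cost(u + eps[k]) for k in range(K)])
    w = np.exp(-(costs - costs.min()) / lam)           # path-integral (softmax) weights
    w /= w.sum()
    u = u + w @ eps                                    # cost-weighted noise update

print(f"final cost: {rollout_cost(u):.4f}")
```

Note that each rollout evaluation is independent of the others, which is what makes the sample-generation step embarrassingly parallel, as the abstract points out.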