This paper develops a relative output-feedback-based solution to the containment control of linear heterogeneous multiagent systems. A distributed optimal control protocol is presented for the followers not only to ensure that their outputs fall into the convex hull of the leaders' outputs but also to optimize their transient performance. The proposed optimal solution is composed of a feedback part, depending on the followers' state, and a feed-forward part, depending on the convex hull of the leaders' state. To comply with most real-world applications, the feedback and feed-forward states are assumed to be unavailable and are estimated using two distributed observers. That is, a distributed observer is designed to estimate each agent's state using only its relative output measurements and the information it receives from its neighbors. Another adaptive distributed observer is designed, which uses the exchange of information between followers over a communication network to estimate the convex hull of the leaders' state. The proposed observer relaxes the restrictive requirement that all followers have access to complete knowledge of the leaders' dynamics. An off-policy reinforcement learning algorithm with an actor-critic structure is then developed to solve the optimal containment control problem online, using relative output measurements and without requiring the leaders' dynamics. Finally, the theoretical results are verified by numerical simulations.
KEYWORDS: adaptive distributed observer, cooperative output regulation, optimal control, output containment control, reinforcement learning
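To make the two-part structure of the control protocol described in the abstract concrete, the following minimal Python sketch shows a feedback term acting on a follower's estimated state and a feed-forward term acting on the estimated convex hull of the leaders' states. The dimensions, the gain names K1 and K2, and the variable names x_hat_i and zeta_hat_i are illustrative assumptions; in the paper the gains are obtained online by the off-policy reinforcement learning (actor-critic) algorithm rather than fixed in advance.

```python
import numpy as np

# Hypothetical dimensions and placeholder gains for illustration only; the
# actual gains would be learned by the off-policy actor-critic algorithm.
n, m = 4, 2            # assumed follower state and input dimensions
K1 = np.zeros((m, n))  # feedback gain on the follower's estimated state
K2 = np.zeros((m, n))  # feed-forward gain on the estimated convex hull of leaders' states

def containment_control_input(x_hat_i, zeta_hat_i):
    """Two-part control law structure: a feedback term driven by x_hat_i
    (the follower's state estimate from the distributed observer) and a
    feed-forward term driven by zeta_hat_i (the adaptive distributed
    observer's estimate of the convex hull of the leaders' states)."""
    return K1 @ x_hat_i + K2 @ zeta_hat_i

# Example call with placeholder estimates (zero gains give a zero input).
u_i = containment_control_input(np.random.randn(n), np.random.randn(n))
```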