Teams of artificially intelligent planetary rovers have tremendous potential for space exploration, allowing for reduced cost, increased flexibility and increased reliability. However, having multiple autonomous devices acting simultaneously leads to a problem of coordination: to achieve the best results, the rovers should work together. This is not a simple task. Due to the large distances and harsh environments, a rover must be able to perform a wide variety of tasks with a wide variety of potential teammates in uncertain and unsafe environments. Directly coding all the rules needed to reliably handle this coordination and uncertainty is problematic. Instead, this article examines tackling the problem through coordinated reinforcement learning: rather than being explicitly programmed, the rovers iteratively learn through trial and error to take actions that lead to high overall system return. To allow for coordination while still letting each agent learn and act independently, we employ state-of-the-art reward shaping techniques. The article uses visualization techniques to break down complex performance indicators into an accessible form, and identifies key future research directions.
In multi-objective problems, it is desirable to use a fast algorithm that gains coverage over large parts of the Pareto front. The simplest multi-objective method is a linear combination of objectives given to a single-objective optimizer. However, it has been proven that this method cannot support solutions on the concave areas of the Pareto front: a point on a convex part of the Pareto front, or an extreme solution, is always more desirable to the optimizer. This is a significant drawback of the linear combination. In this work we provide the Pareto Concavity Elimination Transformation (PaCcET), a novel, iterative objective space transformation that allows a linear combination (in the transformed objective space) to find solutions on concave areas of the Pareto front (in the original objective space). The transformation ensures that an optimizer will always value a non-dominated solution over any dominated solution, and it can be used with any single-objective optimizer. We demonstrate the efficacy of this method on two multi-objective benchmark problems with known concave Pareto fronts. Instead of the poor coverage created by a simple linear sum, PaCcET produces a superior spread across the Pareto front, including concave areas, similar to that produced by more computationally expensive multi-objective algorithms like SPEA2 and NSGA-II.
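The drawback of the linear combination described above can be illustrated with a minimal sketch. The front `f2 = 1 - f1**2` and the function names here are hypothetical choices for illustration, not the benchmark problems from the paper, and the sketch does not implement PaCcET itself. Both objectives are assumed to be minimized.

```python
def concave_front(n=101):
    """Sample a hypothetical minimization Pareto front f2 = 1 - f1**2.

    Every sampled point is non-dominated, but the front is concave.
    """
    return [(i / (n - 1), 1.0 - (i / (n - 1)) ** 2) for i in range(n)]


def weighted_sum_choice(points, w):
    """Return the point that minimizes w * f1 + (1 - w) * f2."""
    return min(points, key=lambda p: w * p[0] + (1.0 - w) * p[1])


front = concave_front()
weights = [i / 20 for i in range(21)]  # sweep w from 0 to 1

# Collect the distinct points the linear combination ever selects.
chosen = {weighted_sum_choice(front, w) for w in weights}
print(sorted(chosen))
```

Under these assumptions, every weight in the sweep selects one of the two extreme points, and none of the 99 interior non-dominated points is ever chosen. This is the coverage gap that a transformation of the objective space is meant to close.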
Multiagent systems have had a powerful impact on the real world. Many of the systems studied in the field (air traffic, satellite coordination, rover exploration) are inherently multi-objective, but they are often treated as single-objective problems within the research. A very important concept within multiagent systems is that of credit assignment: clearly quantifying an individual agent's impact on the overall system performance. In this work we extend the concept of credit assignment to multi-objective problems, broadening the traditional multiagent learning framework to account for multiple objectives. We show in two domains that by leveraging established credit assignment principles in a multi-objective setting, we can improve performance by (i) increasing learning speed by up to 10x, (ii) reducing sensitivity to unmodeled disturbances by up to 98.4%, and (iii) producing solutions that dominate all solutions discovered by a traditional team-based credit assignment schema. Our results suggest that in a multiagent multi-objective problem, proper credit assignment is as important to performance as the choice of multi-objective algorithm.
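As a rough illustration of credit assignment with several objectives, the sketch below computes a difference-style reward per agent and per objective: the system objective with the agent present minus the system objective with the agent removed. The abstract does not say which credit assignment scheme was used, so the choice of difference rewards, the toy site-observation domain, and all function names are assumptions made for illustration only.

```python
def difference_rewards(joint_action, objectives):
    """Return, for each agent, a tuple holding one reward per objective.

    Each entry is g(joint_action) - g(joint_action without that agent),
    i.e. the agent's marginal contribution to objective g.
    """
    rewards = []
    for i in range(len(joint_action)):
        without_i = joint_action[:i] + joint_action[i + 1:]
        rewards.append(tuple(g(joint_action) - g(without_i) for g in objectives))
    return rewards


# Hypothetical toy domain: each agent chooses one site to observe.
def coverage(actions):
    """Objective 1 (maximize): number of distinct sites observed."""
    return len(set(actions))


def low_redundancy(actions):
    """Objective 2 (maximize): negative count of redundant visits."""
    return -(len(actions) - len(set(actions)))


joint = ["A", "A", "B"]
per_agent = difference_rewards(joint, [coverage, low_redundancy])
print(per_agent)
```

In this toy case the two agents crowding site A receive no credit for coverage and a penalty for redundancy, while the agent alone at site B receives the coverage credit. A team-based signal would instead give all three agents the same pair of values, which hides who contributed what.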