2018
DOI: 10.1007/978-3-319-98446-9_19

Building Collaboration in Multi-agent Systems Using Reinforcement Learning

Abstract: This paper presents a proof-of-concept study demonstrating the viability of building collaboration among multiple agents through the standard Q-learning algorithm embedded in particle swarm optimisation. Collaboration is formulated to be achieved among the agents via competition, where the agents are expected to balance their actions in such a way that none of them drifts away from the team and none intervenes in any fellow neighbour's territory, either. Particles are devised with Q-learning for self-training to learn…
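As a rough illustration of the mechanism the abstract describes, below is a minimal sketch of a PSO-style particle that carries its own Q-table and self-trains with the standard Q-learning update. Everything here (the class and function names, the one-dimensional toy reward balancing cohesion against territorial separation, and all hyperparameters) is an assumption made for illustration, not the authors' implementation.

import random

class QLearningParticle:
    # Hypothetical PSO particle that self-trains with tabular Q-learning.
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.n_actions = n_actions

    def act(self, state):
        # epsilon-greedy selection over this particle's own Q-table
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def learn(self, state, action, reward, next_state):
        # standard Q-learning (temporal-difference) update
        best_next = max(self.q[next_state])
        td_target = reward + self.gamma * best_next
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])

def team_reward(pos, neighbour_positions, min_sep=1.0, max_dist=5.0):
    # Toy one-dimensional shaping, assumed for illustration: penalise
    # drifting away from the swarm centroid (cohesion) and penalise
    # intruding on a neighbour's territory (separation).
    centroid = sum(neighbour_positions) / len(neighbour_positions)
    cohesion = -1.0 if abs(pos - centroid) > max_dist else 0.0
    separation = -1.0 if any(abs(pos - p) < min_sep for p in neighbour_positions) else 0.0
    return 1.0 + cohesion + separation

The toy reward mirrors the balance the abstract describes: an agent is penalised both for drifting away from the team and for entering a fellow neighbour's territory.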

Cited by 11 publications (7 citation statements)
References 28 publications
“…Developing collective behavior automatically for a group of robots is challenging (Francesca and Birattari, 2016). Although there are some existing works which simulate swarming collective behavior in robots, including collective navigation for robots (Na et al., 2022), collaborative robots (Aydin and Fellows, 2018), and collective formation of robots (Buffet et al., 2007), none of these can automatically generate a diverse set of collective behaviors. One limitation in doing this is that automatic recognition of swarming collective motion behavior is hard (Harvey et al., 2018).…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…ToM-net [41] captures the mental states of other agents and predicts their future actions. OM [42] uses agent policies to predict the intended actions of opponents. But all these works are conducted under a competitive setting and require agents to infer each other's intentions, which could be inaccurate considering the unstable nature of MARL [43].…”
Section: Intention Modeling (mentioning)
confidence: 99%
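To make the opponent-modelling idea in the excerpt above concrete, here is a minimal sketch of an OM-style predictor: a supervised classifier mapping the opponent's observed state to a distribution over its next action, updated from the actions the opponent actually takes. The class name, the linear architecture, and the learning rate are assumptions for illustration, not the design of the cited papers.

import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

class OpponentModel:
    # Hypothetical linear opponent model: observation -> P(next action).
    def __init__(self, obs_dim, n_actions, lr=0.1):
        self.W = rng.normal(0.0, 0.01, size=(obs_dim, n_actions))
        self.lr = lr

    def predict(self, obs):
        # predicted distribution over the opponent's next action
        return softmax(obs @ self.W)

    def update(self, obs, action):
        # one cross-entropy gradient step on an observed (obs, action) pair
        probs = self.predict(obs)
        grad = probs.copy()
        grad[action] -= 1.0  # gradient of cross-entropy w.r.t. the logits
        self.W -= self.lr * np.outer(obs, grad)

As the excerpt notes, such inferred intentions can be inaccurate precisely because each opponent's policy keeps changing while MARL training is underway.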
“…Supervised learning is also used in works that incorporate Theory of Mind (Section II-D), which equip agents with a prediction module to estimate other agents' beliefs and future actions. In these cases, SL can be used to predict actions given current observations [38], [59], [94], [95] or coupled with the obverter technique to influence policy based on an agent's own understanding [96], [97].
Section: B: Gumbel Soft-max and Concrete Distribution (mentioning)
confidence: 99%
“…Alternatively, mental models can also be based on other agents' actions and perceptions without assuming similar belief systems. For instance, [94] augments agents' policies with predictions of other agents' behavior and demonstrates that agents can learn better policies using their estimates of other players' goals in cooperative and competitive situations. However, this work does not consider environments where communication is present.…”
Section: Modeling Agents (mentioning)
confidence: 99%
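To sketch the policy-augmentation idea the excerpt attributes to [94], one hedged possibility is to concatenate an agent's own observation with an opponent model's predicted action distribution before feeding it to the policy network. OpponentModel here refers to the hypothetical class sketched earlier, and the function name is illustrative only.

import numpy as np

def augmented_policy_input(own_obs, opponent_obs, opponent_model):
    # Concatenate the agent's own observation with the predicted
    # distribution over the opponent's next action; the combined vector
    # becomes the input to whatever policy network is in use.
    predicted = opponent_model.predict(opponent_obs)
    return np.concatenate([own_obs, predicted])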