2022
DOI: 10.1609/icaps.v32i1.19854

Multi-Agent Tree Search with Dynamic Reward Shaping

Abstract: Sparse rewards and their representation in multi-agent domains remain a challenge for the development of multi-agent planning systems. While techniques from formal methods can be adopted to represent the underlying planning objectives, their use in facilitating and accelerating learning has received limited attention in multi-agent settings. Reward shaping methods that leverage such formal representations in single-agent settings are typically static in the sense that the artificial rewards remain the same t…

Cited by 2 publications (2 citation statements) | References 15 publications
“…Transfer learning between two different environments with the same objective is also eased with this approach. The authors in [69] extended the work by introducing the Multi-Agent Tree Search Algorithm with reward shaping (MATS-A), so that it applies to multi-agent scenarios and handles both stochastic and deterministic transitions in a multi-agent non-Markovian reward decision process. They show that sharing the same search tree and DFA objective can be used to develop competitive and cooperative behavior among the agents, within and across teams. The work in [70] first converts an omega-regular specification into a Büchi automaton.…”
Section: B. Reasoning for Learning the RL Model
confidence: 99%
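To make the automaton-based shaping idea in this statement concrete, the following is a minimal single-agent Python sketch of potential-based reward shaping over DFA states. The DFA, its edge labels, and the potential function are illustrative assumptions, not the MATS-A implementation described in [69].

```python
from dataclasses import dataclass

@dataclass
class DFA:
    transitions: dict  # (dfa_state, label) -> next dfa_state
    accepting: set     # accepting DFA states
    potential: dict    # dfa_state -> shaping potential, e.g. proximity
                       # to an accepting state, so progress is rewarded

    def step(self, q, label):
        # Treat missing transitions as self-loops.
        return self.transitions.get((q, label), q)

def shaped_reward(dfa, q, label, env_reward, gamma=0.99):
    """Augment a sparse environment reward with potential-based shaping
    over DFA states: F = gamma * phi(q') - phi(q)."""
    q_next = dfa.step(q, label)
    shaped = env_reward + gamma * dfa.potential[q_next] - dfa.potential[q]
    return q_next, shaped

# Usage: a two-state objective "eventually observe label 'g'".
dfa = DFA(transitions={(0, "g"): 1}, accepting={1},
          potential={0: 0.0, 1: 1.0})
q, r = shaped_reward(dfa, q=0, label="g", env_reward=0.0)
print(q, r)  # 1 0.99: progress in the DFA earns an artificial reward
```

Because the extra term is potential-based, it densifies the sparse objective signal without changing the optimal policy of the underlying task.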
“…Neurosymbolic RL has been applied to different components of the RL framework, combining symbolic reasoning with neural networks to solve complex RL problems. It has been successful in addressing the issue of sparse rewards by formulating reward functions that provide more informative feedback to the agent [68], [69], [70]. It has also been used to learn programmatic policies that are more generalizable and flexible across different environments [71], [72], [73], [74].…”
Section: E. Optimizing Parameters of RL
confidence: 99%
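As a rough illustration of the programmatic-policy idea in the statement above, the sketch below expresses a policy as a small, interpretable program over named state features rather than a neural network; the feature names and actions are hypothetical.

```python
def programmatic_policy(obs: dict) -> str:
    # An interpretable decision-rule policy over named state features.
    if obs["distance_to_goal"] < 1.0:
        return "stop"
    if obs["obstacle_ahead"]:
        return "turn_left"
    return "move_forward"

print(programmatic_policy({"distance_to_goal": 3.2, "obstacle_ahead": True}))
# -> turn_left
```

Such a policy transfers to any environment exposing the same features, which is one reading of the generalizability the surveyed works [71]-[74] aim for.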