2022
DOI: 10.1016/j.ins.2022.10.042
|View full text |Cite
|
Sign up to set email alerts
|

Value function factorization with dynamic weighting for deep multi-agent reinforcement learning

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1

Citation Types

0
1
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
6
1

Relationship

0
7

Authors

Journals

citations
Cited by 8 publications
(4 citation statements)
references
References 17 publications
0
1
0
Order By: Relevance
“…𝑉 𝑖 (𝑠, πœ‹ 1 , … , πœ‹ 𝑗 , … , πœ‹ 𝑁 ) = βˆ‘ 𝑉 πœ‹ (0) 𝑁 𝑗=1 (15) In the Markov model, if the expected reward 𝑉 𝑗 (πœ‹…”
Section: F Task Planning Model Based On Markov Gamementioning
confidence: 99%
See 1 more Smart Citation
“…𝑉 𝑖 (𝑠, πœ‹ 1 , … , πœ‹ 𝑗 , … , πœ‹ 𝑁 ) = βˆ‘ 𝑉 πœ‹ (0) 𝑁 𝑗=1 (15) In the Markov model, if the expected reward 𝑉 𝑗 (πœ‹…”
Section: F Task Planning Model Based On Markov Gamementioning
confidence: 99%
“…Littman [14] introduced the MARL approach in the 1990s, using Markov Decision Processes (MDP) as the framework for environment modeling. MARL provides a mathematical framework for solving various reinforcement learning problems and has become the foundation for subsequent studies [15] .…”
Section: Introductionmentioning
confidence: 99%
“…MARL can be used to optimize collaborative detection. Compared with RL, one agent's action will also become the state of other agents after it takes actions according to the strategy in MARL, which will lead to complex state space and convergence difficulty [25]. One of the important research contents of MARL is how to optimize the behavior of each agent without explicit communication among agents.…”
Section: Introductionmentioning
confidence: 99%
“…However, to create functionalities with the help of machine learning and deep learning for build such complex and hard-coded systems. In this path planner, reinforcement learning algorithms are used to determine the optimal control sequence on the interaction with the environment during learning, a technique within machine learning [2]. This proposed method is based on reinforcement learning algorithms to model the "ships' motion", and "turning rate".…”
Section: Introductionmentioning
confidence: 99%