2020 Applying New Technology in Green Buildings (ATiGB) 2021
DOI: 10.1109/atigb50996.2021.9423319

Multi-Objective Exploration for Proximal Policy Optimization

Cited by 5 publications (2 citation statements)
References 12 publications
“…It has effectively addressed long-standing challenges in reinforcement learning, such as sparse and deceptive rewards. Go-Explore has emerged as a potent agent trainer in Atari games, and this research suggests extending its use to multi-agent scenarios by equipping each agent with a Go-Explore's critic network [4]. On the other hand, the MADDPG stands as an improved version of the Deep Deterministic Policy Gradient (DDPG) algorithm, designed to enhance rewards in collaborative-competitive environments [5].…”
Section: Introduction (mentioning)
confidence: 99%
“…In [31], agents were trained using a mixture of observations from different training environments and linearity constraints were imposed on both the observation interpolations and the supervision (e.g., associated reward) interpolations. In the study described in [32], multiple targets were simultaneously optimized to enhance the PPO.…”
Section: Introduction (mentioning)
confidence: 99%