2024
DOI: 10.1109/tnnls.2023.3236361

Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain

Cited by 41 publications (20 citation statements)
References: 0 publications
“…The observations can be broken down into five parts: ball information, left team data, right team data, details of the controlled player, and the match state. We employ settings similar to those used in [9], where agents can choose from 19 discrete actions, including running, sliding, shooting, and passing. We evaluate five different random seeds and average the results for presentation.…”
Section: The Performance in GRF Tasks (mentioning)
Confidence: 99%
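
The excerpt above refers to the Google Research Football (GRF) environment. A minimal sketch of such a configuration follows, assuming the open-source `gfootball` package; the scenario name, the `raw` observation representation, and the observation keys in the comments are illustrative assumptions, not details taken from the cited paper.

```python
# Minimal sketch: a Google Research Football environment with raw observations
# and the default 19-action discrete action set. Scenario name and keyword
# arguments are assumptions; consult the gfootball docs for exact options.
import gfootball.env as football_env

env = football_env.create_environment(
    env_name="11_vs_11_stochastic",   # assumed scenario, not from the paper
    representation="raw",             # dict observations, roughly the five parts quoted
    number_of_left_players_agent_controls=1,
    render=False,
)

obs = env.reset()
o = obs[0]  # raw observations arrive as a list, one dict per controlled player
# The five parts described in the excerpt map loosely onto keys such as:
#   ball information:  o["ball"], o["ball_direction"]
#   left team data:    o["left_team"]
#   right team data:   o["right_team"]
#   controlled player: o["active"], o["sticky_actions"]
#   match state:       o["score"], o["steps_left"], o["game_mode"]
print(env.action_space)  # Discrete(19): run, slide, shoot, pass, etc.
```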
“…Generally, rewards from the environment are extrinsic rewards, while those awarded by the exploration mechanism are intrinsic rewards. The various approaches to exploration algorithm design can be broadly categorized into four types [24]. The first is count-based exploration, where states with a higher visit count are less novel.…”
Section: Introduction (mentioning)
Confidence: 99%
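
Since this excerpt describes the count-based family only in words, a minimal sketch is given below. It assumes a tabular (hashable) state and uses the common `beta / sqrt(N(s))` bonus shape; the class name and scale constant are hypothetical, and this is one generic instance of the family, not the mechanism of any specific cited method.

```python
import math
from collections import defaultdict

class CountBasedBonus:
    """Generic count-based exploration bonus: rarely visited states
    receive larger intrinsic rewards (higher novelty)."""

    def __init__(self, beta=0.1):
        self.beta = beta                 # scale of the intrinsic reward (assumed value)
        self.counts = defaultdict(int)   # visit count N(s) per state

    def intrinsic_reward(self, state):
        key = tuple(state)               # assumes the state can be discretized/hashed
        self.counts[key] += 1
        return self.beta / math.sqrt(self.counts[key])

# The agent is then trained on the sum of both reward streams, e.g.:
#   r_total = env_reward + bonus.intrinsic_reward(observation)
# so frequently visited states contribute a shrinking exploration bonus.
```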
“…Traditional approaches to robot navigation exploration usually rely on prior information about the environment to design control strategies [3]. However, such approaches often perform poorly in unknown environments, for example under dynamic changes and complex obstacles.…”
Section: Introduction (mentioning)
Confidence: 99%