Deep Reinforcement Learning 2020
DOI: 10.1007/978-981-15-4095-0_10
Hierarchical Reinforcement Learning

Abstract: Needs describe the necessities for a self-organizing system to survive and evolve; they arouse an agent to act toward a goal, giving purpose and direction to behavior. Based on Maslow's hierarchy of needs, an agent must satisfy a certain amount of needs at its current level as a condition to rise to the next stage (upgrade and evolution). In particular, Deep Reinforcement Learning (DRL) can help AI agents (like robots) organize and optimize their behaviors and strategies to develop diverse strategies…

Cited by 12 publications (20 citation statements)
References 33 publications
“…We perform experiments in the CLEVR-Robot environment [Jiang et al., 2019], as shown in Figure 3. CLEVR-Robot is an environment for object interaction based on the MuJoCo physics engine [Todorov et al., 2012].…”
Section: Methods
confidence: 99%
“…Moreover, the instruction-following policy has close ties to Hierarchical RL [Barto and Mahadevan, 2003], because instructions can be naturally viewed as a task abstraction for a low-level policy [Blukis et al., 2022]. HAL [Jiang et al., 2019] takes advantage of the compositional structure of natural language and makes decisions directly at the NL level to solve long-horizon, complex RL tasks. These previous methods either expose the unbounded NL instructions directly to the policy or encode the NL instructions into a scenario-specific manual vector, both of which have limitations.…”
Section: Related Work
confidence: 99%
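The two-level structure described above can be sketched in a few lines. This is a minimal illustration, not the HAL algorithm itself: the instruction list, the `HighLevelPolicy`/`LowLevelPolicy` classes, and the `rollout` helper are all hypothetical stand-ins, assuming a high-level policy that periodically issues a natural-language instruction which a low-level policy consumes as a task abstraction.

```python
import random

# Hypothetical instruction set; in HAL-style setups the high-level policy
# acts directly in the space of natural-language instructions.
INSTRUCTIONS = [
    "move the red ball left of the blue ball",
    "move the green ball in front of the red ball",
]

class HighLevelPolicy:
    """Picks an NL instruction (task abstraction) given an observation."""
    def act(self, obs):
        return random.choice(INSTRUCTIONS)

class LowLevelPolicy:
    """Executes primitive actions conditioned on the current instruction."""
    def act(self, obs, instruction):
        # Placeholder: a real policy would encode obs and instruction
        # with a learned network; here we just pick one of 4 actions.
        return hash((obs, instruction)) % 4

def rollout(steps=6, horizon=3):
    """High-level policy re-issues an instruction every `horizon` steps;
    the low-level policy acts at every step under that instruction."""
    high, low = HighLevelPolicy(), LowLevelPolicy()
    obs, trace = 0, []
    instruction = None
    for t in range(steps):
        if t % horizon == 0:
            instruction = high.act(obs)
        action = low.act(obs, instruction)
        trace.append((instruction, action))
        obs += 1  # stand-in for the environment transition
    return trace
```

The key design point this mirrors is the temporal abstraction: the high-level decision is made only every `horizon` steps, so the instruction space shields the low-level policy from the long task horizon.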