2021
DOI: 10.1109/access.2021.3090364

Subgoal-Based Reward Shaping to Improve Efficiency in Reinforcement Learning

Abstract: Reinforcement learning, which acquires a policy that maximizes long-term rewards, has been actively studied. Unfortunately, it is often too slow to use in practical situations because the state-action space becomes huge in real environments. Many studies have incorporated human knowledge into reinforcement learning. Although human knowledge about trajectories is often used, providing it may require a human to control the AI agent, which can be difficult. Knowledge about subgoals may lessen this requirement be…
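To make the shaping idea concrete, below is a minimal sketch under one common formulation, potential-based shaping (Ng et al., 1999), where the potential rises by one for each subgoal reached so far. The grid-world setting and every name here (GAMMA, SUBGOALS, phi, shaped_reward) are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch of subgoal-based potential shaping; NOT the paper's
# exact method. Potential-based shaping adds F(s, s') = gamma*phi(s') - phi(s)
# to the environment reward, which is known to preserve the optimal policy.

GAMMA = 0.99
# Hypothetical human-provided subgoals in a grid world, in the order the
# agent is expected to visit them.
SUBGOALS = [(2, 3), (5, 5), (7, 7)]

def phi(num_subgoals_reached: int) -> float:
    # Potential grows with each subgoal reached; the scale is a design choice.
    return float(num_subgoals_reached)

def shaped_reward(env_reward: float, reached_before: int, state_after):
    """Return (shaped reward, updated subgoal count) for one transition."""
    reached_after = reached_before
    if reached_before < len(SUBGOALS) and state_after == SUBGOALS[reached_before]:
        reached_after += 1  # the agent just hit the next subgoal in the series
    bonus = GAMMA * phi(reached_after) - phi(reached_before)
    return env_reward + bonus, reached_after
```

In this sketch the agent earns a one-time bonus of roughly +GAMMA when it first reaches each subgoal, steering exploration without changing which policy is optimal.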

Cited by 14 publications (8 citation statements)
References 12 publications

“…In goal-conditioned RL, the agent is trained to reach subgoals along its own trajectory, which accelerates learning [22]–[25]. Some recent papers propose multi-task learning for the agent [25], [27], [64]–[66].…”
Section: A. Advantage of the Proposed Methods
Citation type: mentioning (confidence: 99%)

“…In this experiment, even though I used a technique that lets the agent reach the target point via the sub-goals in the test experiment, I never used it in the training environment. If we can exploit the agent's ability to reach its sub-goals, we can easily train the agent to achieve the final goal, as in previous studies [14], [15], [26]–[28]. Figure 3d shows the probability of the actions at the starting point according to the state of the sub-goals.…”
Section: Experimental Settings
Citation type: mentioning (confidence: 99%)

“…In addition, some studies have recently been introduced to balance exploitation and exploration [19]–[22]. Meanwhile, several studies on multi-goal agents have been proposed [23]–[28]. These studies generate multi-goals or sub-goals to make learning toward the final goal more efficient.…”
Citation type: mentioning (confidence: 99%)

“…Goal-conditioned RL models excel at reaching the goal through intermediate sub-goals and show excellent performance on robotics problems [22]–[28], [49]–[53]. These works have focused on finding meaningful sub-goals and improving the performance of the main policy network (the high-level policy network).…”
Section: Goal-Conditioned RL
Citation type: mentioning (confidence: 99%)

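As background for these excerpts, here is a hedged sketch of the input convention goal-conditioned policies typically share: the current sub-goal is fed to the network alongside the observation, so a single policy can learn to reach many sub-goals. The function name and the dimensions are hypothetical, not taken from any cited paper.

```python
import numpy as np

def goal_conditioned_input(obs: np.ndarray, subgoal: np.ndarray) -> np.ndarray:
    # pi(a | s, g): the policy network consumes [s; g] as one flat vector.
    return np.concatenate([obs, subgoal])

# Usage: a 4-dim observation plus a 2-dim sub-goal yields a 6-dim policy input.
x = goal_conditioned_input(np.zeros(4), np.ones(2))
assert x.shape == (6,)
```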
“…In goal-conditioned RL, the agent learns sub-goals, which form part of its trajectory [22]–[26]. By learning the sub-goals, the agent can eventually reach the final goal.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)