2022
DOI: 10.48550/arxiv.2205.00399
Preprint

Learning user-defined sub-goals using memory editing in reinforcement learning

Abstract: The aim of reinforcement learning (RL) is to allow the agent to achieve the final goal. Most RL studies have focused on improving the efficiency of learning so that the final goal is reached faster. However, it is very difficult for an RL model to modify an intermediate route in the process of reaching the final goal. That is, in existing studies the agent cannot be controlled to achieve other sub-goals. If the agent can go through the sub-goals on the way to the destination, RL can be applied and studied in vari…

Cited by 2 publications (6 citation statements)
References 24 publications
“…However, the agent of the sub-goal-dedicated network did not show the shortest path, whereas the agent of the policy network showed almost the shortest path. Furthermore, as in a previous study [21], the agent that learned the sub-goals was sometimes confused when the bonus point was near the agent's current location. The reason is that the agent was trained to go through the bonus point first in order to clear each stage.…”
Section: Key-door Domain
confidence: 50%
“…In addition, learning user-defined sub-goals has been proposed [21]. However, the agent was only partially under control.…”
Section: Path Planning
confidence: 99%
“…By editing the collected trajectories, we can obtain a larger number of trajectories. A recent study used the concept of memory editing in goal-conditioned RL for the agent to reach subgoals, so that the user can control the agent through the subgoals [32]. However, the agent could not move to the subgoals in difficult environments such as Fig.…”
Section: Introduction
confidence: 99%
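The memory-editing idea the citing papers describe resembles hindsight-style goal relabeling: trajectories the agent has already collected are edited so that states it actually visited are treated as sub-goals, multiplying the goal-conditioned training data. The sketch below illustrates that idea only; the transition tuple layout, the `edit_memory` name, and the binary reward are assumptions for illustration, not the paper's exact formulation.

```python
import random

def edit_memory(trajectory, num_edits=2):
    """Hindsight-style memory editing (sketch): relabel states the agent
    actually reached as sub-goals, producing extra goal-conditioned
    transitions from one collected trajectory.

    Each transition is assumed to be (state, action, reward, next_state, goal);
    this layout is an assumption, not the paper's exact format.
    """
    edited = []
    for _ in range(num_edits):
        # Pick a prefix of the trajectory and treat its final visited
        # state as the relabeled sub-goal.
        cut = random.randrange(1, len(trajectory) + 1)
        subgoal = trajectory[cut - 1][3]  # next_state of the chosen step
        for state, action, _reward, next_state, _goal in trajectory[:cut]:
            # Reward 1 only on the step that reaches the relabeled sub-goal.
            r = 1.0 if next_state == subgoal else 0.0
            edited.append((state, action, r, next_state, subgoal))
    return edited

# Toy trajectory on a 1-D line: states 0 -> 1 -> 2 -> 3, original goal 3.
traj = [(s, +1, 0.0, s + 1, 3) for s in range(3)]
extra = edit_memory(traj, num_edits=2)
print(len(extra))  # up to 2 * len(traj) relabeled transitions
```

Because the relabeled sub-goals were genuinely reached, every edited prefix ends in a successful transition, which is what lets the user-defined sub-goal behavior be learned from otherwise sparse-reward data.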