2012
DOI: 10.1007/978-3-642-32986-9_15
Learning and Reusing Goal-Specific Policies for Goal-Driven Autonomy

Abstract: In certain adversarial environments, reinforcement learning (RL) techniques require a prohibitively large number of episodes to learn a high-performing strategy for action selection. For example, Q-learning is particularly slow to learn a policy to win complex strategy games. We propose GRL, the first GDA system capable of learning and reusing goal-specific policies. GRL is a case-based goal-driven autonomy (GDA) agent embedded in the RL cycle. GRL acquires and reuses cases that capture episodic knowledge…
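To make the abstract's core idea concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of tabular Q-learning whose Q-tables are stored and retrieved per goal, so a policy learned for a goal can be reused whenever that goal is formulated again. All names (GoalPolicyCaseBase, select_action, q_update) and parameter values are assumptions.

```python
import random
from collections import defaultdict

class GoalPolicyCaseBase:
    """Maps each goal to its own Q-table: the 'case' that gets reused."""
    def __init__(self):
        self._tables = {}

    def retrieve(self, goal):
        # Reuse the policy learned for this goal if one exists;
        # otherwise start a fresh Q-table defaulting to 0.
        return self._tables.setdefault(goal, defaultdict(float))

def select_action(q, state, actions, epsilon=0.1):
    """Epsilon-greedy action selection over a goal-specific Q-table."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One Q-learning backup: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
```

Because the case base keys Q-tables by goal, switching goals mid-episode amounts to retrieving a different table rather than relearning from scratch, which is the reuse the abstract describes.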

Cited by 15 publications (14 citation statements)
References 11 publications
“…The central concept of GDA is that goals should have an option to be updated, or reformulated, once plans go awry, based on some reasoning. While Muñoz-Avila et al. (2010) follow a rule-based goal-formulation scheme, LGDA (Jaidee et al., 2011a; Jaidee et al., 2011b) learns (discrepancy, goal) pairs from experience. Jaidee et al. (2012) is perhaps the most advanced GDA formulation, in which the agent also learns new alternative goals and goal-based policies from experience.…”
Section: Relevant Work
confidence: 99%
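For readers unfamiliar with learning (discrepancy, goal) pairs, the following hedged sketch shows one simple way such pairs could be valued from experience and used for goal formulation; the names (goal_values, formulate_goal, reinforce) and the update rule are illustrative assumptions, not the cited authors' code.

```python
from collections import defaultdict

# (discrepancy, goal) -> learned value, defaulting to 0.0
goal_values = defaultdict(float)

def formulate_goal(discrepancy, candidate_goals):
    """When a discrepancy is detected, pick the goal with the best learned value."""
    return max(candidate_goals, key=lambda g: goal_values[(discrepancy, g)])

def reinforce(discrepancy, goal, reward, alpha=0.1):
    """Shift the (discrepancy, goal) pair's value toward the observed episode reward."""
    v = goal_values[(discrepancy, goal)]
    goal_values[(discrepancy, goal)] = v + alpha * (reward - v)
```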
“…GDA-C has some characteristics in common with GRL (Jaidee et al., 2012), which also uses RL for goal formulation. However, GRL is a single-agent system and, unlike GDA-C, cannot scale to play complete RTS games.…”
Section: Related Work
confidence: 99%
“…To test this claim we conducted an empirical evaluation using the Wargus RTS environment, in which we compared the performance of GDA-C versus CLASS QL (Jaidee et al., 2012), an ablation of GDA-C in which the RL agents coordinate only by sharing the same reward function. We first compared GDA-C and CLASS QL indirectly by testing both against the built-in AI in Wargus, a proficient AI that ships with the game and is designed to be competitive against a mid-range player.…”
Section: Introduction
confidence: 99%
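As a rough illustration of the ablation described above, where several RL agents coordinate only through a shared reward function, here is a hypothetical sketch: one Q-table per unit class, all backed up with the same global reward. The class and method names are assumptions, not the cited system's API.

```python
from collections import defaultdict

class SharedRewardLearners:
    """One independent Q-learner per unit class; coordination happens only
    through the single shared reward used to update every table."""
    def __init__(self, unit_classes, alpha=0.1, gamma=0.9):
        self.q = {c: defaultdict(float) for c in unit_classes}
        self.alpha, self.gamma = alpha, gamma

    def update_all(self, transitions, shared_reward, actions):
        """transitions: {unit_class: (state, action, next_state)};
        actions: {unit_class: list of available actions}."""
        for cls, (s, a, s2) in transitions.items():
            q = self.q[cls]
            best_next = max(q[(s2, a2)] for a2 in actions[cls])
            q[(s, a)] += self.alpha * (shared_reward + self.gamma * best_next - q[(s, a)])
```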
“…Frequently, the expectation X is defined as the expected state. In this situation a discrepancy occurs if X ≠ s' holds (e.g., Jaidee et al., 2012). In our work a discrepancy happens when the attempt to accomplish a goal fails.…”
Section: Definitions
confidence: 99%
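When the expectation X is the expected next state, the discrepancy test in the quoted passage reduces to a direct comparison with the observed state s'. A minimal sketch, with an assumed function name:

```python
def detect_discrepancy(expected_state, observed_state):
    """Flag a discrepancy when the expectation X differs from the observed s'.
    (The quoted work instead flags a discrepancy when a goal attempt fails.)"""
    return expected_state != observed_state
```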