2018
DOI: 10.1007/978-3-030-01081-2_1

Adaptive Goal Driven Autonomy

Cited by 5 publications (4 citation statements)
References 19 publications

“…Goal driven autonomy or GDA (Aha et al., 2010; Klenk et al., 2013; Munoz-Avila, 2018) is a kind of goal reasoning (Aha, 2018; Roberts et al., 2018) applied to issues of robust autonomy in agent-based systems. Unlike standard autonomous systems that generate behaviors given externally provided goals or tasks, the GDA approach is to independently recognize problems that arise, explain what causes the problem, and use the explanation to formulate a goal.…”
Section: Goal-Driven Autonomy (GDA)
mentioning, confidence: 99%
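To make the recognize/explain/formulate cycle described in that statement more concrete, here is a minimal Python sketch of a GDA-style loop. All names (detect_discrepancy, explain, formulate_goal, GoalManager) are illustrative placeholders rather than APIs from the cited works, and the logic is only a toy instance of the usual GDA steps.

```python
# Minimal GDA-style control loop (illustrative sketch only; names are hypothetical).
# The steps mirror the usual GDA decomposition: discrepancy detection,
# explanation, goal formulation, and goal management.

def detect_discrepancy(expected_state, observed_state):
    """Return the state facts that differ from what the plan predicted."""
    return {k: (expected_state.get(k), observed_state.get(k))
            for k in set(expected_state) | set(observed_state)
            if expected_state.get(k) != observed_state.get(k)}

def explain(discrepancy):
    """Produce a (very naive) explanation: here, just name the mismatched facts."""
    return [f"unexpected value for '{fact}'" for fact in discrepancy]

def formulate_goal(explanation):
    """Map explanations to new goals; a real system would use domain knowledge."""
    return [("restore", text.split("'")[1]) for text in explanation]

class GoalManager:
    """Keeps a simple queue of pending goals."""
    def __init__(self):
        self.pending = []

    def update(self, new_goals):
        self.pending.extend(g for g in new_goals if g not in self.pending)

    def next_goal(self):
        return self.pending.pop(0) if self.pending else None

# One pass of the loop: compare prediction vs. observation, then queue new goals.
manager = GoalManager()
expected = {"door": "open", "battery": "ok"}
observed = {"door": "closed", "battery": "ok"}
discrepancy = detect_discrepancy(expected, observed)
if discrepancy:
    manager.update(formulate_goal(explain(discrepancy)))
print(manager.next_goal())  # -> ('restore', 'door')
```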
“…, while it has been detached from the gradients. This means that a machine learning algorithm should treat $r_d^{(k)}$ as a non-differentiable scalar during training.² Then, we inspect that…”
Section: Reinforcement Learning Algorithm
mentioning, confidence: 99%
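As a rough illustration of what "detached from the gradients" means in practice, the snippet below uses a PyTorch-style detach() to treat a reward-like term as a fixed scalar during backpropagation; the variables are toy placeholders, not quantities from the citing paper.

```python
import torch

# Illustrative only: treat a reward-like term as a non-differentiable constant.
theta = torch.tensor(2.0, requires_grad=True)

r = theta ** 2              # depends on theta through the autograd graph
r_detached = r.detach()     # same value, but gradients will not flow through it

loss = r_detached * theta   # r_detached acts as a fixed scalar coefficient
loss.backward()

# d(loss)/d(theta) = r_detached = 4.0, not the 3*theta**2 = 12.0 one would get
# if r were still attached to the graph.
print(theta.grad)           # tensor(4.)
```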
“…Different rollouts are required when the outcome of the game is uncertain (i.e., stochastic) [24]. ² The reason for this treatment is the idea behind the chain rule: in $\nabla(fg) = f\nabla g + g\nabla f$, $f$ and $g$ on the right-hand side are kept constant while the other term varies. Remark 1: The first term inside the summation in (14) is identical to the quantity derived in the policy gradient method with a reward that is independent of the parameters, i.e., the REINFORCE algorithm [7].…”
Section: Reinforcement Learning Algorithm
mentioning, confidence: 99%
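For the REINFORCE connection mentioned in that remark, a minimal score-function sketch looks like the following; the three-action policy and constant reward are assumptions made for illustration, and equation (14) of the citing paper is not reproduced here.

```python
import torch

# Toy REINFORCE step: the reward multiplies the score function
# grad log pi(a) and is treated as a parameter-independent constant.
torch.manual_seed(0)

logits = torch.zeros(3, requires_grad=True)        # tiny 3-action policy
dist = torch.distributions.Categorical(logits=logits)
action = dist.sample()

reward = torch.tensor(1.5)                         # stands in for the detached return
loss = -(reward * dist.log_prob(action))           # surrogate whose gradient is
loss.backward()                                    # -reward * grad log pi(action)

print(action.item(), logits.grad)                  # gradient pushes probability toward the sampled action
```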