DOI: 10.22215/etd/2016-11321

Contributions to Techniques for Learning Non-Reactive Behaviour from Observation

Abstract: Learning from observation allows an expert to train a software agent or robot without explicitly programming the behaviour. Learned behaviour falls into two categories: reactive and non-reactive. Actions in reactive behaviour depend only on the current environment state, whereas actions in non-reactive behaviour depend on the current state as well as past states or actions. We will analyze and compare, using a common benchmark, two recent and partially studied approaches to learning non-reactive behaviour from observation:…
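The reactive vs. non-reactive distinction in the abstract can be sketched as two toy policies for a vacuum-cleaner-style domain. This is an illustrative sketch, not code from the thesis; the function names, states, and actions are all hypothetical.

```python
# Hypothetical illustration of the abstract's distinction.
# States and actions ("dirty", "suck", "left", ...) are made up.

def reactive_policy(state):
    # Reactive: the action is a function of the current state only.
    return "suck" if state == "dirty" else "move"

def non_reactive_policy(state, history):
    # Non-reactive: the action may also depend on past states/actions.
    # Here the agent alternates direction based on its last move.
    last_move = next((a for a in reversed(history) if a in ("left", "right")), None)
    if state == "dirty":
        return "suck"
    return "right" if last_move == "left" else "left"
```

The same current state ("clean") can yield different actions from the non-reactive policy depending on the history, which is exactly what makes such behaviour harder to learn from observation alone.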

Cited by 1 publication (7 citation statements) | References 7 publications
“…Once the framework was unified, one of the next steps that was accomplished was to identify the reason for the poor performance of the DBN in learning state-based behavior in the vacuum cleaner domain in [8]. We realized that the DBN learner would store the memory of the action it last performed, and use it to predict the action at the next time step.…”
Section: Design Methodology
confidence: 99%
“…To ensure that the agent and expert are always on the same trajectory, the agent's action is intercepted and replaced by the expert action. We found that the DBN agent was saving the action that it performed, and this was the cause for the poor reproduction of the DBN learner's results in the vacuum cleaner domain in [8]. Once the fix specified above was implemented, the results that were produced matched those of the original paper and can be seen in [9].…”
confidence: 99%
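The fix described in the statement above, intercepting the agent's action and replacing it with the expert's so both stay on the same trajectory, can be sketched as a toy memory-based learner. This is a hedged illustration only: the class name, table-based representation, and trace are hypothetical, standing in for the DBN learner discussed in the citing work.

```python
# Hypothetical sketch of the trajectory-alignment fix: the learner's
# remembered "last action" is overwritten with the expert's action
# during training, so agent and expert share one trajectory.

class MemoryLearner:
    """Toy non-reactive learner: predicts the next action from the
    current state plus the action it remembers performing last."""

    def __init__(self):
        self.table = {}          # (state, last_action) -> action
        self.last_action = None

    def observe(self, state, expert_action):
        # Record the expert's choice, keyed on the remembered action.
        self.table[(state, self.last_action)] = expert_action
        # The fix: store the *expert's* action as the memory, not the
        # action the learner itself would have predicted.
        self.last_action = expert_action

    def act(self, state):
        action = self.table.get((state, self.last_action))
        self.last_action = action
        return action

# Training on a vacuum-cleaner-style trace of (state, expert_action) pairs.
expert_trace = [("dirty", "suck"), ("clean", "move"), ("dirty", "suck")]
agent = MemoryLearner()
for state, action in expert_trace:
    agent.observe(state, action)
```

Without the overwrite in `observe`, the learner's memory would drift onto its own (possibly wrong) actions, which matches the reproduction failure the citing authors describe in the vacuum cleaner domain.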