Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems 2007
DOI: 10.1145/1329125.1329174

Multiagent learning in adaptive dynamic systems

Cited by 6 publications (3 citation statements)
References 4 publications
“…Learning a model of state dynamics can result in a pre-trained hidden layer structure that reduces the training time in reinforce learning problems (Anderson et al., 2015), and learning the deep Q networks from human demonstrators also helps to give a relatively good initial model and predict the dynamics (Gabriel et al., 2019). There are many other applications of smart initialization on policy gradient methods (Yun et al., 2017) and Q-learning methods (Burkov and Chaib-Draa, 2007; Song et al., 2012), which speed up the learning and level up the performance (Finn et al., 2016).…”
Section: Methods To Integrate Human Knowledge
Citation type: mentioning
confidence: 99%
“…The ADL algorithm fits multiple base classifiers to the training data during training. Each training iteration involves creating a new instance of the base classifier, fitting it to the training data, and storing the trained model [26].…”
Section: Mathematical Theory Of Adaptive Decision Learner (ADL)
Citation type: mentioning
confidence: 99%
“…to get desirable results [1,13,22]. Burkov and Chaib-draa [5] recently reported that mutual cooperation in PD games was realized just by using past action sequences as states of Q-learning.…”
Section: Related Work
Citation type: mentioning
confidence: 99%