DOI: 10.26481/dis.20121217mk

Learning against learning: evolutionary dynamics of reinforcement learning algorithms in strategic interactions

Abstract: People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.
• The final author version and the galley proof are versions of the publication after peer review.
• The final published version features the final layout of the paper including the volume, issue and page numbers.
Link to publication
General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors a…

Cited by 10 publications (5 citation statements)
References 80 publications
“…It is concluded from (44) and (45) that, when Similarly, for Agent u,v,2, based on [44], [46], we can derive…”
Section: Theorem (mentioning)
confidence: 96%
“…However, the factors affecting the converged bidding price are implicitly indicated therein. In [5], the numerical connection between EGT with RDEs and some baseline MARL algorithms is proved, implying that EGT with RDEs can explicitly reveal the factors affecting the converged result in MARL. Thus, in this letter, the correlation between WoLF-PHC and EGT is investigated and adopted to analyse the learning dynamics.…”
Section: Introduction (mentioning)
confidence: 93%
“…For EPs, the state refers to [𝜆 , , , 𝑃 , , ], the action is [ 𝑃 , , 𝜆 , ]. The EGT with RDEs, instead, presents the change of probability of multiple "players" selecting different "strategies", and these players will imitate the strategy of those who obtain the largest "payoff" [5]. Empirically, the strategy can be considered as the principle of selecting actions.…”
Section: A. Connections Between MARL and EGT (mentioning)
confidence: 99%