2017
DOI: 10.1016/j.ifacol.2017.08.802
Learning Suboptimal Broadcasting Intervals in Multi-Agent Systems

Cited by 4 publications (25 citation statements)
References 7 publications
“…The sufficient conditions in [10, Section 4.1] typically entail the existence of a directed spanning tree (see [2, 22]), but one can envision less challenging/appealing MAS scenarios in which this directed spanning tree requirement is superfluous. In addition, since we are not interested exclusively in a single multi‐agent differential graphical game, [10, Section 4.1] imposes no requirements on graph edge weights found in [36]. Accordingly, subsets of our agents can effectively be involved in distinct multi‐agent differential graphical games.…”
Section: Methods
confidence: 99%
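The directed spanning tree condition discussed in the quote above can be checked directly: a directed graph admits a directed spanning tree if and only if at least one root node can reach every other node. A minimal sketch of that check (hypothetical adjacency-list representation and plain BFS, not code from the cited paper):

```python
from collections import deque

def has_directed_spanning_tree(adj):
    """Return True iff some node reaches all others in the directed graph.

    `adj` maps a node to the list of its out-neighbors; nodes that only
    appear as targets are included automatically.
    """
    nodes = set(adj) | {v for targets in adj.values() for v in targets}
    for root in nodes:
        # BFS from `root`; if it covers every node, a spanning tree rooted
        # at `root` exists.
        seen = {root}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        if seen == nodes:
            return True
    return False
```

For example, the chain `0 -> 1 -> 2` has a spanning tree rooted at node 0, while the graph `{0 -> 1, 2 -> 1}` has no node that reaches both of the others.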
“…Remark 5. The benefits of LSPI include implementation simplicity due to linear-in-parameter approximators, which in turn strips the algorithm down to a set of linear matrix equations (10). Accordingly, the principal numerical concern lies in the need for matrix inversion, which is a well-understood nuisance circumvented, among others, via the LU factorization [51].…”
Section: Learning Near-optimal Broadcasting Intervals
confidence: 99%
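The remark's point about sidestepping explicit matrix inversion can be illustrated with a small sketch. Here `np.linalg.solve` stands in for an LU-based solve (the actual linear matrix equations (10) of the cited work are not reproduced; the matrix below is an arbitrary well-conditioned example):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# Arbitrary well-conditioned system A x = b (illustrative only).
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)

# Preferred: LU-based solve, no explicit inverse is ever formed.
x_solve = np.linalg.solve(A, b)

# Discouraged: forming the explicit inverse, then multiplying.
x_inv = np.linalg.inv(A) @ b

# Both give the same solution here, but the solve is cheaper and
# numerically more robust than explicit inversion.
assert np.allclose(x_solve, x_inv)
```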
“…These works typically consider offline algorithms with perfect model knowledge and state information so that closed-loop stability during learning is not a relevant concern. Somewhat reversed approaches, in which the authors start off with robust stability (e.g., Lp-stability) and employ RL towards (sub)optimality, are also of interest (Anderson et al., 2007; Friedrich & Buss, 2017; Kretchmar et al., 2001; Tolić & Palunko, 2017). Such approaches follow the control philosophy of trading off optimality for stability.…”
Section: Stability Considerations For Reinforcement Learning
confidence: 99%