2021
DOI: 10.48550/arxiv.2109.15175
Preprint

Coordinated Reinforcement Learning for Optimizing Mobile Networks

Abstract: Mobile networks are composed of many base stations and for each of them many parameters must be optimized to provide good services. Automatically and dynamically optimizing all these entities is challenging as they are sensitive to variations in the environment and can affect each other through interferences. Reinforcement learning (RL) algorithms are good candidates to automatically learn base station configuration strategies from incoming data but they are often hard to scale to many agents. In this work, we…

Cited by 4 publications (6 citation statements)
References 17 publications
“…The authors in Ref. [65] treat the coverage and capacity objectives as black-box functions, with no analytical formula and no gradient observations. They identified the set of Pareto optimal solutions through Bayesian optimization (BO) and the deep deterministic policy gradient algorithm (DDPG).…”
Section: Learning-based Methods (mentioning)
confidence: 99%
“…In addition to MAB, reinforcement learning is also widely used in online network optimization [64][65][66][67][68]. Dandarov et al. [64] proposed an RL approach awarded by the sum data rate normalized to the sector capacity and the number of satisfied users normalized to the potential total number of served users to address CCO problem.…”
Section: Learning-based Methods (mentioning)
confidence: 99%
“…The model takes as input a feature vector for each agent to control along with a graph representation of the mobile network. The construction of such graph is described in the previous section and has been also demonstrated in previous work [10], [11]. Each agent is represented by a node in the graph.…”
Section: Graph Q-network (mentioning)
confidence: 98%
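
As a rough illustration of the input described in the statement above (one feature vector per agent plus a graph of the network, with each agent as a node), the following is a minimal sketch of a single message-passing layer followed by a per-node Q-value head. The mean aggregation, the layer sizes, the random weights, and the names `graph_q_values` and `rand_matrix` are all assumptions made for the example; they are not taken from the cited paper.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def graph_q_values(features, adjacency, w_hidden, w_out):
    """Return one row of Q-values per node (agent).

    Assumed architecture: add self-loops, average each node's features
    with its neighbors', apply a ReLU layer, then a linear Q head.
    """
    n = len(adjacency)
    a_hat = []
    for i in range(n):
        row = [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(row)
        a_hat.append([v / total for v in row])  # row-normalize
    aggregated = matmul(a_hat, features)
    hidden = [[max(v, 0.0) for v in row] for row in matmul(aggregated, w_hidden)]
    return matmul(hidden, w_out)


rng = random.Random(0)


def rand_matrix(rows, cols):
    """Hypothetical stand-in for learned weights or observed features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Four base stations in a line: 0 - 1 - 2 - 3 (illustrative topology).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
n_agents, n_features, n_hidden, n_actions = 4, 3, 8, 5
features = rand_matrix(n_agents, n_features)
q = graph_q_values(
    features,
    adjacency,
    rand_matrix(n_features, n_hidden),
    rand_matrix(n_hidden, n_actions),
)
# Each agent picks the action with the highest Q-value in its own row.
greedy_actions = [row.index(max(row)) for row in q]
```

Because the same weights are applied at every node, a model of this shape can in principle be evaluated on graphs with a different number of base stations than it was trained on.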
“…However, they also require an ad-hoc engineering of the reward and can only control one base station at a time [10]. Other algorithms attempting to address the global network optimization problem have been proposed in previous works using coordination graphs [11]. This solution also required a heuristic to handle credit assignment between base stations by splitting individual rewards across neighbors.…”
Section: Related Work (mentioning)
confidence: 99%
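
The statement above mentions a heuristic that splits individual rewards across neighbors in a coordination graph. The exact rule is not given here, so the sketch below assumes one simple possibility: each base station divides its reward equally among its incident edges, and each edge sums the shares from its two endpoints. The function name `split_rewards` and the station labels are hypothetical.

```python
def split_rewards(rewards, edges):
    """Assign each edge an equal share of both endpoints' rewards.

    Assumes every node in `rewards` appears in at least one edge;
    an isolated node's reward would otherwise be dropped.
    """
    degree = {node: 0 for node in rewards}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {
        (u, v): rewards[u] / degree[u] + rewards[v] / degree[v]
        for u, v in edges
    }


# Three base stations in a line: bs0 - bs1 - bs2 (illustrative values).
rewards = {"bs0": 3.0, "bs1": 4.0, "bs2": 2.0}
edges = [("bs0", "bs1"), ("bs1", "bs2")]
edge_rewards = split_rewards(rewards, edges)
# bs1 has two neighbors, so it contributes 2.0 to each of its edges.
```

Under this assumed rule the edge rewards sum to the same total as the individual rewards, so no reward is created or lost by the split.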
“…Cooperative multi-agent reinforcement learning (MARL) where a team of agents learn coordinated policies optimizing global team rewards has been extensively studied in recent years [25,13], and find potential applications in a wide variety of domains like robot swarm control [15,2], coordinating autonomous drivers [26,41], network routing [38,4], etc. Although cooperative MARL problems can be framed as a centralized single-agent, with the team as that actor with the joint action space, such an approach doesn't scale well.…”
Section: Introduction (mentioning)
confidence: 99%
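
The scaling problem noted in the statement above can be made concrete with a small calculation. Assuming every agent has the same number of discrete actions (the values 5 and 10 below are arbitrary), a centralized controller must score every joint action, while per-agent value heads only score each agent's own actions.

```python
def joint_action_count(n_agents, n_actions):
    """Number of joint actions a single centralized agent must consider."""
    return n_actions ** n_agents


def factored_output_count(n_agents, n_actions):
    """Number of Q-values when each agent scores only its own actions."""
    return n_agents * n_actions


n_agents, n_actions = 10, 5
joint = joint_action_count(n_agents, n_actions)        # 5**10 = 9765625
factored = factored_output_count(n_agents, n_actions)  # 10 * 5 = 50
```

The joint count grows exponentially with the number of agents, while the factored count grows linearly.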