2019 IEEE Global Communications Conference (GLOBECOM)
DOI: 10.1109/globecom38437.2019.9013134
Distributed Multi-Hop Traffic Engineering via Stochastic Policy Gradient Reinforcement Learning

Cited by 6 publications (4 citation statements)
References 11 publications
“…In [10], the authors presented a distributed multi-agent TE solution to learn the optimal routing policy for multi-path routing. The authors formulated the multi-path routing problem as a multi-agent Markov decision process and proposed a multi-agent actor-critic technique in which each router functions as a local actor and critic.…”
Section: Minimize E2E Delay and Efficiently Load Balancing (mentioning)
confidence: 99%
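The statement above summarizes the paper's per-router actor-critic design. The following is a minimal sketch of that idea, assuming a softmax actor over each router's next hops and a scalar critic updated from a local delay-based reward; the class name, hyperparameters, and reward signal are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class RouterAgent:
    """Each router keeps a local actor (softmax policy over its next hops)
    and a local critic (an estimate of the value of its current state)."""
    def __init__(self, n_next_hops, alpha_actor=0.01, alpha_critic=0.05, gamma=0.99):
        self.theta = np.zeros(n_next_hops)  # actor parameters, one per next hop
        self.v = 0.0                        # critic: local state-value estimate
        self.alpha_actor = alpha_actor
        self.alpha_critic = alpha_critic
        self.gamma = gamma

    def policy(self):
        z = np.exp(self.theta - self.theta.max())
        return z / z.sum()

    def choose_next_hop(self):
        # Sample a next hop from the learned probability distribution.
        return np.random.choice(len(self.theta), p=self.policy())

    def update(self, action, reward, next_value):
        # reward: e.g. negative per-hop delay observed locally (assumption);
        # next_value: the downstream router's critic estimate.
        td_error = reward + self.gamma * next_value - self.v
        self.v += self.alpha_critic * td_error                 # critic step
        grad_log_pi = -self.policy()
        grad_log_pi[action] += 1.0                             # d log pi(a) / d theta
        self.theta += self.alpha_actor * td_error * grad_log_pi  # actor step
```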
“…QFLOW [25], LEARNET [26], multi-hop routing [27], SDN RL [28], TCP-DRINC [29]: QFLOW [25] is a platform for RL-based edge network configuration that uses queuing, learning, and scheduling to meet the quality-of-experience requirements of video streaming applications.…”
Section: Network Traffic Control (mentioning)
confidence: 99%
“…LEARNET [26] makes use of RL for flow control in time-sensitive deterministic networks. In multi-hop routing [27], a distributed model-free solution based on stochastic policy gradient RL was proposed, which aims to minimize the E2E delay by allowing each router to send a packet to the next-hop router according to the learned optimal probability.…”
Section: Network Traffic Control (mentioning)
confidence: 99%
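The citing paper describes [27] as learning, per router, a probability vector over next hops and adjusting it to reduce end-to-end delay. Below is a minimal sketch of that stochastic policy-gradient idea, assuming a softmax parameterization and a REINFORCE-style update driven by the negative E2E delay of a delivered packet; the function names, step size, and delay bookkeeping are assumptions for illustration only.

```python
import numpy as np

def softmax(theta):
    z = np.exp(theta - theta.max())
    return z / z.sum()

def route_and_learn(routers, path_of_routers, hop_delays, lr=0.01):
    """routers: dict mapping router_id -> parameter vector over its next hops.
    path_of_routers: [(router_id, chosen_next_hop_index), ...] for one packet.
    hop_delays: per-hop delays observed while forwarding that packet."""
    e2e_delay = sum(hop_delays)
    reward = -e2e_delay  # smaller end-to-end delay => larger reward
    for router_id, action in path_of_routers:
        theta = routers[router_id]
        probs = softmax(theta)
        grad_log_pi = -probs
        grad_log_pi[action] += 1.0          # gradient of log pi(action | theta)
        # REINFORCE update at each router on the path; in practice a baseline
        # would typically be subtracted from the reward to reduce variance.
        theta += lr * reward * grad_log_pi
```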
“…The generic framework is composed of a universal agent-based interaction environment, an expansive system that pools together and supports the implementation of a variety of algorithms [163][164][165]. The universal agent-based interaction environment is devised to support various DL algorithmic techniques such as deep Q-learning and double Q-learning.…”
Section: Rt+1 (mentioning)
confidence: 99%
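Since the statement names double Q-learning as one of the supported techniques, here is a minimal tabular sketch of its update rule, which maintains two Q-tables and uses one to evaluate the greedy action selected by the other. The function signature, table layout, and hyperparameters are assumptions for illustration, not part of the cited framework.

```python
import random
import numpy as np

def double_q_update(Q_a, Q_b, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Q_a, Q_b: two Q-tables of shape (n_states, n_actions).
    With probability 0.5 update Q_a using Q_b to evaluate the greedy action
    (and vice versa), reducing the overestimation bias of plain Q-learning."""
    if random.random() < 0.5:
        best = np.argmax(Q_a[s_next])
        Q_a[s, a] += alpha * (r + gamma * Q_b[s_next, best] - Q_a[s, a])
    else:
        best = np.argmax(Q_b[s_next])
        Q_b[s, a] += alpha * (r + gamma * Q_a[s_next, best] - Q_b[s, a])
```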