2019
DOI: 10.1109/twc.2019.2919611
Reinforcement Learning for Self Organization and Power Control of Two-Tier Heterogeneous Networks

Abstract: Self-organizing networks (SONs) can help manage the severe interference in dense heterogeneous networks (HetNets). Given their need to automatically configure power and other settings, machine learning is a promising tool for data-driven decision making in SONs. In this paper, a HetNet is modeled as a dense two-tier network with conventional macrocells overlaid with denser small cells (e.g., femto or pico cells). First, a distributed framework based on a multi-agent Markov decision process is proposed that models…

Cited by 76 publications (99 citation statements)
References 39 publications
“…where $r_{t,m} = r_m(s_t, a_t)$. When the kth SU selects the mth channel and $\mathrm{CS}_m = 0$, the kth SU should update all of its Q-values according to (9) at an information-exchange frame. At a general frame, the update method of the SUs is the same as in the independent Q-learning method.…”
Section: Distributed Collaborative Q-Learning-Based Subcarrier Assignment
mentioning
confidence: 99%
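The statement above distinguishes two update modes in the distributed collaborative Q-learning scheme: at a general frame, each SU performs an independent Q-learning update for the channel it actually selected, while at an information-exchange frame it refreshes the Q-values of all channels using the exchanged per-channel rewards $r_{t,m}$. The following is a minimal Python sketch of those two modes; the tabular Q layout, the function names, and the hyperparameters are illustrative assumptions, not definitions from the citing paper.

```python
import numpy as np

# Minimal sketch, assuming a tabular Q[s, a] per secondary user (SU),
# where actions index the M candidate channels. Names and hyperparameters
# are assumptions for illustration.

def general_frame_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Independent Q-learning update at a general frame: only the
    Q-value of the channel actually selected is refreshed."""
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

def exchange_frame_update(Q, s, rewards, s_next, alpha=0.1, gamma=0.9):
    """At an information-exchange frame, the SU updates the Q-values
    of all M channels using the exchanged per-channel rewards r_{t,m}."""
    for m, r_m in enumerate(rewards):
        Q[s, m] += alpha * (r_m + gamma * np.max(Q[s_next]) - Q[s, m])
```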
“…Since it does not need to model the environment, Q-learning has been used in conventional networks to solve resource allocation problems, including power control and subcarrier assignment. In [9], Q-learning was applied to power optimization in heterogeneous wireless networks, while the authors of [10] proposed a Q-learning-based power scheme for a dynamic NOMA transmission game and used the Dyna architecture and hotbooting techniques to achieve a faster learning speed. To solve the channel assignment problem, schemes based on Q-learning for spectrum access in cellular networks were proposed in [11] and [12].…”
mentioning
confidence: 99%
“…In Lashgari et al., a pricing-based power allocation scheme was proposed to mitigate cross-tier interference and enhance energy efficiency in HCNs. In order to maximize the throughput of small cells, the joint subchannel and power allocation problem was fully addressed in the literature, with consideration of cluster-based interference mitigation, limited local channel state information (CSI), and self-organizing mechanisms. Different from the previous works, Huang et al. jointly considered user association and power control in HCNs by formulating a utility-energy maximization problem.…”
Section: Introduction
mentioning
confidence: 99%
“…As AI technologies have been developing rapidly in recent years, learning methods have been introduced to solve complicated optimization problems. As shown in [4, 5, 6, 7, 8, 9], model-free RL methods can be an efficient way to solve the energy-efficiency optimization problem of HetNets, since a precise model of the process is not necessary. In [4, 5], the Actor–Critic (AC) algorithm was applied to optimize the energy efficiency of HetNets, but the authors did not conduct in-depth research on the selection of basis functions, which is challenging for the application of RL.…”
Section: Introduction
mentioning
confidence: 99%
“…In [4, 5], the Actor–Critic (AC) algorithm was applied to optimize the energy efficiency of HetNets, but the authors did not conduct in-depth research on the selection of basis functions, which is challenging for the application of RL. Roohollah et al. [6] introduced a Q-learning (QL) based distributed power allocation algorithm (Q-DPA) as a self-organizing mechanism to solve the power optimization problem in the network. In [7], a QL-based method was proposed to solve the energy-efficiency and delay problems of smart-grid data transmission in HetNets, in which, however, the dimension of the action and state spaces was too large.…”
Section: Introduction
mentioning
confidence: 99%
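Several of the statements above describe tabular Q-learning applied to distributed power allocation, such as the Q-DPA scheme attributed to [6]. The sketch below illustrates what such a self-organizing agent looks like in Python, assuming a discretized local state, a finite set of candidate power levels, and epsilon-greedy exploration; these design details are assumptions for illustration, not the definitions used in the cited papers.

```python
import numpy as np

# Hedged sketch of a Q-learning power-control agent in the spirit of the
# Q-DPA scheme discussed above. The discretization, reward interface, and
# hyperparameters below are illustrative assumptions.

POWER_LEVELS = np.linspace(0.0, 1.0, 8)  # candidate transmit powers (W), assumed
N_STATES = 16                            # size of a discretized local state space, assumed

class PowerControlAgent:
    """One small-cell agent; each agent learns from local observations only."""

    def __init__(self, alpha=0.1, gamma=0.9, eps=0.1):
        self.Q = np.zeros((N_STATES, len(POWER_LEVELS)))
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def select_power(self, s):
        # epsilon-greedy exploration over the discrete power levels
        if np.random.rand() < self.eps:
            a = np.random.randint(len(POWER_LEVELS))
        else:
            a = int(np.argmax(self.Q[s]))
        return a, POWER_LEVELS[a]

    def update(self, s, a, r, s_next):
        # standard tabular Q-learning update from a locally measured reward
        td = r + self.gamma * np.max(self.Q[s_next]) - self.Q[s, a]
        self.Q[s, a] += self.alpha * td
```

Because each agent keeps its own Q-table and reward signal, the scheme scales with the number of small cells without central coordination, which is the self-organizing property the citing papers highlight.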