2020
DOI: 10.1109/lcomm.2020.2990308
Bayesian Reinforcement Learning for Link-Level Throughput Maximization

Cited by 6 publications (4 citation statements); references 4 publications.
“…The obtained action assigns a subset of users to be served by the HAPS and the remaining ones are served by the TBS. We use (18) to obtain the reward value R(s_t, a_t). Afterward, the time-varying channels evolve based on (28).…”
Section: B. Our Proposed DSRL Method (mentioning)
confidence: 99%
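To make the interaction step quoted above concrete, here is a minimal Python sketch of a user-association step of this kind: a binary action assigns each user to the HAPS or the TBS, a simple sum-rate stands in for the reward R(s_t, a_t) of the citing paper's eq. (18) (not reproduced here), and a first-order Gauss-Markov update stands in for its channel evolution eq. (28). All names, the number of users, the correlation coefficient, and both placeholder models are assumptions for illustration, not the paper's formulation.

# Hypothetical sketch of one environment step: binary user association
# between a HAPS and a TBS, with placeholder reward and channel models.
import numpy as np

NUM_USERS = 8
RHO = 0.9  # assumed Gauss-Markov channel correlation (illustrative only)


def step(channels_haps, channels_tbs, action):
    """One interaction step: `action` is a binary vector assigning each
    user to the HAPS (1) or to the TBS (0)."""
    # Placeholder reward: sum-rate over users on their assigned link
    # (stands in for the citing paper's reward R(s_t, a_t), its eq. (18)).
    gains = np.where(action == 1,
                     np.abs(channels_haps) ** 2,
                     np.abs(channels_tbs) ** 2)
    reward = np.sum(np.log2(1.0 + gains))

    # Placeholder time-varying channel evolution (stands in for eq. (28)):
    # first-order Gauss-Markov update of the complex channel coefficients.
    def evolve(h):
        noise = (np.random.randn(NUM_USERS)
                 + 1j * np.random.randn(NUM_USERS)) / np.sqrt(2)
        return RHO * h + np.sqrt(1 - RHO ** 2) * noise

    return reward, evolve(channels_haps), evolve(channels_tbs)


# Example usage: one step with a random HAPS/TBS assignment.
h_haps = (np.random.randn(NUM_USERS) + 1j * np.random.randn(NUM_USERS)) / np.sqrt(2)
h_tbs = (np.random.randn(NUM_USERS) + 1j * np.random.randn(NUM_USERS)) / np.sqrt(2)
a = np.random.randint(0, 2, NUM_USERS)
r, h_haps, h_tbs = step(h_haps, h_tbs, a)
print(f"reward: {r:.2f}")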
“…Further, the authors perform user association to maximize the total throughput of the network while avoiding frequent handoffs resulting from the mobility of airborne vehicles. The authors of [18] studied DRL-based user association in a VHetNet with an emphasis on the role of satellites. In [19], the authors study minimizing the age of information (AoI) in an intelligent transportation system (ITS), where UAVs collect the information produced by sensors on the vehicles to provide up-to-date data.…”
Section: Related Work (mentioning)
confidence: 99%
“…In [7], the authors proposed an interference coordination method for an integrated HAPS-terrestrial network that considered traffic load distribution between HAPS and terrestrial network. In [8], the authors developed a user association scheme for an integrated HAPS-terrestrial network based on the DQL approach considering delayed channel state information (CSI). In [9], we developed an iterative algorithm to design the subcarrier and transmit power allocation to user equipment (UEs) to handle the interference in vHetNets.…”
Section: Introduction (mentioning)
confidence: 99%
“…While receiving a scalar reward signal, a reinforcement learning agent interacts with the environment dynamically and adjusts its policy based on the collected data to maximize the expected return. Researchers have applied this paradigm to various scenarios, including robotics Kober et al (2013); Donepudi (2020), man-machine games Silver et al (2016), instructional systems Iglesias et al (2009), recommendation systems Zhao et al (2018); Afsar et al (2021), resource management Mao et al (2016); Wang and Pedram (2016); Liessner et al (2019); Khoshkbari et al (2020a), medical diagnosis Yu et al (2021), and intelligent transportation Tang et al (2019); Haydari and Yilmaz (2022). As a result, solving decision-making problems in a variety of areas drives the study and implementation of reinforcement learning theories.…”
Section: Introduction (mentioning)
confidence: 99%
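As a concrete illustration of the agent-environment loop described in this quote (an agent receives a scalar reward and adjusts its policy to maximize the expected return), the following is a minimal tabular Q-learning sketch on a toy two-state problem. Every name, number, and the toy dynamics are illustrative assumptions and are not taken from the cited works.

# Hypothetical sketch: tabular Q-learning on a toy 2-state, 2-action MDP.
import numpy as np

N_STATES, N_ACTIONS = 2, 2
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

# Toy dynamics: action 1 taken in state 1 pays off, everything else pays little.
REWARD = np.array([[0.0, 0.2],
                   [0.1, 1.0]])

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))
state = 0
for _ in range(5000):
    # Epsilon-greedy policy: mostly exploit the current value estimates.
    action = rng.integers(N_ACTIONS) if rng.random() < EPS else int(np.argmax(Q[state]))
    reward = REWARD[state, action]
    next_state = action  # taking action a moves the agent to state a
    # Temporal-difference update toward the scalar reward plus the bootstrapped return.
    Q[state, action] += ALPHA * (reward + GAMMA * np.max(Q[next_state]) - Q[state, action])
    state = next_state

print(np.round(Q, 2))  # the learned values favor reaching state 1 and taking action 1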