2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC)
DOI: 10.1109/spawc.2018.8445899
Optimal Dynamic Proactive Caching Via Reinforcement Learning

Cited by 12 publications (10 citation statements, all of type “mentioning”; citing years 2019-2022)
References 12 publications
“…To model the time-varying nature of task popularity, the popularity profile $\pi_t$ is modeled by a $V$-state Markov chain [22], represented by $V$ different popularity profiles $\pi^{(1)}, \dots, \pi^{(V)}$. Each profile is modeled by a Zipf distribution with parameter $\eta_v$.…”
Section: Task Model (mentioning)
confidence: 99%
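The Markov-modulated Zipf popularity model quoted above can be illustrated with a short simulation. This is a minimal sketch, not code from the paper: the catalog size F, the number of states V, the Zipf exponents, and the transition matrix below are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

F = 100                      # catalog size (assumed)
V = 3                        # number of popularity states (assumed)
etas = [0.6, 1.0, 1.4]       # Zipf exponent eta_v per state (assumed)

# One Zipf popularity profile over the F contents for each Markov state.
profiles = []
for eta in etas:
    w = 1.0 / np.arange(1, F + 1) ** eta
    profiles.append(w / w.sum())

# V-state Markov chain over the profiles (sticky transition matrix, assumed).
P = np.full((V, V), 0.1 / (V - 1))
np.fill_diagonal(P, 0.9)

state = 0
for t in range(5):
    pi_t = profiles[state]                        # current popularity profile
    requests = rng.choice(F, size=1000, p=pi_t)   # sample content requests
    print(f"t={t} state={state} "
          f"share of most popular content: {np.mean(requests == 0):.3f}")
    state = rng.choice(V, p=P[state])             # Markov transition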
“…The objective function in P1 is modified using Eqs. (17) and (22) under the RL framework. The constraints in P1 are used during training of the DRL framework to determine whether to adopt or discard a given learning result.…”
Section: DRL-Based Solution With DDPG (mentioning)
confidence: 99%
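The adopt-or-discard step described in this citing work can be read as a feasibility check applied to each action proposed by the DDPG actor. The sketch below is a hypothetical illustration only: the actual constraints of P1 and the citing paper's DDPG architecture are not reproduced here, and the cache-capacity constraint and fallback rule are assumptions.

import numpy as np

def feasible(action, cache_capacity):
    """Stand-in for the constraints of P1 (assumed: fractional caching
    decisions in [0, 1] that must fit within the cache capacity)."""
    return (np.all(action >= 0) and np.all(action <= 1)
            and action.sum() <= cache_capacity)

def adopt_or_discard(actor_action, fallback_action, cache_capacity):
    """Adopt the learned action only if it satisfies the constraints;
    otherwise discard it and use a known-feasible fallback."""
    if feasible(actor_action, cache_capacity):
        return actor_action
    return fallback_action

# Example: a proposed caching vector over 5 contents with capacity 2.
proposed = np.array([0.9, 0.8, 0.7, 0.1, 0.0])   # sums to 2.5 -> infeasible
safe = np.array([0.8, 0.8, 0.4, 0.0, 0.0])       # sums to 2.0 -> feasible
print(adopt_or_discard(proposed, safe, cache_capacity=2.0))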
“…Denoting the long-term popularity of the content as $p := \mathbb{E}[r_t]$, using the expressions for the optimal actions in (13a)-(13d), and leveraging the independence among $r_t$, $\lambda_t$, and $\rho_t$, the expected cost-to-go function can be readily derived as in (14)-(16). The expectation in (14) is w.r.t.…”
Section: Value Function In Closed Form (mentioning)
confidence: 99%
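As a generic illustration of the independence argument quoted above (the paper's actual cost and Eqs. (13)-(16) are not reproduced here), when a per-stage cost is multiplicatively separable in the mutually independent variables $r_t$, $\lambda_t$, and $\rho_t$, its expectation factors into a product of marginal expectations:

\[
\mathbb{E}\!\left[f(r_t)\,g(\lambda_t)\,h(\rho_t)\right]
= \mathbb{E}[f(r_t)]\;\mathbb{E}[g(\lambda_t)]\;\mathbb{E}[h(\rho_t)],
\qquad\text{so, e.g., }\ \mathbb{E}[r_t\,\lambda_t] = p\,\mathbb{E}[\lambda_t],
\]

with $p = \mathbb{E}[r_t]$ as defined in the citation statement. This is why a closed-form expected cost-to-go can be written in terms of $p$ and the marginal statistics of $\lambda_t$ and $\rho_t$ alone.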