2016
DOI: 10.1155/2016/4824072

Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning

Abstract: To improve the convergence rate and sample efficiency, two efficient learning methods, AC-HMLP and RAC-HMLP (AC-HMLP with ℓ2-regularization), are proposed by combining the actor-critic algorithm with hierarchical model learning and planning. The hierarchical models, consisting of the local and the global models, are learned at the same time as the value function and the policy, and are approximated by local linear regression (LLR) and linear function approximation (LFA), respectively. Both the …

citations
Cited by 5 publications
(1 citation statement)
references
References 26 publications
“…In this process, VM is an agent and server is the environment, where VM takes an action by interacting with the server at each cycle. MDP can be represented as four-tuple (S, A, P, R), and these are described as follows [22]: S representing as state space: s_t ∈ S denotes the state of the server at time period t.…”
Section: Makespan Time
Type: mentioning
confidence: 99%
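
The four-tuple (S, A, P, R) in the statement above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and comes from neither the cited paper nor the citing one: the `MDP` class, the `step` helper, the two server states (`idle`, `busy`), the two actions (`assign`, `wait`), and the transition and reward values are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable


@dataclass(frozen=True)
class MDP:
    """Hypothetical container for the four-tuple (S, A, P, R)."""
    states: Sequence[State]                                     # S: state space
    actions: Sequence[Action]                                   # A: action space
    transition: Callable[[State, Action], Dict[State, float]]   # P(s' | s, a)
    reward: Callable[[State, Action, State], float]             # R(s, a, s')


def step(mdp: MDP, state: State, action: Action,
         rng: random.Random) -> Tuple[State, float]:
    """Sample the next state from P and return it with the reward from R."""
    probs = mdp.transition(state, action)
    candidates = list(probs)
    weights = [probs[s] for s in candidates]
    next_state = rng.choices(candidates, weights=weights, k=1)[0]
    return next_state, mdp.reward(state, action, next_state)


# Toy, deterministic dynamics (assumed values, for illustration only).
def toy_transition(state: State, action: Action) -> Dict[State, float]:
    # Assigning work makes the server busy; waiting leaves it idle.
    return {"busy": 1.0} if action == "assign" else {"idle": 1.0}


def toy_reward(state: State, action: Action, next_state: State) -> float:
    # Reward only for assigning work to an idle server.
    return 1.0 if (state == "idle" and action == "assign") else 0.0


toy_mdp = MDP(
    states=("idle", "busy"),
    actions=("assign", "wait"),
    transition=toy_transition,
    reward=toy_reward,
)
```

Calling `step(toy_mdp, "idle", "assign", random.Random(0))` samples one agent-environment interaction at a single time period t. Passing the random generator explicitly keeps the sampling reproducible.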