2022
DOI: 10.3389/fnbot.2022.1012427
An immediate-return reinforcement learning for the atypical Markov decision processes

Abstract: Atypical Markov decision processes (MDPs) are decision-making problems in which the goal is to maximize the immediate return within a single state transition. Many complex dynamic problems can be regarded as atypical MDPs, e.g., football trajectory control, approximation of compound Poincaré maps, and parameter identification. However, existing deep reinforcement learning (RL) algorithms are designed to maximize long-term returns, wasting computing resources when applied to atypical MDPs. These existing algorith…
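The abstract's notion of an "atypical" MDP — one state transition per episode, so only the immediate reward matters — can be illustrated with a contextual-bandit-style estimator. This is a minimal sketch under that reading, not the paper's algorithm; the environment, reward function, and all names here are hypothetical.

```python
import random

# One-transition decision process: the agent observes a state, picks one
# action, receives one immediate reward, and the episode ends. With no
# further transitions there is nothing to discount or bootstrap, so a
# running mean of the immediate return per (state, action) suffices.

N_STATES, N_ACTIONS = 4, 3
q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]   # immediate-return estimates
counts = [[0] * N_ACTIONS for _ in range(N_STATES)]

def true_reward(state, action):
    # Hypothetical environment: reward depends only on (state, action).
    return 1.0 if action == state % N_ACTIONS else 0.0

def step(state, epsilon=0.1):
    # Epsilon-greedy choice over the estimated immediate returns.
    if random.random() < epsilon:
        action = random.randrange(N_ACTIONS)
    else:
        action = max(range(N_ACTIONS), key=lambda a: q[state][a])
    r = true_reward(state, action)
    counts[state][action] += 1
    # Incremental mean of the immediate reward -- no long-term return.
    q[state][action] += (r - q[state][action]) / counts[state][action]
    return r

random.seed(0)
for episode in range(2000):
    step(random.randrange(N_STATES))

best = [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]
print(best)
```

The point of the sketch is the contrast the abstract draws: a general deep RL algorithm would carry discounting and bootstrapping machinery that this one-transition setting never uses.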

Cited by 1 publication (1 citation statement)
References 25 publications
“…The basic concept of DRL is reinforcement learning (RL), which is part of machine learning. RL is based on the Markov decision process (MDP), which is a decision-making model in random situations represented by agents as decision-makers [30]. There are five elements in the MDP framework: agent, environment, state (S), action (A), and reward (R).…”
Section: Deep Reinforcement Learning (DRL)
confidence: 99%
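The citing statement names the five MDP elements: agent, environment, state (S), action (A), and reward (R). A minimal sketch of how these interact in one episode is shown below; the chain environment, the random policy, and all identifiers are illustrative assumptions, not taken from the cited work.

```python
import random

class Environment:
    """Hypothetical 1-D chain: states 0..4, episode ends at state 4."""
    def __init__(self):
        self.state = 0
    def step(self, action):
        # action is -1 (left) or +1 (right); clamp to the chain ends.
        self.state = max(0, min(4, self.state + action))
        reward = 1.0 if self.state == 4 else 0.0   # R: terminal reward only
        done = self.state == 4
        return self.state, reward, done

class Agent:
    def act(self, state):
        # Random policy, purely for illustrating the decision loop.
        return random.choice([-1, 1])

random.seed(1)
env, agent = Environment(), Agent()
state, total, done = env.state, 0.0, False
while not done:
    action = agent.act(state)               # A: chosen by the agent from S
    state, reward, done = env.step(action)  # environment returns next S and R
    total += reward
print(total)
```

Since the only nonzero reward is at the terminal state, every completed episode of this toy chain accumulates a total return of 1.0.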