2021
DOI: 10.3390/make3030029

Recent Advances in Deep Reinforcement Learning Applications for Solving Partially Observable Markov Decision Processes (POMDP) Problems: Part 1—Fundamentals and Applications in Games, Robotics and Natural Language Processing

Abstract: The first part of a two-part series of papers provides a survey on recent advances in Deep Reinforcement Learning (DRL) applications for solving partially observable Markov decision processes (POMDP) problems. Reinforcement Learning (RL) is an approach to simulate the human’s natural learning process, whose key is to let the agent learn by interacting with the stochastic environment. The fact that the agent has limited access to the information of the environment enables AI to be applied efficiently in most fi…

Citations: cited by 41 publications (19 citation statements)
References: 114 publications
“…Table 6 shows that the average score for volume B of the translating ability exam is 26.70 and the average score for volume B is 25.14. This shows that after three months of translation education experiment, the average value of experimental translation ability is 1.56 points higher, with little difference.…”
Section: Results (mentioning)
confidence: 91%
“…The experiment shows that POA is a typical application based on design research, and that it can be extended to other educational contexts and to beginners by expanding its scope, geographical area, and target audience. [14] reflects on the motivating, enabling, and evaluating aspects of POA. In contrast, [15] argues that the design of the three components of POA promotes students' curiosity and weakens teachers' role of "scaffolding," which has a positive effect on students' independent learning.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although it is a simple method, it can cause some impact on the performance by incorrectly processing nouns due to the lack of rigor in the processing of noun predicates. Statistically based, various features are set in the CoNLL2008 corpus, such as Xiang and Foo using the same qualities in predicate recognition and word sense credit and getting better results [20]. For example, Reisman et al. proposed more and more valuable features to expand further the feature set of predicate recognition, which led to a significant improvement in the performance of predicate recognition [21].…”
Section: Related Work (mentioning)
confidence: 99%
“…The Google DeepMind institute proposed deep Q-learning (DQN) [43]. It uses a deep neural network (DNN) instead of the original Q-value table to approximate the Q-function and trains it through the squared error:…”
Section: Intelligent Routing Algorithm Based On (mentioning)
confidence: 99%
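
The quoted statement breaks off before its formula, so the citing paper's exact expression is not reproduced here. As a rough illustration of the general idea only, the sketch below computes the squared temporal-difference error commonly associated with DQN: the squared gap between Q(s, a) and the bootstrapped target r + γ·max Q(s', a'). The function names, the transition layout, and the discount value are assumptions made for this example, and plain Python lists stand in for the outputs of a neural network.

```python
# Minimal sketch of a squared TD-error loss in the style of DQN.
# All names here are hypothetical; Q-values are plain lists standing in
# for the outputs of a deep network.

def td_target(reward, next_q_values, gamma, done):
    """Bootstrapped target: r + gamma * max_a' Q(s', a').

    On terminal transitions there is no next state, so the target is r.
    In DQN the next-state values usually come from a separate,
    periodically updated target network.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(q_values, action, target):
    """Squared difference between the target and Q(s, a) for the taken action."""
    diff = target - q_values[action]
    return diff * diff


def mean_squared_td_loss(batch, gamma=0.99):
    """Average squared TD error over a batch of transitions.

    Each transition is a tuple:
    (q_values, action, reward, next_q_values, done).
    """
    total = 0.0
    for q_values, action, reward, next_q_values, done in batch:
        target = td_target(reward, next_q_values, gamma, done)
        total += squared_td_error(q_values, action, target)
    return total / len(batch)


if __name__ == "__main__":
    batch = [
        ([1.0, 2.0], 1, 1.0, [0.5, 3.0], False),  # non-terminal transition
        ([1.0, 0.0], 0, 1.0, [0.0, 0.0], True),   # terminal transition
    ]
    print(mean_squared_td_loss(batch, gamma=0.5))
```

In a full training loop this scalar would be minimized by gradient descent on the network parameters, with the target treated as a constant during each update.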