2016
DOI: 10.1177/0142331215581638

Reinforcement learning analysis for a minimum time balance problem

Abstract: Reinforcement learning was developed to solve complex learning control problems, where only a minimal amount of a priori knowledge exists about the system dynamics. It has also been used as a model of cognitive learning in humans and applied to systems, such as pole balancing and humanoid robots, to study embodied cognition. However, closed-form analysis of the value function learning based on a higher-order unstable test problem dynamics has been rarely considered. In this paper, firstly, a second-order, unst…

Cited by 25 publications (18 citation statements)
References 43 publications
“…However, the proposed method performs higher accuracy and precision than the method with fixed resolution 640 × 480. Furthermore, the time consumption per image of the proposed method is less than that of conventional methods (Figure 7(b)), which is 44.1% of the method with fixed resolution 640 × 480 and 28.1% of the method with fixed resolution 1280 × 960. GBVS model and dense SIFT algorithm cost the majority of the whole time, and the higher the resolution, the more time they cost.…”
Section: Experiments and Results (mentioning)
confidence: 92%
“…Therefore, such situations result in low speed of image processing. Tutsoy and Brown [7] prove that a learning/classification algorithm with optimized control policy can precisely produce the same result as the analytical solution with the amount of data reduced. Besides, processing images with high resolution for object detection and recognition is not economical, because interesting objects do not always show up in FOV.…”
Section: Introduction (mentioning)
confidence: 92%
“…A concrete test problem with a closed-form solution is a good way to evaluate the performance of algorithms in detail (Tutsoy and Brown, 2016a, b). Here, we used the existing and proposed algorithms to solve a simple problem.…”
Section: Methods (mentioning)
confidence: 99%
“…A few works [46,47] have also explored ideas from Probably Approximately Correct (PAC) learning framework to reuse previously discarded experiences intelligently to accelerate the value function learning. Recently, in [40], the authors have analyzed the rate of parameter convergence for RL algorithms in presence of unstable system dynamics and random exploration noise, thus showing the significant potential of accelerating the learning process.…”
Section: Introduction (mentioning)
confidence: 99%