2011
DOI: 10.1007/s10462-011-9224-z

New prioritized value iteration for Markov decision processes

Abstract: The problem of solving large Markov decision processes accurately and quickly is challenging. Since the computational effort incurred is considerable, current research focuses on finding superior acceleration techniques. For instance, the convergence properties of current solution methods depend, to a great extent, on the order of backup operations. On one hand, algorithms such as topological sorting are able to find good orderings but their overhead is usually high. On the other hand, shortest path m…
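To illustrate why the order of backup operations matters, here is a minimal Python sketch of a generic residual-ordered value iteration, in which the state with the largest Bellman error is backed up first. It is not the algorithm proposed in the paper, whose details are not given in the truncated abstract. The function names, the dictionary layout of `P` and `R`, and the toy chain problem are all assumptions made for illustration.

```python
import heapq


def bellman_backup(s, V, P, R, gamma):
    """One-step lookahead value of state s (0.0 for a terminal state)."""
    if not P[s]:
        return 0.0
    return max(
        sum(p * (R[s][a] + gamma * V[t]) for t, p in P[s][a].items())
        for a in P[s]
    )


def prioritized_value_iteration(P, R, gamma=0.95, eps=1e-8, max_backups=100_000):
    """Hypothetical sketch: back up states in order of decreasing Bellman error.

    P[s][a] maps successor states to probabilities; R[s][a] is the reward
    for taking action a in state s. A state with no actions is terminal.
    """
    V = {s: 0.0 for s in P}

    # Predecessor sets: which states can reach t in one step.
    preds = {s: set() for s in P}
    for s in P:
        for a in P[s]:
            for t in P[s][a]:
                preds[t].add(s)

    # Max-heap on the Bellman error (heapq is a min-heap, so negate).
    heap = []
    for s in P:
        err = abs(bellman_backup(s, V, P, R, gamma) - V[s])
        if err > eps:
            heapq.heappush(heap, (-err, s))

    backups = 0
    while heap and backups < max_backups:
        _, s = heapq.heappop(heap)
        new_v = bellman_backup(s, V, P, R, gamma)
        if abs(new_v - V[s]) <= eps:
            continue  # stale queue entry, nothing left to fix
        V[s] = new_v
        backups += 1
        # Only predecessors of s can have had their error changed.
        for q in preds[s]:
            err = abs(bellman_backup(q, V, P, R, gamma) - V[q])
            if err > eps:
                heapq.heappush(heap, (-err, q))
    return V, backups


# Toy deterministic chain 0 -> 1 -> 2 -> 3, where 3 is terminal and
# only the last step pays a reward.
P = {0: {"go": {1: 1.0}}, 1: {"go": {2: 1.0}}, 2: {"go": {3: 1.0}}, 3: {}}
R = {0: {"go": 0.0}, 1: {"go": 0.0}, 2: {"go": 1.0}, 3: {}}
V, backups = prioritized_value_iteration(P, R, gamma=0.5)
```

On this chain the queue propagates the reward backwards from the goal, so each non-terminal state is backed up once. A fixed sweep in the order 0, 1, 2 would need several passes to reach the same values.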

Cited by 6 publications (3 citation statements)
References 20 publications
“…Euler models Scientific computation [104] Statistical methods [105] [ 106,154] Graph theory Automata [124] Graph/complex network analysis [37,[108][109][110][111][112][113][114][115][116][117] Engineering methods…”
Section: Computational Science (mentioning)
confidence: 99%
“…This approach applies action regulation and a new state-prioritization scheme within the value iteration algorithm. The proposed approach was named IPVI (Improved Prioritized Value Iteration) [Garcia-Hernandez, 2012a]. To verify the robustness of the proposed approach, this chapter presents its implementation in the robotic motion planning simulator (SPRM) of Reyes et al. [Reyes, 2006b], on a complex stochastic shortest-path task, which is described next.…”
Section: Preparing the Test Environment (unclassified)
“…− the classic value iteration algorithm (VI) [Puterman, 1994], − the value iteration algorithm with action regulation (ARVI) [Garcia-Hernandez, 2009 [Dibangoye, 2008], − the algorithm proposed in this thesis (IPVI) [Garcia-Hernandez, 2012a].…”
Section: Chapter Conclusions (unclassified)