This paper addresses the problem of planning under uncertainty in large Markov Decision Processes (MDPs). Factored MDPs represent a complex state space using state variables and the transition model using a dynamic Bayesian network. This representation often allows an exponential reduction in the representation size of structured MDPs, but the complexity of exact solution algorithms for such MDPs can grow exponentially in the representation size. In this paper, we present two approximate solution algorithms that exploit structure in factored MDPs. Both use an approximate value function represented as a linear combination of basis functions, where each basis function involves only a small subset of the domain variables. A key contribution of this paper is that it shows how the basic operations of both algorithms can be performed efficiently in closed form, by exploiting both additive and context-specific structure in a factored MDP. A central element of our algorithms is a novel linear program decomposition technique, analogous to variable elimination in Bayesian networks, which reduces an exponentially large LP to a provably equivalent, polynomial-sized one. One algorithm uses approximate linear programming, and the second uses approximate dynamic programming. Our dynamic programming algorithm is novel in that it uses an approximation based on max-norm, a technique that more directly minimizes the terms that appear in error bounds for approximate MDP algorithms. We provide experimental results on problems with over 10^40 states, demonstrating the scalability of our approach, and compare our algorithm to an existing state-of-the-art approach, showing, on some problems, exponential gains in computation time.
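The variable-elimination idea behind the LP decomposition can be illustrated on a toy max-sum problem. The sketch below (all factor names and numbers are our own illustrative choices, not the paper's) maximizes a sum of basis functions, each over a small subset of binary state variables, by eliminating one variable at a time instead of enumerating all 2^n states:

```python
from itertools import product

def max_sum_eliminate(factors, variables):
    """Maximize a sum of small factors by variable elimination (max-sum).

    factors: list of (scope_tuple, table) where table maps assignment
    tuples (ordered as in scope_tuple) to values; variables are binary.
    """
    factors = list(factors)
    for var in variables:
        touching = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        # Combine every factor mentioning `var`, then maximize `var` out.
        scope = tuple(sorted({v for s, _ in touching for v in s} - {var}))
        new_table = {}
        for assign in product([0, 1], repeat=len(scope)):
            ctx = dict(zip(scope, assign))
            new_table[assign] = max(
                sum(tbl[tuple((ctx | {var: val})[v] for v in s)]
                    for s, tbl in touching)
                for val in (0, 1)
            )
        factors = rest + [(scope, new_table)]
    # All variables eliminated: only empty-scope constants remain.
    return sum(tbl[()] for _, tbl in factors)

# Two basis functions over overlapping pairs of variables:
f1 = (("x1", "x2"), {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.5})
f2 = (("x2", "x3"), {(0, 0): 0.3, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0})
print(max_sum_eliminate([f1, f2], ["x1", "x3", "x2"]))  # 3.0, max over all 8 states
```

Each elimination step only ever builds a table over the neighbors of the eliminated variable, which is what keeps the LP constraint set polynomial when the basis functions have small scopes.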
We show that linear value-function approximation is equivalent to a form of linear model approximation. We then derive a relationship between the model-approximation error and the Bellman error, and show how this relationship can guide feature selection for model improvement and/or value-function improvement. We also show how these results give insight into the behavior of existing feature-selection algorithms.
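The value/model equivalence can be checked numerically on a tiny example. The sketch below (our own toy MDP and features, not the paper's code) computes the linear fixed-point (LSTD-style) value weights directly, and separately fits a linear model in feature space and solves that model exactly; the two weight vectors coincide:

```python
gamma = 0.9
P = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]]  # fixed-policy transitions
R = [0.0, 0.0, 1.0]
Phi = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]               # two features over three states

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve2(A, b):  # 2x2 linear solve via Cramer's rule
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

PhiT = transpose(Phi)
G = matmul(PhiT, Phi)                        # Gram matrix Phi^T Phi
PPhi = matmul(P, Phi)

# (1) Fit a linear model in feature space: P Phi ~ Phi F and R ~ Phi r,
# then solve that approximate model exactly: w = (I - gamma F)^{-1} r.
F = transpose([solve2(G, col) for col in transpose(matmul(PhiT, PPhi))])
r = solve2(G, [sum(PhiT[i][s] * R[s] for s in range(3)) for i in range(2)])
I_gF = [[(1.0 if i == j else 0.0) - gamma * F[i][j] for j in range(2)] for i in range(2)]
w_model = solve2(I_gF, r)

# (2) Linear fixed-point solution: Phi^T (Phi - gamma P Phi) w = Phi^T R.
A = matmul(PhiT, [[Phi[s][j] - gamma * PPhi[s][j] for j in range(2)] for s in range(3)])
b = [sum(PhiT[i][s] * R[s] for s in range(3)) for i in range(2)]
w_lstd = solve2(A, b)

print(w_model, w_lstd)  # the two weight vectors agree
```

Left-multiplying the model fixed point (I - gamma F)w = r by the Gram matrix recovers the linear fixed-point equation, which is why the two solutions match term for term.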
We analyze a simple, Bellman-error-based approach to generating basis functions for value-function approximation. We show that it generates orthogonal basis functions that provably tighten approximation error bounds. We also illustrate the use of this approach in the presence of noise on some sample problems.
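A minimal sketch of the Bellman-error-based generation scheme, on a 3-state chain of our own construction (not the paper's experiments): start from a constant basis, and at each step add the Bellman residual of the current least-squares fit as a new basis function. The max-norm Bellman error shrinks as residual features are added:

```python
gamma = 0.9
P = [[0.9, 0.1, 0.0],      # fixed-policy transition matrix
     [0.1, 0.8, 0.1],
     [0.0, 0.1, 0.9]]
R = [0.0, 0.0, 1.0]        # reward only in the rightmost state

def bellman_backup(v):
    return [R[s] + gamma * sum(P[s][t] * v[t] for t in range(3)) for s in range(3)]

def lstsq_fit(basis, target):
    """Least-squares projection of `target` onto span(basis) via normal equations."""
    k = len(basis)
    A = [[sum(basis[i][s] * basis[j][s] for s in range(3)) for j in range(k)]
         for i in range(k)]
    b = [sum(basis[i][s] * target[s] for s in range(3)) for i in range(k)]
    for i in range(k):                 # Gaussian elimination (fine for tiny, PD A)
        for j in range(i + 1, k):
            f = A[j][i] / A[i][i]
            A[j] = [A[j][c] - f * A[i][c] for c in range(k)]
            b[j] -= f * b[i]
    w = [0.0] * k
    for i in reversed(range(k)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, k))) / A[i][i]
    return [sum(w[i] * basis[i][s] for i in range(k)) for s in range(3)]

# True V^pi by iterating the contractive Bellman operator to convergence.
v_true = [0.0, 0.0, 0.0]
for _ in range(2000):
    v_true = bellman_backup(v_true)

basis = [[1.0, 1.0, 1.0]]              # start with the constant feature
errors = []
for _ in range(3):
    v_hat = lstsq_fit(basis, v_true)
    residual = [t - h for t, h in zip(bellman_backup(v_hat), v_hat)]
    errors.append(max(abs(x) for x in residual))
    basis.append(residual)             # the Bellman residual is the new basis function
print(errors)                          # monotonically shrinking Bellman error
```

Fitting against the exact V^pi is an idealization for illustration; in practice the fit would come from samples, which is where the noise analysis in the abstract becomes relevant.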
Abstract—Probabilistic approaches have proved very successful at addressing the basic problem of robot localization and mapping, and they have shown great promise on the combined problem of simultaneous localization and mapping (SLAM). One approach assumes relatively sparse, relatively unambiguous landmarks and builds a Kalman filter over landmark positions. Other approaches assume dense sensor data which individually are not very distinctive, such as those available from a laser range finder. In earlier work, we presented an algorithm called DP-SLAM, which provided a very accurate solution to the SLAM problem by efficiently maintaining a joint distribution over robot maps and poses. The approach assumed an extremely accurate laser range finder and a deterministic environment. It handles ambiguities as a natural part of the particle filtering process and obviates the explicit loop-closing phase needed for other approaches [6], [8]. Efficiently maintaining the joint distribution required some significant data structure engineering, because maps are large objects and a naive particle filter implementation would entail huge amounts of block memory copying as map hypotheses progress through the particle filter. By maintaining a tree representation of where the paths of different particles diverged, we exploited redundancies between the maps. In this work we demonstrate an improved map representation and laser penetration model, an improvement in the asymptotic efficiency of the algorithm, and empirical results of loop closing on a high-resolution map of a very challenging domain.
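The map-sharing idea can be sketched as a copy-on-write structure. The toy below (our own simplification, not the DP-SLAM implementation, which stores the ancestry tree inside the grid itself) has each particle record only the grid cells it changed plus a pointer to the particle it was resampled from; reads walk up the ancestry, so resampling never copies a whole map:

```python
class Particle:
    """A map hypothesis that shares unmodified cells with its ancestors."""

    def __init__(self, parent=None):
        self.parent = parent
        self.diff = {}             # (x, y) -> occupancy value changed by this particle

    def write(self, cell, value):
        self.diff[cell] = value    # copy-on-write: only the local change is stored

    def read(self, cell, default=0.5):
        node = self
        while node is not None:    # walk up the ancestry tree
            if cell in node.diff:
                return node.diff[cell]
            node = node.parent
        return default             # unobserved cell: unknown occupancy

root = Particle()
root.write((0, 0), 0.9)
child_a = Particle(parent=root)    # resampled copies share the ancestor's map
child_b = Particle(parent=root)
child_a.write((0, 1), 0.8)         # visible to child_a only

print(child_a.read((0, 0)))        # 0.9, inherited from the ancestor
print(child_b.read((0, 1)))        # 0.5, child_a's update is not shared
```

The memory cost per resampling step is proportional to the cells each particle actually observes, not to the size of the map.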
Machine learning methods are often applied to the problem of learning a map from a robot's sensor data, but they are rarely applied to the problem of learning a robot's motion model. The motion model, which can be influenced by robot idiosyncrasies and terrain properties, is a crucial aspect of current algorithms for Simultaneous Localization and Mapping (SLAM). In this paper we concentrate on generating the correct motion model for a robot by applying EM methods in conjunction with a current SLAM algorithm. In contrast to previous calibration approaches, we not only estimate the mean of the motion, but also the interdependencies between motion terms, and the variances in these terms. This can be used to provide a more focused proposal distribution to a particle filter used in a SLAM algorithm, which can reduce the resources needed for localization while decreasing the chance of losing track of the robot's position. We validate this approach by recovering a good motion model despite initialization with a poor one. Further experiments validate the generality of the learned model in similar circumstances.
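The kind of motion model the abstract describes can be sketched with a toy calibration (our own synthetic data and estimator, not the paper's EM procedure): given commanded motions and the motions a SLAM algorithm actually estimated, recover not just the mean error but the full error covariance, including the off-diagonal term coupling translation and rotation:

```python
import random

random.seed(0)
n = 500
commands = [(1.0, 0.1)] * n                     # (translation, rotation) per step
observed = []
for d, th in commands:
    e_d = random.gauss(0.05, 0.02)              # translation bias + noise
    e_th = 0.5 * e_d + random.gauss(0.0, 0.01)  # rotation error correlated with it
    observed.append((d + e_d, th + e_th))

err_d = [o[0] - c[0] for o, c in zip(observed, commands)]
err_th = [o[1] - c[1] for o, c in zip(observed, commands)]
mu_d, mu_th = sum(err_d) / n, sum(err_th) / n

def cov(xs, ys, mx, my):
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

var_d = cov(err_d, err_d, mu_d, mu_d)
var_th = cov(err_th, err_th, mu_th, mu_th)
cov_dth = cov(err_d, err_th, mu_d, mu_th)       # the interdependency term
print((mu_d, mu_th), var_d, var_th, cov_dth)
```

Recovering the off-diagonal term is what lets a particle filter draw translation and rotation noise jointly rather than independently, giving the more focused proposal distribution the abstract refers to.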