This paper addresses the problem of planning under uncertainty in large Markov Decision Processes (MDPs). Factored MDPs represent a complex state space using state variables and the transition model using a dynamic Bayesian network. This representation often allows an exponential reduction in the representation size of structured MDPs, but the complexity of exact solution algorithms for such MDPs can grow exponentially in the representation size. In this paper, we present two approximate solution algorithms that exploit structure in factored MDPs. Both use an approximate value function represented as a linear combination of basis functions, where each basis function involves only a small subset of the domain variables. A key contribution of this paper is that it shows how the basic operations of both algorithms can be performed efficiently in closed form, by exploiting both additive and context-specific structure in a factored MDP. A central element of our algorithms is a novel linear program decomposition technique, analogous to variable elimination in Bayesian networks, which reduces an exponentially large LP to a provably equivalent, polynomial-sized one. One algorithm uses approximate linear programming, and the second uses approximate dynamic programming. Our dynamic programming algorithm is novel in that it uses an approximation based on max-norm, a technique that more directly minimizes the terms that appear in error bounds for approximate MDP algorithms. We provide experimental results on problems with over 10^40 states, demonstrating the scalability of our approach, and compare our algorithm to an existing state-of-the-art approach, showing, on some problems, exponential gains in computation time.
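The variable-elimination idea behind the LP decomposition can be illustrated on a toy max-sum problem. The sketch below (all factor names and numbers are our own illustrative choices, not the paper's) maximizes a sum of basis functions, each over a small subset of binary state variables, by eliminating one variable at a time instead of enumerating all 2^n states:

```python
from itertools import product

def max_sum_eliminate(factors, variables):
    """Maximize a sum of small factors by variable elimination (max-sum).

    factors: list of (scope_tuple, table) where table maps assignment
    tuples (ordered as in scope_tuple) to values; variables are binary.
    """
    factors = list(factors)
    for var in variables:
        touching = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        # Combine every factor mentioning `var`, then maximize `var` out.
        scope = tuple(sorted({v for s, _ in touching for v in s} - {var}))
        new_table = {}
        for assign in product([0, 1], repeat=len(scope)):
            ctx = dict(zip(scope, assign))
            new_table[assign] = max(
                sum(tbl[tuple((ctx | {var: val})[v] for v in s)]
                    for s, tbl in touching)
                for val in (0, 1)
            )
        factors = rest + [(scope, new_table)]
    # All variables eliminated: only empty-scope constants remain.
    return sum(tbl[()] for _, tbl in factors)

# Two basis functions over overlapping pairs of variables:
f1 = (("x1", "x2"), {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.5})
f2 = (("x2", "x3"), {(0, 0): 0.3, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0})
print(max_sum_eliminate([f1, f2], ["x1", "x3", "x2"]))  # 3.0, max over all 8 states
```

Each elimination step only ever builds a table over the neighbors of the eliminated variable, which is what keeps the LP constraint set polynomial when the basis functions have small scopes.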
We show that linear value-function approximation is equivalent to a form of linear model approximation. We then derive a relationship between the model-approximation error and the Bellman error, and show how this relationship can guide feature selection for model improvement and/or value-function improvement. We also show how these results give insight into the behavior of existing feature-selection algorithms.
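The value/model equivalence can be checked numerically on a tiny example. The sketch below (our own toy MDP and features, not the paper's code) computes the linear fixed-point (LSTD-style) value weights directly, and separately fits a linear model in feature space and solves that model exactly; the two weight vectors coincide:

```python
gamma = 0.9
P = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]]  # fixed-policy transitions
R = [0.0, 0.0, 1.0]
Phi = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]               # two features over three states

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve2(A, b):  # 2x2 linear solve via Cramer's rule
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

PhiT = transpose(Phi)
G = matmul(PhiT, Phi)                        # Gram matrix Phi^T Phi
PPhi = matmul(P, Phi)

# (1) Fit a linear model in feature space: P Phi ~ Phi F and R ~ Phi r,
# then solve that approximate model exactly: w = (I - gamma F)^{-1} r.
F = transpose([solve2(G, col) for col in transpose(matmul(PhiT, PPhi))])
r = solve2(G, [sum(PhiT[i][s] * R[s] for s in range(3)) for i in range(2)])
I_gF = [[(1.0 if i == j else 0.0) - gamma * F[i][j] for j in range(2)] for i in range(2)]
w_model = solve2(I_gF, r)

# (2) Linear fixed-point solution: Phi^T (Phi - gamma P Phi) w = Phi^T R.
A = matmul(PhiT, [[Phi[s][j] - gamma * PPhi[s][j] for j in range(2)] for s in range(3)])
b = [sum(PhiT[i][s] * R[s] for s in range(3)) for i in range(2)]
w_lstd = solve2(A, b)

print(w_model, w_lstd)  # the two weight vectors agree
```

Left-multiplying the model fixed point (I - gamma F)w = r by the Gram matrix recovers the linear fixed-point equation, which is why the two solutions match term for term.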
We analyze a simple, Bellman-error-based approach to generating basis functions for value-function approximation. We show that it generates orthogonal basis functions that provably tighten approximation error bounds. We also illustrate the use of this approach in the presence of noise on some sample problems.
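A minimal sketch of the Bellman-error-based generation scheme, on a 3-state chain of our own construction (not the paper's experiments): start from a constant basis, and at each step add the Bellman residual of the current least-squares fit as a new basis function. The max-norm Bellman error shrinks as residual features are added:

```python
gamma = 0.9
P = [[0.9, 0.1, 0.0],      # fixed-policy transition matrix
     [0.1, 0.8, 0.1],
     [0.0, 0.1, 0.9]]
R = [0.0, 0.0, 1.0]        # reward only in the rightmost state

def bellman_backup(v):
    return [R[s] + gamma * sum(P[s][t] * v[t] for t in range(3)) for s in range(3)]

def lstsq_fit(basis, target):
    """Least-squares projection of `target` onto span(basis) via normal equations."""
    k = len(basis)
    A = [[sum(basis[i][s] * basis[j][s] for s in range(3)) for j in range(k)]
         for i in range(k)]
    b = [sum(basis[i][s] * target[s] for s in range(3)) for i in range(k)]
    for i in range(k):                 # Gaussian elimination (fine for tiny, PD A)
        for j in range(i + 1, k):
            f = A[j][i] / A[i][i]
            A[j] = [A[j][c] - f * A[i][c] for c in range(k)]
            b[j] -= f * b[i]
    w = [0.0] * k
    for i in reversed(range(k)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, k))) / A[i][i]
    return [sum(w[i] * basis[i][s] for i in range(k)) for s in range(3)]

# True V^pi by iterating the contractive Bellman operator to convergence.
v_true = [0.0, 0.0, 0.0]
for _ in range(2000):
    v_true = bellman_backup(v_true)

basis = [[1.0, 1.0, 1.0]]              # start with the constant feature
errors = []
for _ in range(3):
    v_hat = lstsq_fit(basis, v_true)
    residual = [t - h for t, h in zip(bellman_backup(v_hat), v_hat)]
    errors.append(max(abs(x) for x in residual))
    basis.append(residual)             # the Bellman residual is the new basis function
print(errors)                          # monotonically shrinking Bellman error
```

Fitting against the exact V^pi is an idealization for illustration; in practice the fit would come from samples, which is where the noise analysis in the abstract becomes relevant.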
Abstract—Probabilistic approaches have proved very successful at addressing the basic problem of robot localization and mapping, and they have shown great promise on the combined problem of simultaneous localization and mapping (SLAM). One approach assumes relatively sparse, relatively unambiguous landmarks and builds a Kalman filter over landmark positions. Other approaches assume dense sensor data which individually are not very distinctive, such as those available from a laser range finder. In earlier work, we presented an algorithm called DP-SLAM, which provided a very accurate solution to the SLAM problem by efficiently maintaining a joint distribution over robot maps and poses. The approach assumed an extremely accurate laser range finder and a deterministic environment. It handles ambiguities as a natural part of the particle filtering process and obviates the explicit loop-closing phase needed for other approaches [6], [8]. Efficiently maintaining the joint distribution required some significant data structure engineering, because maps are large objects and a naive particle filter implementation would entail huge amounts of block memory copying as map hypotheses progress through the particle filter. By maintaining a tree representation of where the paths of different particles diverged, we exploited redundancies between the maps. In this work we demonstrate an improved map representation and laser penetration model, an improvement in the asymptotic efficiency of the algorithm, and empirical results of loop closing on a high-resolution map of a very challenging domain.
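The map-sharing idea can be sketched as a copy-on-write structure. The toy below (our own simplification, not the DP-SLAM implementation, which stores the ancestry tree inside the grid itself) has each particle record only the grid cells it changed plus a pointer to the particle it was resampled from; reads walk up the ancestry, so resampling never copies a whole map:

```python
class Particle:
    """A map hypothesis that shares unmodified cells with its ancestors."""

    def __init__(self, parent=None):
        self.parent = parent
        self.diff = {}             # (x, y) -> occupancy value changed by this particle

    def write(self, cell, value):
        self.diff[cell] = value    # copy-on-write: only the local change is stored

    def read(self, cell, default=0.5):
        node = self
        while node is not None:    # walk up the ancestry tree
            if cell in node.diff:
                return node.diff[cell]
            node = node.parent
        return default             # unobserved cell: unknown occupancy

root = Particle()
root.write((0, 0), 0.9)
child_a = Particle(parent=root)    # resampled copies share the ancestor's map
child_b = Particle(parent=root)
child_a.write((0, 1), 0.8)         # visible to child_a only

print(child_a.read((0, 0)))        # 0.9, inherited from the ancestor
print(child_b.read((0, 1)))        # 0.5, child_a's update is not shared
```

The memory cost per resampling step is proportional to the cells each particle actually observes, not to the size of the map.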
Machine learning methods are often applied to the problem of learning a map from a robot's sensor data, but they are rarely applied to the problem of learning a robot's motion model. The motion model, which can be influenced by robot idiosyncrasies and terrain properties, is a crucial aspect of current algorithms for Simultaneous Localization and Mapping (SLAM). In this paper we concentrate on generating the correct motion model for a robot by applying EM methods in conjunction with a current SLAM algorithm. In contrast to previous calibration approaches, we not only estimate the mean of the motion, but also the interdependencies between motion terms, and the variances in these terms. This can be used to provide a more focused proposal distribution to a particle filter used in a SLAM algorithm, which can reduce the resources needed for localization while decreasing the chance of losing track of the robot's position. We validate this approach by recovering a good motion model despite initialization with a poor one. Further experiments validate the generality of the learned model in similar circumstances.
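The kind of motion model the abstract describes can be sketched with a toy calibration (our own synthetic data and estimator, not the paper's EM procedure): given commanded motions and the motions a SLAM algorithm actually estimated, recover not just the mean error but the full error covariance, including the off-diagonal term coupling translation and rotation:

```python
import random

random.seed(0)
n = 500
commands = [(1.0, 0.1)] * n                     # (translation, rotation) per step
observed = []
for d, th in commands:
    e_d = random.gauss(0.05, 0.02)              # translation bias + noise
    e_th = 0.5 * e_d + random.gauss(0.0, 0.01)  # rotation error correlated with it
    observed.append((d + e_d, th + e_th))

err_d = [o[0] - c[0] for o, c in zip(observed, commands)]
err_th = [o[1] - c[1] for o, c in zip(observed, commands)]
mu_d, mu_th = sum(err_d) / n, sum(err_th) / n

def cov(xs, ys, mx, my):
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

var_d = cov(err_d, err_d, mu_d, mu_d)
var_th = cov(err_th, err_th, mu_th, mu_th)
cov_dth = cov(err_d, err_th, mu_d, mu_th)       # the interdependency term
print((mu_d, mu_th), var_d, var_th, cov_dth)
```

Recovering the off-diagonal term is what lets a particle filter draw translation and rotation noise jointly rather than independently, giving the more focused proposal distribution the abstract refers to.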