2023
DOI: 10.1109/tro.2022.3200138
Partially Observable Markov Decision Processes in Robotics: A Survey

Abstract: Noisy sensing, imperfect control, and environment changes are defining characteristics of many real-world robot tasks. The partially observable Markov decision process (POMDP) provides a principled mathematical framework for modeling and solving robot decision and control tasks under uncertainty. Over the last decade, it has seen many successful applications, spanning localization and navigation, search and tracking, autonomous driving, multi-robot systems, manipulation, and human-robot interaction. This surve…

Cited by 60 publications (18 citation statements)
References 157 publications
“…Another strategy that can be used to reduce the per-update communications overhead of a distributed network is by using a Partially Observable Markov Decision Process (POMDP) [102]. Instead of requiring the agents to fully observe the environment in each time step, action selection for each agent is based on a probability distribution given by the model instead of directly observing the underlying state.…”
Section: B. Research Opportunities
confidence: 99%
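The belief-based action selection quoted above can be sketched with a discrete Bayes filter: the agent never reads the hidden state directly; it maintains a probability distribution over states, updates it from noisy observations, and chooses actions from that belief. A minimal sketch with a hypothetical two-state model (all transition and observation probabilities are illustrative assumptions, not taken from the cited work):

```python
import numpy as np

# Hypothetical two-state POMDP model (illustrative numbers only).
# T[s, s'] = P(s' | s) under the chosen action; Z[s', o] = P(o | s').
T = np.array([[0.9, 0.1],
              [0.1, 0.9]])
Z = np.array([[0.85, 0.15],
              [0.15, 0.85]])

def belief_update(belief, obs):
    """Discrete Bayes filter: predict through T, then weight by P(obs | s')."""
    predicted = T.T @ belief          # prediction step
    updated = Z[:, obs] * predicted   # correction step
    return updated / updated.sum()    # renormalize to a distribution

def select_action(belief, threshold=0.9):
    """Act on the belief, not the state: commit only when confident enough."""
    if belief.max() > threshold:
        return int(np.argmax(belief))  # exploit the most likely state
    return -1                          # -1 = keep gathering information

b = np.array([0.5, 0.5])               # start maximally uncertain
for _ in range(2):
    b = belief_update(b, obs=0)        # two consistent observations of state 0
```

After two consistent observations the belief concentrates on state 0 (`b[0] ≈ 0.95` here) and `select_action` commits, whereas a flat belief keeps it in an information-gathering mode — the mechanism the quoted passage uses to avoid full state observation at every step.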
“…5.2) Sequential decision-making models are often embedded in larger systems. Often, MDPs are part of a larger system, such as a robot [4,40,136,214], and it may be challenging to write a reward function that represents its high-level task, which may be a mixture of several objectives. Thus, we often evaluate these decisionmaking models using a task-based metric [4,45,134,200,209].…”
Section: System Evaluation and Measurement
confidence: 99%
“…Localization uncertainty can arise from perceptual degradation (Ebadi et al, 2020), noisy actuation (Thrun 2002), and inaccurate modeling (Roy et al, 1999). Decision-making or planning under uncertainty (LaValle 2006; Bry and Roy 2011; Preston et al, 2022) provides an elegant framework to formulate these problems using partially observable Markov decision processes (POMDPs) (Kaelbling et al, 1998; Cai et al, 2021; Lauri et al, 2022). A principled approach to address these problems is to plan in the belief space (Kaelbling and Lozano-Pérez 2013; Nishimura and Schwager 2021).…”
Section: Related Work
confidence: 99%
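One inexpensive way to approximate the belief-space planning mentioned above is the classical QMDP heuristic: solve the underlying fully observable MDP by value iteration, then score each action by its expected Q-value under the current belief. A minimal sketch on a hypothetical three-state chain (the model, rewards, and discount are invented for illustration):

```python
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.95

# Deterministic chain: action 0 moves left, action 1 moves right.
next_state = np.array([[max(s - 1, 0) for s in range(n_states)],
                       [min(s + 1, n_states - 1) for s in range(n_states)]])
reward = np.array([0.0, 0.0, 1.0])  # reward for occupying state 2

# Value iteration on the fully observable MDP.
V = np.zeros(n_states)
for _ in range(500):
    Q = reward[None, :] + gamma * V[next_state]  # Q[a, s]
    V = Q.max(axis=0)

def qmdp_action(belief):
    """QMDP: pick the action with the highest expected Q-value under the belief."""
    return int(np.argmax(Q @ belief))
```

With the belief mass on the left end of the chain (e.g. `[0.7, 0.3, 0.0]`), QMDP moves right toward the reward. The known limitation is that QMDP assumes full observability after one step, so it never takes actions purely to reduce uncertainty — which is why the belief-space planners cited above go beyond it.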