Shortest path problems on stochastic graphs: a neuro dynamic programming approach

Baglietto, Marco; Battistelli, Giorgio; Vitali, F.; Zoppoli, R.

doi:10.1109/cdc.2003.1272268

Cited by 7 publications

(12 citation statements)

References 8 publications

“…One potential direction for future research would be to identify bounds tighter than those discussed in this work, which would potentially result in more aggressive node pruning and consequently reduce execution time. One other exciting direction for future research would be to use CAO* in conjunction with approximation schemes for CTP (Baglietto et al 2003, de Farias and Roy 2003, Chang and Marcus 2003, Kearns and Singh 2002). CAO* can also be converted into a heuristic method by employing stronger, yet suboptimal pruning techniques.…”

Section: Discussion (mentioning)

confidence: 99%

“…The goal here is to find a policy that decides what and where to disambiguate en route so as to minimize the expected length of the traversal. Several heuristics and approximation algorithms have been introduced for CTP in the literature (Baglietto et al 2003, Xu et al 2009, Eyerich et al 2009) and optimal algorithms for certain special cases of CTP have been proposed (Ferguson et al 2004, Nikolova and Karger 2008, Bnaya et al 2011).…”

Section: Introduction (mentioning)

confidence: 99%

An AO* Based Exact Algorithm for the Canadian Traveler Problem

Aksakallı

Sahin

Arı

2016

INFORMS Journal on Computing

The Canadian traveler problem (CTP) is a simple, yet challenging, stochastic optimization problem wherein an agent is given a graph where some edges are blocked with certain probabilities and the status of these edges can be disambiguated dynamically upon reaching an incident vertex. The goal is to devise a traversal policy that results in the shortest expected walk length between a given starting vertex and a termination vertex. CTP has been shown to be intractable in many broad settings. In this paper, we introduce an optimal algorithm for the problem based on a Markov decision process formulation, which is a new improvement on AO* search that takes advantage of the special problem structure in CTP. We call our algorithm CAO*, which stands for AO* with caching. CAO* uses a caching mechanism to avoid re-expansion of previously visited states and makes use of admissible upper bounds at a node level for dynamic state-space pruning. CAO* is not polynomial time, but it can dramatically shorten the execution time needed to find an exact solution for moderately sized instances. We present computational experiments on a realistic variant of the problem involving an actual maritime minefield data set.
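To make the CTP objective in the abstract above concrete, here is a minimal sketch that computes the optimal expected walk length on a tiny instance by memoized exhaustive search over (vertex, edge-knowledge) states. It is not CAO*: there is no AO* search and no bound-based pruning, only caching of repeated states. The function name `ctp_expected_cost`, the edge format `(u, v, length, block_prob)`, and the simplification that the agent replans only at the target or at a vertex with unobserved incident edges are all assumptions made for illustration.

```python
import heapq
from functools import lru_cache
from itertools import product


def ctp_expected_cost(edges, source, target):
    """Optimal expected walk length on a tiny CTP instance (exponential time).

    edges: iterable of undirected (u, v, length, block_prob) tuples.
    Edge status values: True = open, False = blocked, None = not yet observed.
    """
    edges = tuple(edges)

    def unknown_at(vertex, status):
        # Indices of unobserved edges incident to `vertex`.
        return [i for i, (u, v, _, _) in enumerate(edges)
                if status[i] is None and vertex in (u, v)]

    @lru_cache(maxsize=None)
    def arrive(vertex, status):
        # Chance node: arriving at a vertex reveals its incident edges.
        idx = unknown_at(vertex, status)
        if not idx:
            return decide(vertex, status)
        total = 0.0
        for outcome in product((True, False), repeat=len(idx)):
            prob, new_status = 1.0, list(status)
            for i, is_open in zip(idx, outcome):
                p_block = edges[i][3]
                prob *= (1.0 - p_block) if is_open else p_block
                new_status[i] = is_open
            total += prob * decide(vertex, tuple(new_status))
        return total

    @lru_cache(maxsize=None)
    def decide(vertex, status):
        # Decision node: walk over known-open edges (Dijkstra) to the target
        # or to any vertex that still has unobserved incident edges.
        if vertex == target:
            return 0.0
        best = float("inf")
        dist, heap = {vertex: 0.0}, [(0.0, vertex)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue
            if u == target:
                best = min(best, d)
                continue
            if u != vertex and unknown_at(u, status):
                best = min(best, d + arrive(u, status))
                continue
            for i, (a, b, length, _) in enumerate(edges):
                if status[i] is True and u in (a, b):
                    w = b if u == a else a
                    if d + length < dist.get(w, float("inf")):
                        dist[w] = d + length
                        heapq.heappush(heap, (d + length, w))
        return best

    # Edges with probability 0 or 1 are known from the start.
    initial = tuple(True if p == 0 else False if p == 1 else None
                    for (_, _, _, p) in edges)
    return arrive(source, initial)


# Hypothetical instance: a safe long edge s-t, or a detour via a risky edge a-t.
example = [("s", "t", 10, 0.0), ("s", "a", 1, 0.0), ("a", "t", 1, 0.5)]
print(ctp_expected_cost(example, "s", "t"))  # 7.0
```

On this instance the direct route costs 10, while walking to `a` first costs 1 plus an expected 0.5 · 1 + 0.5 · 11 = 6, so the optimal policy has expected length 7. The number of knowledge states grows exponentially with the number of stochastic edges, which is the blow-up that caching and pruning in an exact algorithm are meant to tame.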

“…Then the agent continues to t through A_5 or counterclockwise around A_5, according as A_5 is traversable or not. If A_4 was not traversable then the agent was to continue to t counterclockwise around A_4 and A_5. Under this policy, the agent's s, t traversal is an s, t-curve-valued random variable which would be γ_1, γ_2, γ_3, γ_4 [18 + 2.2].…”

Section: The Disambiguation Problem (mentioning)

confidence: 99%

The reset disambiguation policy for navigating stochastic obstacle fields

Aksakallı

Fishkind

Priebe

et al. 2011

Naval Research Logistics

Abstract: The problem we consider is a stochastic shortest path problem in the presence of a dynamic learning capability. Specifically, a spatial arrangement of possible obstacles needs to be traversed as swiftly as possible, and the status of the obstacles may be disambiguated (at a cost) en route. No efficiently computable optimal policy is known, and many similar problems have been proven intractable. In this article, we adapt a policy which is optimal for a related problem and prove that this policy is indeed also optimal for a restricted class of instances of our problem. Otherwise, this policy is generally suboptimal but, nonetheless, it is both effective and efficiently computable. Examples/simulations are provided in a mine countermeasures application. Of central use is the Tangent Arc Graph, a polynomially sized topological superimposition of exponentially many visibility graphs. © 2011 Wiley Periodicals, Inc. Naval Research Logistics 58: 389-399, 2011

Keywords: mine countermeasures; probabilistic path planning; random disambiguation path; tangent arc graph; Markov decision process

Correspondence to: C.E. Priebe (cep@jhu.edu)

THE DISAMBIGUATION PROBLEM

A disambiguation problem instance is a tuple (s, t, A, ρ, c), where s and t are points in R^2, A is a finite set of open discs in R^2, ρ is a function A → (0, 1], and c is a function A → R_{≥0}. An agent wants to traverse from s to t through R^2, along a continuous curve which is as short as possible in the sense of arclength. However, the discs of A are potential obstacles; for each A ∈ A, the probability that A is an obstacle is ρ(A), independently from the other discs in A. If ρ(A) < 1 then we say A is ambiguous and if ρ(A) = 1 then A is definitely an obstacle.

The traversing agent cannot enter discs which are obstacles or ambiguous but, if and when the agent is located at the boundary ∂A for any A ∈ A, the agent has the option to disambiguate the disc A at a cost c(A) added to the traversal arclength, and the agent will learn whether or not A is actually an obstacle. The status of a disc will never change; if A is revealed to be an obstacle then the traversing agent may never enter A, and if A is not an obstacle then A may be entered anytime thereafter. The central issue is how to direct the agent's traversal to optimally utilize this disambiguation capability; that is, to find a policy for the agent which minimizes the expected length of the agent's s, t traversal.

An example of a disambiguation problem instance is shown in Fig. 1; suppose the values of ρ(A_i), for i = 1, 2, 3, 4, 5 are 0.6, 0.4, 0.9, 0.8, 0.7, and suppose c(A_i) = 1.1 for all i. One particular traversal policy is illustrated in Fig. 1; from s the agent proceeds to the red bullet labeled 1, at which point A_1 is disambiguated. If A_1 is traversable then the agent is to continue till the red bullet labeled 2, at which point A_2 is disambiguated. Then the agent is to proceed to t through A_2 or clockwise around A_2, according as A_2 is traversable or not. If A_1 was not traversa...
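As a small worked illustration of how the quantities above combine (a sketch only, not taken from the article), consider a policy that performs a single disambiguation of one disc A. The symbols d, ℓ_clear, and ℓ_obst are assumptions introduced here: d is the arclength from s to the boundary point where A is disambiguated, and ℓ_clear and ℓ_obst are the remaining arclengths to t when A turns out to be traversable or an obstacle, respectively.

```latex
\mathbb{E}[\text{length}]
  = d + c(A) + \bigl(1 - \rho(A)\bigr)\,\ell_{\text{clear}} + \rho(A)\,\ell_{\text{obst}}
% With the example values \rho(A_1) = 0.6 and c(A_1) = 1.1:
%   \mathbb{E}[\text{length}] = d + 1.1 + 0.4\,\ell_{\text{clear}} + 0.6\,\ell_{\text{obst}}
```

Policies with several disambiguations nest this expression: each remaining-length term is itself an expectation over the outcome of the next disambiguation.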

“…Heuristics are suggested for CTP and SOSP in [2], [4], [5], and [12], but they would not be applicable to the problem we address here in this manuscript without initially approximating and recasting our continuous setting to the setting of a finite graph, in which case the resolution of the discretization drives up the number of vertices and edges in the approximating graph. By contrast, the algorithm we propose here is polynomial-time solely in the number of detections |X|.…”

Section: Overview (mentioning)

confidence: 99%

“…Next, in Section 4, we address the issue of how, in general, to select an optimal or near-optimal value for α. Poisson(50), and the true and false marks are Beta(6, 2) and Beta(2, 6). We adopted the starting point s = (−11, 110), destination point t = (66, 110), disc radius r = 10, disambiguation cost c = 1, and the number of available disambiguations K = 4.…”

Section: Mine Countermeasures Example (mentioning)

confidence: 99%

Disambiguation Protocols Based on Risk Simulation

Fishkind

Priebe

Giles

et al. 2007

IEEE Trans. Syst., Man, Cybern. A

Suppose there is a need to swiftly navigate through a spatial arrangement of possibly forbidden regions, each region marked with the probability that it is indeed forbidden. In close proximity to any of these regions, you have the dynamic capability of disambiguating the region and learning for certain whether or not the region is forbidden; only in the latter case may you proceed through that region. The central issue is how to most effectively exploit this disambiguation capability to minimize the expected length of the traversal.

Regions are never entered while they are possibly forbidden and, thus, no risk is ever actually incurred. Nonetheless, for the sole purpose of deciding where to disambiguate, it may be advantageous to simulate risk, temporarily pretending that possibly forbidden regions are riskily traversable, and each potential traversal is weighted with its level of undesirability, which is a function of its traversal length and traversal risk. Introduced in this paper is the simulated risk disambiguation protocol, which has you follow along a shortest traversal (in this undesirability sense) until an ambiguous region is about to be entered; at that location, a disambiguation is performed on this ambiguous region. (The process is then repeated from the current location, until the destination is reached.)

We introduce the tangent arc graph as a means of simplifying the implementation of simulated risk disambiguation protocols, and we show how to efficiently implement the simulated risk disambiguation protocols which are based on linear undesirability functions. The effectiveness of these disambiguation protocols is illustrated with examples, including an example involving mine countermeasures path planning.
