An analysis of optimistic, best-first search for minimax sequential decision making

Buşoniu, Lucian; Munos, Rémi; Páll, Előd

doi:10.1109/adprl.2014.7010615

Cited by 8 publications

(20 citation statements)

References 19 publications


“…We extend the analysis of OMS in [5] to OMSδ. The first part of our analysis establishes basic properties of the minimax algorithm that still hold under the additional dwell-time constraints.…”

Section: Discussion (mentioning)

confidence: 99%

“…The second part gives our main novel results: a complexity measure of the problem and a corresponding convergence rate of OMSδ. Due to space limits we skip all proofs except that of the main result, but where applicable we point out relations to [5].…”

Section: Discussion (mentioning)

confidence: 99%

“…At each iteration, to choose the next leaf to expand, OMSδ starts from the root and constructs a path by recursively selecting an optimistic child for the agent at the current node, in the same way as OMS in [5]: a child with the largest upper bound at max nodes, and one with the smallest lower bound at min nodes. The main difference between OMSδ and OMS is in the expansion of this leaf, which in OMSδ is constrained to only create the children that obey the dwell time conditions, as explained above.…”

Section: Algorithm (mentioning)

confidence: 99%
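The leaf-selection rule quoted above (largest upper bound at max nodes, smallest lower bound at min nodes) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the `Node` class, its field names, and `select_optimistic_leaf` are hypothetical, and the bounds are assumed to have been computed already.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node; field names are illustrative assumptions."""
    is_max: bool            # True if the max agent acts at this node
    lower: float            # lower bound on the minimax value below this node
    upper: float            # upper bound on the minimax value below this node
    children: List["Node"] = field(default_factory=list)


def select_optimistic_leaf(root: Node) -> List[Node]:
    """Walk from the root to a leaf, choosing an optimistic child for
    whichever agent acts at the current node."""
    path = [root]
    node = root
    while node.children:
        if node.is_max:
            # The max agent is optimistic about the largest upper bound.
            node = max(node.children, key=lambda c: c.upper)
        else:
            # The min agent is optimistic about the smallest lower bound.
            node = min(node.children, key=lambda c: c.lower)
        path.append(node)
    return path
```

The last node of the returned path is the leaf that would be expanded next; in the dwell-time variant described in the quote, only the expansion step differs.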

“…The modes can have arbitrary nonlinear dynamics, while the rewards must be bounded. This is a minimax problem, which we solve by extending the approach from [5], called optimistic minimax search (OMS). OMS explores a tree representation of the possible sequences of max and min agent actions (here, mode switches); it is a variant of B* [17] and related to other classical minimax search methods [10], [18], [12].…”

Section: Introduction (mentioning)

confidence: 99%

“…In the context of artificial intelligence and optimistic planning [16], [11], [9], [22], [15], the closest algorithm is again OMS [5], compared to which the main novelty here is the convergence analysis under dwell-time constraints, leading to a new complexity measure. The planning method for max-only switched systems from our work [3] leads to a similar complexity measure and reduction compared to the no-dwell-time case, but there the analysis is much easier due to the lack of the min agent.…”

Section: Introduction (mentioning)

confidence: 99%


Near-optimal control of nonlinear switched systems with non-cooperative switching rules

Rejeb, Buşoniu, Morărescu, et al. 2017

2017 American Control Conference (ACC)

Self-citation


This paper presents a predictive, planning algorithm for nonlinear switched systems where there are two switching signals, one controlled and the other uncontrolled, both subject to constraints on the dwell time after a switch. The algorithm solves a minimax problem where the controlled signal is chosen to optimize a discounted sum of rewards, while taking into account the worst possible uncontrolled switches. It is an extension of a classical minimax search method, so we call it optimistic minimax search with dwell time constraints, OMSδ. For any combination of dwell times, OMSδ returns a sequence of switches that is provably near-optimal, and can be applied in receding horizon for closed loop control. For the case when the two dwell times are the same, we provide a convergence rate to the minimax optimum as a function of the computation invested, modulated by a measure of problem complexity. We show how the framework can be used to model switched systems with time delays on the control channel, and provide an illustrative simulation for such a system with nonlinear modes.
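One way to read the dwell-time constraint on node expansion is sketched below. The function name `allowed_modes`, its parameters, and the counting convention (a switch is permitted once the current mode has been held for at least `dwell_time` steps) are assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence


def allowed_modes(current_mode: int,
                  steps_since_switch: int,
                  dwell_time: int,
                  modes: Sequence[int]) -> List[int]:
    """Return the modes an agent may select at the next step.

    Assumed convention: the agent must keep its current mode until it has
    been held for at least `dwell_time` steps; afterwards any mode is allowed.
    """
    if steps_since_switch < dwell_time:
        # Still inside the dwell period: only the current mode is admissible.
        return [current_mode]
    return list(modes)
```

Under this reading, an expansion step would create one child per pair of admissible max and min modes, so the tree branches less while either agent is inside its dwell period.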


Planning for optimal control and performance certification in nonlinear systems with controlled or uncontrolled switches

et al. 2017


We consider three problems for discrete-time switched systems with autonomous, general nonlinear modes. The first is optimal control of the switching rule so as to optimize the infinite-horizon discounted cost. The second and third problems occur when the switching rule is uncontrolled, and we seek either the worst-case cost when the rule is unknown, or respectively the expected cost when the rule is stochastic. We use optimistic planning (OP) algorithms that can solve general optimal control with discrete inputs such as switches. We extend the analysis of OP to provide certification (upper and lower) bounds on the optimal, worst-case, or expected costs, as well as to design switching sequences that achieve these bounds in the deterministic case. In this case, since a minimum dwell time between switching instants must often be ensured, we introduce a new OP variant to handle this constraint, and analyze its convergence rate. We provide consistency and closed-loop performance guarantees for the sequences designed, and illustrate that the approach works well in simulations.
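The certification bounds mentioned in this abstract can be illustrated with a construction that is common in optimistic planning. The sketch assumes rewards normalized to [0, 1] and a discount factor `gamma` in (0, 1); the abstract itself speaks of costs and does not give the exact form, so the function `return_bounds` is a hypothetical example rather than the paper's definition.

```python
from typing import Sequence, Tuple


def return_bounds(rewards: Sequence[float], gamma: float) -> Tuple[float, float]:
    """Bounds on the infinite-horizon discounted return of any sequence
    that starts with the given finite reward prefix.

    Assumes every reward lies in [0, 1] and 0 < gamma < 1.
    """
    depth = len(rewards)
    # Lower bound: all unseen future rewards are 0.
    lower = sum(gamma ** k * r for k, r in enumerate(rewards))
    # Upper bound: all unseen future rewards are 1, a geometric tail.
    upper = lower + gamma ** depth / (1.0 - gamma)
    return lower, upper
```

The gap between the two bounds is `gamma ** depth / (1 - gamma)`, which shrinks geometrically as a sequence is explored deeper.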


Optimistic minimax search for noncooperative switched control with or without dwell time

et al. 2020


We consider adversarial problems in which two agents control two switching signals, the first agent aiming to maximize a discounted sum of rewards, and the second aiming to minimize it. Both signals may be subject to constraints on the dwell time after a switch. We search the tree of possible mode sequences with an algorithm called optimistic minimax search with dwell time (OMSd), showing that it obtains a solution close to the minimax-optimal one, and we characterize the rate at which the suboptimality goes to zero. The analysis is driven by a novel measure of problem complexity, and it is first given in the general dwell-time case, after which it is specialized to the unconstrained case. We exemplify the framework for networked control systems where the minimizer signal is a discrete time delay on the control channel, and we provide extensive simulations and a real-time experiment for nonlinear systems of this type.
