Solving Hidden-Semi-Markov-Mode Markov Decision Problems

Hadoux, Emmanuel; Beynier, Aurélie; Weng, Paul

doi:10.1007/978-3-319-11508-5_15

Cited by 10 publications

(10 citation statements)

References 11 publications

“…Future work will explore more sophisticated models of the adversaries. We would like to include the temporal dimension in the context changes as it could have been done in HS3MDPs [28]. Models inspired from behavioral economics could also be useful to develop more accurate profile of the adversaries.…”

Section: Results (mentioning)

confidence: 99%

A Multiagent Planning Approach for Cooperative Patrolling with Non-Stationary Adversaries

Beynier

2017

Int. J. Artif. Intell. Tools

Self Cite

Multiagent patrolling is the problem faced by a set of agents that have to visit a set of sites to prevent or detect some threats or illegal actions. Although it is commonly assumed that patrollers share a common objective, the issue of cooperation between the patrollers has received little attention. Over the last years, the focus has been put on patrolling strategies to prevent a one-shot attack from an adversary. This adversary is usually assumed to be fully rational and to have full observability of the system. Most approaches are then based on game theory and consist in computing a best response strategy. Nonetheless, when patrolling frontiers, detecting illegal fishing or poaching, patrollers face multiple adversaries with limited observability and rationality. Moreover, adversaries can perform multiple illegal actions over time and space and may change their strategies as time passes. In this paper, we propose a multiagent planning approach that enables effective cooperation between a team of patrollers in uncertain environments. Patrolling agents are assumed to have partial observability of the system. Our approach allows the patrollers to learn a generic and stochastic model of the adversaries based on the history of observations. A wide variety of adversaries can thus be considered with strategies ranging from random behaviors to fully rational and informed behaviors. We show that the multiagent planning problem can be formalized by a non-stationary DEC-POMDP. To deal with the non-stationarity, we introduce the notion of context. We then describe an evolutionary algorithm to compute patrolling strategies on-line, and we propose methods to improve the patrollers’ performance.

“…Following previous works dealing with non-stationary mono-agent POMDPs [28], we decompose the non-stationary decision problem as a series of stationary decision problems. Each stationary phase is then referred to as a mode or a context.…”

Section: Dec-pomdp Model For a Current Context (mentioning)

confidence: 99%

A Multiagent Planning Approach for Cooperative Patrolling with Non-Stationary Adversaries

Beynier

2017

Int. J. Artif. Intell. Tools

Self Cite

“…Besides, the parameter may not be observable. In that case, a model like Hidden-Semi-Markov-Mode MDP proposed by [8] could be exploited.…”

Section: Results (mentioning)

confidence: 99%

Optimal Threshold Policies for Robust Data Center Control

Weng, Qiu, Costanzo, et al.

2017

Lecture Notes in Electrical Engineering

Self Cite

With the simultaneous rise of energy costs and demand for cloud computing, efficient control of data centers becomes crucial. In the data center control problem, one needs to plan at every time step how many servers to switch on or off in order to meet stochastic job arrivals while trying to minimize electricity consumption. This problem becomes particularly challenging when servers can be of various types and jobs from different classes can only be served by certain types of server, as is often the case in real data centers. We model this problem as a robust Markov Decision Process (i.e., the transition function is not assumed to be known precisely). We give sufficient conditions (which seem to be reasonable and satisfied in practice) guaranteeing that an optimal threshold policy exists. This property can then be exploited in the design of an efficient solving method, which we provide. Finally, we present some experimental results demonstrating the practicability of our approach and compare with a previous related approach based on model predictive control.

“…Inspired by the notion of modes used in mono-agent POMDP [18], our approach consists in defining a DEC-POMDP for a given distribution P_I which will be referred as the current context of the decision making. As discussed later in the paper, the DEC-POMDP definition will have to be updated as the context evolves.…”

Section: B Patrollers' Dec-pomdp Formulation (mentioning)

confidence: 99%

Cooperative Multiagent Patrolling for Detecting Multiple Illegal Actions under Uncertainty

Beynier

2016

2016 IEEE 28th International Conference on Tools With Artificial Intelligence (ICTAI)

Multiagent patrolling in adversarial domains has been widely studied in recent years. However, little attention has been paid to cooperation issues between patrolling agents. Moreover, most existing works focus on one-shot attacks and assume full rationality of the adversaries. Nonetheless, when patrolling frontiers, detecting illegal fishing or poaching, security forces face several adversaries with limited observability and rationality that perform multiple illegal actions spread in time and space. In this paper, we develop a cooperative approach to improve defenders' efficiency in such settings. We propose a new formalization of multiagent patrolling problems allowing for effective cooperation between the defenders. Our work accounts for uncertainty on action outcomes and partial observability of the system. Unlike existing security games, a generic model of the opponents is considered, thus handling limited observability and bounded rationality of the adversaries. We then describe a learning mechanism allowing the defenders to take advantage of their observations about the adversaries and to compute cooperative patrolling strategies accordingly.
