Calculating optimal policies is known to be computationally difficult for Markov decision processes (MDPs) with Borel state and action spaces. This paper studies finite-state approximations of discrete-time Markov decision processes with Borel state and action spaces, for both discounted and average cost criteria. The stationary policies thus obtained are shown to approximate the optimal stationary policy with arbitrary precision, under quite general conditions for the discounted cost and more restrictive conditions for the average cost. For compact-state MDPs, we obtain explicit rate-of-convergence bounds quantifying how the approximation improves as the size of the approximating finite state space increases. Using information-theoretic arguments, the order optimality of the obtained convergence rates is established for a large class of problems. We also show that, as a pre-processing step, the action space can be approximated by a finite set with a sufficiently large number of points; thereby, well-known algorithms, such as value or policy iteration and Q-learning, can be used to compute near-optimal policies.
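The quantize-then-solve approach described above admits a compact computational sketch. The following is a minimal illustration, not the paper's construction: a uniform quantization of a one-dimensional compact state space [0, 1] with a finite action set, a Monte Carlo estimate of the quantized transition matrix, and standard value iteration on the resulting finite MDP. The dynamics, cost, and all parameters are hypothetical placeholders.

```python
import numpy as np

# Illustrative Borel-state dynamics and cost (placeholders, not from the paper).
def transition(x, a):
    return np.clip(0.7 * x + 0.2 * a + np.random.uniform(-0.05, 0.05), 0.0, 1.0)

def cost(x, a):
    return (x - 0.5) ** 2 + 0.1 * a ** 2

n_bins, actions, beta = 50, np.array([0.0, 0.5, 1.0]), 0.9
centers = (np.arange(n_bins) + 0.5) / n_bins  # representative points of the bins

# Estimate the quantized transition matrix by sampling the true kernel.
P = np.zeros((n_bins, len(actions), n_bins))
for i, x in enumerate(centers):
    for j, a in enumerate(actions):
        samples = np.array([transition(x, a) for _ in range(1000)])
        idx = np.minimum((samples * n_bins).astype(int), n_bins - 1)
        P[i, j] = np.bincount(idx, minlength=n_bins) / len(idx)

C = np.array([[cost(x, a) for a in actions] for x in centers])

# Standard value iteration on the finite approximating model.
V = np.zeros(n_bins)
for _ in range(500):
    Q = C + beta * P @ V            # shape (n_bins, n_actions)
    V_new = Q.min(axis=1)
    if np.abs(V_new - V).max() < 1e-8:
        break
    V = V_new
policy = Q.argmin(axis=1)           # near-optimal policy on the quantized model
```

Extending the quantized policy to the original state space (e.g., piecewise constant over the bins) yields the approximating stationary policy whose suboptimality the paper's bounds control.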
Abstract. This paper is concerned with the properties of sets of strategic measures induced by admissible team policies in decentralized stochastic control, and with convexity properties in dynamic team problems. To facilitate a convex-analytic approach, strategic measures for team problems are introduced. Properties such as convexity, compactness, and Borel measurability under the weak convergence topology are studied, and sufficient conditions for each of these properties are presented. These lead to existence and structural results for optimal policies. It is shown that the set of strategic measures for teams that are not classical is in general non-convex, but the extreme points of a relaxed set consist of deterministic team policies, which leads to their optimality for a given team problem under an expected cost criterion. Externally provided independent common randomness for static teams, or private randomness for dynamic teams, does not improve team performance. The problem of when a sequential team problem is convex is studied, and necessary and sufficient conditions are presented for problems that include teams with a non-classical information structure. Implications of this analysis for identifying probability- and information-structure-dependent convexity properties are presented.

Key words. Stochastic control, decentralized control, optimal control, convex analysis.

AMS subject classifications. 93E03, 90B99, 49J55

1. Introduction. Team decision theory has its roots in control theory and economics. Marschak [37] was perhaps the first to introduce the basic elements of teams, and to provide the first steps toward the development of a team theory. Radner [42] provided foundational results for static teams, establishing connections between person-by-person optimality, stationarity, and team-optimality [38]. The contributions of Witsenhausen [56, 57, 58, 54, 53] on dynamic teams and the characterization of information structures have been crucial in the progress of our understanding of dynamic teams. We refer the reader to Section 1.1, where Witsenhausen's intrinsic model and characterization of information structures are discussed in detail. Further discussion on the design of information structures in the context of team theory is available in [5, 48, 61].

Convexity is a very important property for optimization problems. A property related to convex analysis that is relevant in team problems is the characterization of the sets of strategic measures; these are the probability measures induced on the exogenous variables and the measurement and action spaces by admissible control policies. In the context of single-decision-maker control problems, such measures have been studied extensively in [45, 41, 24, 27]. To our knowledge, a study of strategic measures for team problems has not been made, and it will be observed in this paper that many of the properties that are natural for fully observed single-decision-maker stochastic control problems, such as convexity, do not generally extend to a large class of stochastic team problem...
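For concreteness, the display below sketches what a strategic measure looks like for a static team, in notation assumed here for illustration: an exogenous variable ω with law μ, measurement kernels Q^i, and a deterministic team policy γ = (γ^1, …, γ^N). This is a schematic definition under those assumptions, not the paper's exact formulation.

```latex
% Strategic measure induced by a deterministic team policy
% \gamma = (\gamma^1, \dots, \gamma^N) in a static team: the joint law on
% the exogenous variable, measurements, and actions, where u^i = \gamma^i(y^i).
P^{\gamma}(B) \;=\; \int \mathbf{1}_{B}\!\left(\omega, y^{1:N}, u^{1:N}\right)
  \, \mu(d\omega) \prod_{i=1}^{N} Q^{i}\!\left(dy^{i} \mid \omega\right)
  \, \delta_{\gamma^{i}(y^{i})}\!\left(du^{i}\right)
```

The convexity question then concerns the set of all such measures as γ ranges over admissible (possibly randomized) team policies.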
In this paper, we consider discrete-time dynamic games of the mean-field type with a finite number N of agents, subject to an infinite-horizon discounted-cost optimality criterion. The state space of each agent is a locally compact Polish space. At each time, the agents are coupled through the empirical distribution of their states, which affects both the agents' individual costs and their state transition probabilities. We introduce a new solution concept, the Markov-Nash equilibrium, under which a policy is player-by-player optimal in the class of all Markov policies. Under mild assumptions, we demonstrate the existence of a mean-field equilibrium in the infinite-population limit N → ∞, and then show that the policy obtained from the mean-field equilibrium constitutes an approximate Markov-Nash equilibrium when the number of agents N is sufficiently large.
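The equilibrium concept above is typically computed as a fixed point: freeze the population measure, solve each agent's best-response MDP, then update the measure under the closed loop. The sketch below illustrates this loop under strong simplifications (finite state and action sets standing in for the Polish spaces, and a stationary measure rather than the full infinite-horizon flow); every kernel, cost, and parameter is an illustrative placeholder.

```python
import numpy as np

n_s, n_a, beta = 5, 3, 0.9
rng = np.random.default_rng(0)
base_P = rng.dirichlet(np.ones(n_s), size=(n_s, n_a))  # placeholder kernel P[x, a, x']

def kernel(mu):
    # Transition kernel coupled to the population measure mu (illustrative).
    P = 0.9 * base_P + 0.1 * mu[None, None, :]
    return P / P.sum(axis=2, keepdims=True)

def stage_cost(mu):
    # Per-agent cost coupled to the population measure (illustrative).
    return np.add.outer(np.arange(n_s) / n_s, np.arange(n_a) / n_a) + mu.var()

mu = np.full(n_s, 1.0 / n_s)
for _ in range(200):
    P, C = kernel(mu), stage_cost(mu)
    # Best response: value iteration against the frozen measure mu.
    V = np.zeros(n_s)
    for _ in range(500):
        Q = C + beta * P @ V
        V = Q.min(axis=1)
    pi = Q.argmin(axis=1)
    # Measure update: stationary distribution of the closed-loop chain.
    P_pi = P[np.arange(n_s), pi]        # (n_s, n_s) closed-loop kernel
    for _ in range(500):
        mu = mu @ P_pi
# A fixed point (mu, pi) of this map is a (stationary) mean-field equilibrium.
```

Adopting the limiting policy pi in the N-agent game is what yields the approximate Markov-Nash property for large N.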
Establishing the existence of Nash equilibria for partially observed stochastic dynamic games is known to be quite challenging, with the difficulties stemming from the noisy nature of the measurements available to individual players (agents) and the decentralized nature of this information. When the number of players is sufficiently large and the interactions among agents are of the mean-field type, one way to overcome this challenge is to investigate the infinite-population limit of the problem, which leads to a mean-field game. In this paper, we consider discrete-time partially observed mean-field games with infinite-horizon discounted-cost criteria. Using the technique of converting the original partially observed stochastic control problem to a fully observed one on the belief space, together with the dynamic programming principle, we establish the existence of Nash equilibria for these game models under very mild technical conditions. Then, we show that the mean-field equilibrium policy, when adopted by each agent, forms an approximate Nash equilibrium for games with sufficiently many agents.
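The belief-space conversion mentioned in the abstract can be made concrete with a one-step Bayes filter: the controlled state of the fully observed problem is the posterior distribution over the hidden state. The sketch below uses finite hidden-state and observation sets standing in for the general spaces; the kernels and numbers are illustrative assumptions, not from the paper.

```python
import numpy as np

def belief_update(b, a, y, P, O):
    """One step of the nonlinear filter underlying the belief MDP.

    b : (n_s,) current belief over hidden states
    P : (n_s, n_a, n_s) transition kernel P[x, a, x']
    O : (n_s, n_y) observation likelihoods O[x', y]
    """
    pred = b @ P[:, a, :]       # predict: push the belief through the dynamics
    post = pred * O[:, y]       # correct: reweight by the observation likelihood
    return post / post.sum()    # normalized posterior = next belief state

# Illustrative two-state, two-action, two-observation example.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.3, 0.7], [0.6, 0.4]]])   # P[x, a, x']
O = np.array([[0.8, 0.2],
              [0.1, 0.9]])                  # O[x', y]
b = np.array([0.5, 0.5])
b = belief_update(b, a=0, y=1, P=P, O=O)    # belief after acting and observing
```

Dynamic programming is then carried out on these belief states, which is what allows the fully observed machinery to deliver the existence results.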