From the early days of computing, games have been important testbeds for studying how well machines can do sophisticated decision making. In recent years, machine learning has made dramatic advances with artificial agents reaching superhuman performance in challenge domains like Go, Atari, and some variants of poker. As with their predecessors of chess, checkers, and backgammon, these game domains have driven research by providing sophisticated yet well-defined challenges for artificial intelligence practitioners. We continue this tradition by proposing the game of Hanabi as a new challenge domain with novel problems that arise from its combination of purely cooperative gameplay with two to five players and imperfect information. In particular, we argue that Hanabi elevates reasoning about the beliefs and intentions of other agents to the foreground. We believe developing novel techniques for such theory of mind reasoning will not only be crucial for success in Hanabi, but also in broader collaborative efforts, especially those with human partners. To facilitate future research, we introduce the open-source Hanabi Learning Environment, propose an experimental framework for the research community to evaluate algorithmic advances, and assess the performance of current state-of-the-art techniques.
6 One such equilibrium occurs when players do not intentionally communicate information to other players, and ignore what other players tell them (historically called a pooling equilibrium in pure signalling games [15], or a babbling equilibrium in later work using cheap talk [16]). In this case, there is no incentive for a player to start communicating because they will be ignored, and there is no incentive to pay attention to other players because they are not communicating.
7 In pure signalling games where actions are purely communicative, policies are often referred to as communication protocols. Though Hanabi is not such a pure signalling game, when we want to emphasize the communication properties of an agent's policy we will still refer to its communication protocol.
8 We use the word convention to refer to the parts of a communication protocol or policy that interrelate. Technically, these can be thought of as constraints on the policy to enact the convention.
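The babbling equilibrium described in the footnote above can be checked by brute force on a toy game. The sketch below is only an illustration under stated assumptions: the two-state, two-message, two-action cooperative signalling game and every name in it (`team_payoff`, `is_nash`, `babbling`, `informative`) are hypothetical and are not taken from the Hanabi paper or the Hanabi Learning Environment.

```python
from itertools import product

# Hypothetical toy game (an assumption, not Hanabi): the sender sees a hidden
# state, sends a message, and the receiver picks an action. Both players share
# a reward of 1 when the action matches the state.
STATES = (0, 1)      # equally likely hidden states
MESSAGES = (0, 1)
ACTIONS = (0, 1)

def team_payoff(sender, receiver):
    """Expected shared reward for sender[state] -> message, receiver[message] -> action."""
    return sum(receiver[sender[s]] == s for s in STATES) / len(STATES)

# All deterministic policies, encoded as tuples indexed by state or message.
SENDERS = list(product(MESSAGES, repeat=len(STATES)))
RECEIVERS = list(product(ACTIONS, repeat=len(MESSAGES)))

def is_nash(sender, receiver):
    """True if neither player gains by a unilateral deviation.

    Checking deterministic deviations is enough here because the expected
    payoff is linear in a mixed deviation.
    """
    base = team_payoff(sender, receiver)
    sender_ok = all(team_payoff(s, receiver) <= base for s in SENDERS)
    receiver_ok = all(team_payoff(sender, r) <= base for r in RECEIVERS)
    return sender_ok and receiver_ok

# Babbling: the message ignores the state, and the action ignores the message.
babbling = ((0, 0), (0, 0))
# Informative protocol: the message reveals the state, and the receiver follows it.
informative = ((0, 1), (0, 1))

if __name__ == "__main__":
    for name, (s, r) in (("babbling", babbling), ("informative", informative)):
        print(name, team_payoff(s, r), is_nash(s, r))
```

In this toy game both profiles pass the equilibrium check, but the babbling profile earns only half the reward of the informative one, which is the trap the footnote describes: no single player can escape it alone.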
Problems of cooperation, in which agents seek ways to jointly improve their welfare, are ubiquitous and important. They can be found at scales ranging from our daily routines (such as driving on highways, scheduling meetings, and working collaboratively) to our global challenges (such as peace, commerce, and pandemic preparedness). Arguably, the success of the human species is rooted in our ability to cooperate. Since machines powered by artificial intelligence are playing an ever greater role in our lives, it will be important to equip them with the capabilities necessary to cooperate and to foster cooperation.
We see an opportunity for the field of artificial intelligence to explicitly focus effort on this class of problems, which we term Cooperative AI. The objective of this research would be to study the many aspects of the problems of cooperation and to innovate in AI to contribute to solving these problems. Central goals include building machine agents with the capabilities needed for cooperation, building tools to foster cooperation in populations of (machine and/or human) agents, and otherwise conducting AI research for insight relevant to problems of cooperation. This research integrates ongoing work on multi-agent systems, game theory and social choice, human-machine interaction and alignment, natural-language processing, and the construction of social tools and platforms. However, Cooperative AI is not the union of these existing areas, but rather an independent bet about the productivity of specific kinds of conversations that involve these and other areas. We see opportunity to more explicitly focus on the problem of cooperation, to construct unified theory and vocabulary, and to build bridges with adjacent communities working on cooperation, including in the natural, social, and behavioural sciences.
Conversations on Cooperative AI can be organized in part in terms of the dimensions of cooperative opportunities. These include the strategic context, the extent of common versus conflicting interest, the kinds of entities who are cooperating, and whether researchers take the perspective of an individual or of a social planner. Conversations can also be focused on key capabilities necessary for cooperation, such as understanding, communication, cooperative commitments, and cooperative institutions. Finally, research should study the potential downsides of cooperative capabilities, such as exclusion and coercion, and how to channel cooperative capabilities to best improve human welfare. This research would connect AI research to the broader scientific enterprise studying the problem of cooperation, and to the broader social effort to solve cooperation problems. This conversation will continue at: www.cooperativeAI.com
This paper provides several theoretical results for empirical game theory. Specifically, we introduce bounds for empirical game-theoretical analysis of complex multi-agent interactions. In doing so, we provide insights into the empirical meta-game, showing that a Nash equilibrium of the estimated meta-game is an approximate Nash equilibrium of the true underlying meta-game. We investigate and show how many data samples are required to obtain a close enough approximation of the underlying game. Additionally, we extend the evolutionary dynamics analysis of meta-games using heuristic payoff tables (HPTs) to asymmetric games. The state-of-the-art has only considered evolutionary dynamics of symmetric HPTs in which agents have access to the same strategy sets and the payoff structure is symmetric, implying that agents are interchangeable. Finally, we carry out an empirical illustration of the generalised method in several domains, illustrating the theory and evolutionary dynamics of several versions of the AlphaGo algorithm (symmetric), the dynamics of the Colonel Blotto game played by human players on Facebook (symmetric), the dynamics of several teams of players in the capture the flag game (symmetric), and an example of a meta-game in Leduc Poker (asymmetric), generated by the policy-space response oracle multi-agent learning algorithm.
Keywords Empirical games • Asymmetric games • Replicator dynamics
1 Introduction
Using game theory to examine multi-agent interactions in complex systems is a non-trivial task, especially when a payoff table or normal form representation is not directly available. Works by Walsh et al. [39,40], Wellman et al. [43,44], and Phelps et al. [23] have shown the great potential of using heuristic strategies and empirical game theory to examine such
Karl Tuyls
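The claim that an equilibrium of the estimated meta-game is an approximate equilibrium of the true one can be illustrated with a standard argument: if every payoff entry is estimated to within ε, then the regret of any strategy profile changes by at most 2ε between the two games. The sketch below checks this numerically on a random two-player game and adds a textbook Hoeffding sample count with a union bound. It is an assumption-laden illustration, not the paper's own bound or code: payoffs are assumed to lie in [0, 1], and all names (`max_regret`, `hoeffding_samples`, `regret_pair`) are hypothetical.

```python
import math
import random

def expected_payoff(matrix, x, y):
    """Expected payoff when the row player mixes with x and the column player with y."""
    return sum(x[i] * y[j] * matrix[i][j]
               for i in range(len(x)) for j in range(len(y)))

def max_regret(A, B, x, y):
    """Largest gain either player could obtain by switching to a pure strategy.

    A holds the row player's payoffs and B the column player's. A profile is
    a Nash equilibrium exactly when this value is zero.
    """
    best_row = max(sum(y[j] * A[i][j] for j in range(len(y))) for i in range(len(x)))
    best_col = max(sum(x[i] * B[i][j] for i in range(len(x))) for j in range(len(y)))
    return max(best_row - expected_payoff(A, x, y),
               best_col - expected_payoff(B, x, y))

def hoeffding_samples(eps, delta, num_entries):
    """Samples per payoff entry so that, with probability at least 1 - delta,
    every empirical mean is within eps of its true value (payoffs in [0, 1];
    generic Hoeffding plus union bound, assumed rather than taken from the paper)."""
    return math.ceil(math.log(2 * num_entries / delta) / (2 * eps ** 2))

def random_simplex(rng, n):
    """A random mixed strategy over n pure strategies."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def regret_pair(seed=0, n=3, eps=0.05):
    """Regret of one random profile in a true game and in an eps-perturbed estimate."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n)] for _ in range(n)]
    B = [[rng.random() for _ in range(n)] for _ in range(n)]
    A_hat = [[a + rng.uniform(-eps, eps) for a in row] for row in A]
    B_hat = [[b + rng.uniform(-eps, eps) for b in row] for row in B]
    x, y = random_simplex(rng, n), random_simplex(rng, n)
    return max_regret(A, B, x, y), max_regret(A_hat, B_hat, x, y)

if __name__ == "__main__":
    true_regret, estimated_regret = regret_pair()
    print(abs(true_regret - estimated_regret) <= 2 * 0.05)
    print(hoeffding_samples(eps=0.1, delta=0.05, num_entries=9))
```

Because the regret gap is bounded for every profile, a profile with zero regret in the estimated game has regret at most 2ε in the true game, which is the sense of "approximate Nash equilibrium" used here.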
We propose a connected prescription formula in twistor space for all tree-level form factors of the stress tensor multiplet operator in N = 4 super Yang-Mills, which is a generalisation of the expression of Roiban, Spradlin and Volovich for superamplitudes. By introducing link variables, we show that our formula is identical to the recently proposed four-dimensional scattering equations for form factors. Similarly to the case of amplitudes, the link representation of form factors is shown to be directly related to BCFW recursion relations, and is considerably more tractable than the scattering equations. We also discuss how our results are related to a recent Grassmannian formulation of form factors, and comment on a possible derivation of our formula from ambitwistor strings.