2016
DOI: 10.1016/j.artint.2016.02.004
Belief and truth in hypothesised behaviours

Abstract: There is a long history in game theory on the topic of Bayesian or "rational" learning, in which each player maintains beliefs over a set of alternative behaviours, or types, for the other players. This idea has gained increasing interest in the artificial intelligence (AI) community, where it is used as a method to control a single agent in a system composed of multiple agents with unknown behaviours. The idea is to hypothesise a set of types, each specifying a possible behaviour for the other agents, and to …

Cited by 35 publications (27 citation statements)
References 60 publications
“…The learning problem for this agent is made difficult by the fact that information is censored , i.e., the publisher knows if an impression is sold but no other quantitative information. We address this problem using the Harsanyi-Bellman Ad Hoc Coordination (HBA) algorithm [1,3], which conceptualises this interaction in terms of a Stochastic Bayesian Game and arrives at optimal actions by best responding with respect to probabilistic beliefs maintained over a candidate set of opponent behaviour profiles. We adapt and apply HBA to the censored information setting of ad exchanges.…”
mentioning
confidence: 99%
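The passage above describes the core loop of HBA-style reasoning: maintain probabilistic beliefs over a candidate set of opponent types, and best respond with respect to those beliefs. A minimal sketch of that loop follows; the type names, candidate set, and payoff table are illustrative assumptions, not taken from the paper.

```python
def update_beliefs(beliefs, types, observed_action, state):
    """Bayesian posterior over hypothesised opponent types after one observed
    opponent action: P(type | action) is proportional to P(action | type) * P(type)."""
    posterior = {
        name: beliefs[name] * policy(state).get(observed_action, 0.0)
        for name, policy in types.items()
    }
    z = sum(posterior.values())
    if z == 0.0:  # no hypothesised type explains the action; keep the prior
        return dict(beliefs)
    return {name: p / z for name, p in posterior.items()}

def best_response(beliefs, types, state, my_actions, payoff):
    """Choose the action maximising expected payoff under the current beliefs,
    marginalising over types and each type's action distribution."""
    def expected(a):
        return sum(
            beliefs[name] * prob * payoff(a, opp_a)
            for name, policy in types.items()
            for opp_a, prob in policy(state).items()
        )
    return max(my_actions, key=expected)

# Two hypothetical opponent types in a toy coordination game.
types = {
    "cooperator": lambda s: {"C": 0.9, "D": 0.1},
    "defector":   lambda s: {"C": 0.1, "D": 0.9},
}
beliefs = {"cooperator": 0.5, "defector": 0.5}  # uniform prior
beliefs = update_beliefs(beliefs, types, "C", state=None)
action = best_response(beliefs, types, None, ["C", "D"],
                       payoff=lambda a, b: 1.0 if a == b else 0.0)
```

After observing a cooperative action, the posterior shifts toward the "cooperator" type and the best response is to coordinate. The full HBA algorithm additionally plans over a Stochastic Bayesian Game rather than a single stage, which this sketch omits.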
“…We propose that this problem may be addressed by drawing on recent developments in machine learning, which allow tractable learning despite the incompleteness of models. In particular, we use the Harsanyi-Bellman Ad Hoc Coordination (HBA) algorithm [1,3], which conceptualises the interaction in terms of a space of 'types' (or opponent policies), over which the procedure maintains beliefs and uses the beliefs to guide optimal action selection. The attraction of this algorithm is that it can be shown to be optimal even when the hypothesised type space is not exactly correct but only approximates the possible behaviours of the opponent.…”
mentioning
confidence: 99%
“…This includes an algorithm for online planning in ad hoc teams (OPAT) [74] that solves a series of stage games assuming that other agents are optimal, with the utility at each stage computed using Monte Carlo tree search. Albrecht and Ramamoorthy [6,8] generalize Bayesian games [43] to model the uncertainty about other agents' user-defined types and construct a Harsanyi-Bellman Ad Hoc Coordination (HBA) game that is solved online using learning. However, establishing common knowledge of the prior distribution over types to facilitate the solution of HBAs is problematic in ad hoc settings.…”
Section: Related Work
mentioning
confidence: 99%
“…Recent work on Stochastic Bayesian Games has compared several ways to incorporate observations into beliefs over opponent types when those types are re-drawn after every state transition [1]. In contrast, we assume that opponents are redrawn for interactions over several repeated stochastic games.…”
Section: Introduction
mentioning
confidence: 99%
“…Second, in an online phase a random process pairs the learning agent against the opponent for a stochastic game. The learning agent has no control over this process and does not observe the opponent identity. When the game finishes the learning agent receives an observation (reward) and updates the belief accordingly.…”
Section: Introduction
mentioning
confidence: 99%
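In the setting quoted above, the agent never sees the opponent's identity, only a scalar reward at the end of each game, and must update its beliefs from that alone. A sketch of such an update, assuming hypothetical per-type reward likelihood models (these are illustrative assumptions, not from the paper):

```python
def update_from_reward(beliefs, reward_likelihood, reward):
    """Bayesian update from an end-of-game reward only:
    P(type | reward) is proportional to P(reward | type) * P(type)."""
    posterior = {t: p * reward_likelihood[t](reward) for t, p in beliefs.items()}
    z = sum(posterior.values())
    if z == 0.0:  # reward inconsistent with every hypothesis; keep the prior
        return dict(beliefs)
    return {t: p / z for t, p in posterior.items()}

# Hypothetical opponents: against type "A" a positive reward is likely,
# against type "B" it is unlikely.
likelihoods = {
    "A": lambda r: 0.8 if r > 0 else 0.2,
    "B": lambda r: 0.3 if r > 0 else 0.7,
}
beliefs = update_from_reward({"A": 0.5, "B": 0.5}, likelihoods, reward=1.0)
```

A single positive reward shifts belief toward type "A"; repeated interactions sharpen the posterior, which is what makes learning possible despite the censored observations described in the earlier citation statement.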