Most of the work in evolutionary game theory starts with a model of a social situation that gives rise to a particular payoff matrix and analyses how behaviour evolves through natural selection. Here, we invert this approach and ask, given a model of how individuals behave, how the payoff matrix will evolve through natural selection. In particular, we ask whether a prisoner's dilemma game is stable against invasions by mutant genotypes that alter the payoffs. To answer this question, we develop a two-tiered framework with goal-oriented dynamics at the behavioural time scale and a diploid population genetic model at the evolutionary time scale. Our results are two-fold: first, we show that the prisoner's dilemma is subject to invasions by mutants that provide incentives for cooperation to their partners, and that the resulting game is a coordination game similar to the hawk-dove game. Second, we find that for a large class of mutants and symmetric games, a stable genetic polymorphism will exist in the locus determining the payoff matrix, resulting in a complex pattern of behavioural diversity in the population. Our results highlight the importance of considering the evolution of payoff matrices to understand the evolution of animal social systems.

Keywords: evolutionary game theory; prisoner's dilemma; hawk-dove game; behavioural dynamics; two-tiered model
INTRODUCTION

Evolutionary game theory (EGT) is one of the fundamental tools for studying how the behaviour and traits of organisms evolve by natural selection. An evolutionary game is defined by a set of genetical strategies and the fitness (i.e. reproductive output) each obtains when interacting with the others. The genotypes increase or decrease in frequency according to their fitness, given the frequencies of the other strategies in the population. This process frequently (but not always) leads to an evolutionarily stable strategy (ESS), which is a strategy that, when fixed in the population, cannot be invaded by alternative strategies. Earlier EGT models in biology tended to assume that the genetical strategies correspond to actual behaviours [1], or to simple conditional rules that prescribe a certain behaviour given the state of the individual and the interaction (e.g. the tit-for-tat strategy [2]). More recent work has focused on interactions where individuals' behaviour is not directly determined by their genes, but instead reflects the outcome of a dynamical process in which the players respond to each other according to proximate mechanisms that prescribe their behaviour [3][4][5][6][7]. In particular, Roughgarden [8] calls for an explicitly two-tiered conception of behavioural evolution: the first tier describes the dynamics of behaviour within the time scale of an interaction, where individuals can adjust their actions in response to the context and the behaviours of others. The second tier, on the other