The Hawk-Dove mathematical game offers a paradigm of the trade-offs associated with aggressive and passive behaviors. When two (or more) populations of players (animals, insect populations, countries in military conflict, economic competitors, microbial communities, populations of co-evolving tumor cells, or reinforcement learners adopting different strategies) compete, their success or failure can be measured by their frequency in the population (successful behavior is reinforced, unsuccessful behavior is not), and the system is governed by the replicator dynamical system. We develop a time-dependent optimal-adaptive control theory for this nonlinear dynamical system in which the payoffs of the Hawk-Dove payoff matrix are dynamically altered (dynamic incentives) to produce (bang-bang) control schedules that (i) maximize the aggressive population at the end of time T, and (ii) minimize the aggressive population at the end of time T. These two distinct time-dependent strategies produce upper and lower bounds on the outcomes from all strategies since they represent two extremizers of the cost function using the Pontryagin maximum (minimum) principle. We extend the results forward to times nT (n = 1, ..., 5) in an adaptive way that uses the optimal value at the end of time nT to produce the new schedule for time (n + 1)T. Two special schedules and initial conditions are identified that produce absolute maximizers and minimizers over an arbitrary number of cycles for 0 ≤ T ≤ 3. For T > 3, our optimal schedules can drive either population to extinction or fixation. The method described can be used to produce optimal dynamic incentive schedules for many different applications in which the 2 × 2 replicator dynamics is used as a governing model.
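The replicator dynamics referred to above can be written, for two strategies, as a single ODE for the Hawk frequency x: x' = x(1 − x)(f_H − f_D), where f_H and f_D are the frequency-dependent fitnesses from the payoff matrix. A minimal sketch follows, using the standard Hawk-Dove parametrization with resource value V and fight cost C; these parameter values and function names are illustrative choices, not values from the paper.

```python
# Minimal sketch of 2x2 replicator dynamics for the Hawk-Dove game.
# V (resource value) and C (cost of fighting) are illustrative, not
# the paper's payoff values.

def hawk_dove_payoff(V=2.0, C=4.0):
    """Payoff matrix: rows = focal strategy (Hawk, Dove), columns = opponent."""
    return [[(V - C) / 2, V],
            [0.0, V / 2]]

def replicator_step(x, A, dt=0.01):
    """One Euler step of x' = x(1-x)(f_H - f_D), x = Hawk frequency."""
    f_hawk = A[0][0] * x + A[0][1] * (1 - x)
    f_dove = A[1][0] * x + A[1][1] * (1 - x)
    return x + dt * x * (1 - x) * (f_hawk - f_dove)

def simulate(x0=0.1, V=2.0, C=4.0, steps=5000, dt=0.01):
    A = hawk_dove_payoff(V, C)
    x = x0
    for _ in range(steps):
        x = replicator_step(x, A, dt)
    return x

# With C > V the interior equilibrium x* = V/C is attracting; here
# V/C = 0.5, so the Hawk frequency converges to 0.5.
print(round(simulate(), 3))
```

A dynamic incentive in the sense of the abstract would make the entries of `A` time-dependent, switching between extreme values on a bang-bang schedule rather than holding them fixed as this sketch does.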
We model Covid-19 vaccine uptake as a reinforcement learning dynamic between two populations: the vaccine adopters and the vaccine hesitant. Using data available from the Centers for Disease Control and Prevention (CDC), we calculate a payoff matrix governing the dynamic interaction between these two groups and show they are playing a Hawk-Dove evolutionary game with an internal evolutionarily stable Nash equilibrium (the asymptotic percentage of vaccinated in the population). We then ask whether vaccine adoption can be improved by implementing dynamic incentive schedules that reward/punish the vaccine hesitant, and if so, what schedules are optimal and how effective they are likely to be. When is the optimal time to start an incentive program, and how large should the incentives be? By using a tailored replicator dynamic reinforcement learning model together with optimal control theory, we show that well-designed and well-timed incentive programs can improve vaccine uptake by shifting the Nash equilibrium upward in large populations, but only to a limited extent: incentive sizes above a certain threshold show diminishing returns.
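For a 2 × 2 game with payoff matrix [[a, b], [c, d]] of Hawk-Dove type (b > d and c > a), the interior Nash equilibrium sits at x* = (b − d)/((b − d) + (c − a)). The sketch below shows how a flat incentive added to one population's payoffs shifts that equilibrium upward, as the abstract describes. The matrix entries and incentive sizes are hypothetical placeholders, not the CDC-derived values used in the paper.

```python
# Hypothetical sketch: a flat incentive added to row-1 (adopter) payoffs
# shifts the interior Nash equilibrium of a Hawk-Dove-type 2x2 game.
# Payoff entries are illustrative, not the paper's fitted values.

def interior_equilibrium(A):
    """x* = (b - d) / ((b - d) + (c - a)) for A = [[a, b], [c, d]],
    valid when b > d and c > a (Hawk-Dove structure)."""
    a, b = A[0]
    c, d = A[1]
    return (b - d) / ((b - d) + (c - a))

base = [[-1.0, 2.0],
        [0.0, 1.0]]  # adopters (row 1) vs hesitant (row 2), illustrative

for incentive in (0.0, 0.4, 0.8):
    # reward the adopter strategy regardless of opponent
    A = [[base[0][0] + incentive, base[0][1] + incentive], base[1]]
    print(incentive, round(interior_equilibrium(A), 3))
```

In this toy parametrization the shift is linear in the incentive; the diminishing returns reported in the abstract emerge from the full time-dependent optimal control analysis, not from this static equilibrium formula.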