2 Equilibrium Interactions

2.1 Strategic Bandits

Strategic bandit models are game-theoretic versions of standard bandit models. While the standard "multi-armed bandit" describes a hypothetical experiment in which a player faces several slot machines ("one-armed bandits") with potentially different expected payouts, a strategic bandit involves several players facing (usually identical) copies of the same slot machine. Players want to stick with the slot machine if and only if the best payout rate makes it worth their time, and they learn not only from their own outcomes but also from their neighbors'.

Equilibrium strategies are not characterized by simple cutoffs in terms of the common belief. As a result, solutions to strategic bandits are known only for a limited class of distributions, involving two states of the world only. In Bolton and Harris (1999, BH), the observation process (of payoffs) follows a Brownian motion whose drift depends on the state. In Keller, Rady and Cripps (2005, KRC), it follows a simple Poisson process, with positive lump sums ("breakthroughs") occurring at random (exponentially distributed) times if and only if the arm is good. Keller and Rady (2015, KR15) solve the polar opposite case, in which costly lump sums ("breakdowns") occur at random times if and only if the arm is bad. Keller and Rady (2010, KR10) consider the case in which breakthroughs need not be conclusive. In addition to the binary-state framework, these models share a focus on symmetric Markov perfect equilibria (MPE).

Throughout, players are Bayesian. Given that they observe all actions and outcomes, they share a common belief about the state, which serves as the state variable. They are also impatient and share a common discount rate.

BH and KR10 are the most ambitious models and offer no closed-form solutions for the equilibrium. Remarkably, however, they are able to prove uniqueness and tease out not only the equilibria's structure but also their dependence on parameters. While it is BH that first develops both the ideas (including the concepts of free-riding and encouragement effects) and the methods used throughout this literature, the most interesting insights can already be gleaned from the simple exponential bandits of KRC and KR15. What makes these two models tractable is that they can be viewed as deterministic: unless a breakdown or breakthrough ("news") occurs, the (conditional) posterior belief follows a known path. If news arrives, the game is effectively over: once it is commonly known that the state is good (or bad), informational externalities cease to matter, and each player knows the strictly dominant action to take.

Let us consider here a simplified version combining the good and bad news models. The state is ω ∈ {G, B}. Each player i = 1, …, I controls the variable u_t^i ∈ [0, 1], which is the fraction allocated to the risky arm at time t ≥ 0 (the complementary fraction being allocated to the safe arm). The horizon is infinite. This leads to a total realized payoff of ...
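To make the earlier "deterministic path" observation concrete, here is a sketch of the belief dynamics in the good-news case, assuming (as in KRC) that breakthroughs arrive at rate λ per unit of experimentation intensity when the arm is good, and never when it is bad; the notation p_t (common posterior that ω = G) and K_t = Σ_i u_t^i (total experimentation intensity) is introduced here for illustration. Conditional on ω = G, the probability of no breakthrough by time t is exp(−λ ∫_0^t K_s ds); conditional on ω = B, it is 1. Bayes' rule then gives, absent news,

    p_t = p_0 exp(−λ ∫_0^t K_s ds) / (p_0 exp(−λ ∫_0^t K_s ds) + 1 − p_0),

and differentiating yields the deterministic drift

    ṗ_t = −λ K_t p_t (1 − p_t).

A breakthrough jumps the belief to one and ends the learning problem; in the bad-news model of KR15, the mirror-image logic applies, and the belief drifts up (ṗ_t = +λ K_t p_t (1 − p_t)) so long as no breakdown occurs.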
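The following minimal simulation sketch illustrates these dynamics. It assumes the good-news specification above with illustrative parameter values, and it uses a simple myopic cutoff rule (experiment while the expected flow payoff of the risky arm exceeds the safe flow) rather than the equilibrium strategies, which the models above characterize; all names and numbers are hypothetical choices, not taken from the text.

```python
import random

def simulate(p0=0.6, lam=1.0, I=2, s=0.5, g=1.0, dt=0.01, T=20.0, seed=0):
    """Simulate the common belief in a good-news exponential bandit.

    Each of the I players allocates u = 1 to the risky arm while the
    myopic expected flow payoff p * g exceeds the safe flow s, else
    u = 0 (a heuristic cutoff, not the Markov perfect equilibrium).
    """
    rng = random.Random(seed)
    good = rng.random() < p0            # draw the state once from the prior
    p, t = p0, 0.0
    while t < T:
        u = 1.0 if p * g > s else 0.0   # symmetric myopic cutoff rule
        K = I * u                       # total experimentation intensity
        if good and rng.random() < lam * K * dt:
            return t, 1.0               # breakthrough: state revealed good
        # absent news, the posterior drifts down along the known path
        p -= lam * K * p * (1.0 - p) * dt
        t += dt
    return t, p

t, p = simulate()
print(f"stopped at t={t:.2f} with belief p={p:.3f}")
```

Running the sketch shows the two possible histories: either a breakthrough arrives and the belief jumps to one, or the belief glides deterministically down to the cutoff s/g, at which point all experimentation stops.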