Bounding the price of anarchy, which quantifies the damage to social welfare due to selfish behavior of the participants, has been an important area of research in algorithmic game theory. In this paper, we study this phenomenon in the context of a game modeling queuing systems: routers compete for servers, where packets that do not get service will be resent at future rounds, resulting in a system where the number of packets at each round depends on the success of the routers in the previous rounds. We model this as an (infinitely) repeated game, where the system holds a state (number of packets held by each queue) that arises from the results of the previous rounds. We assume that routers satisfy the no-regret condition, e.g., they use learning strategies to identify the server where their packets get the best service. Classical work on repeated games makes the strong assumption that the subsequent rounds of the repeated games are independent (beyond the influence on learning from past history). The carryover effect caused by packets remaining in this system makes learning in our context result in a highly dependent random process. We analyze this random process and find that if the capacity of the servers is high enough to allow a centralized and knowledgeable scheduler to get all packets served even when service rates are halved, and queues use no-regret learning algorithms, then the expected number of packets in the queues will remain bounded throughout time, assuming older packets have priority. This paper is the first to study the effect of selfish learning in a queuing system, where the learners compete for resources, but rounds are not all independent: the number of packets to be routed at each round depends on the success of the routers in the previous rounds.
CCS Concepts: • Mathematics of computing → Queueing theory; • Theory of computation → Online learning theory; Multi-agent learning; Regret bounds; Quality of equilibria; Convergence and learning in games.
We consider the problem of selfish agents in discrete-time queuing systems, where competitive queues try to get their packets served. In this model, a queue gets to send a packet each step to one of the servers, which will attempt to serve the oldest arriving packet, and unprocessed packets are returned to each queue. We model this as a repeated game where queues compete for the capacity of the servers, but where the state of the game evolves as the length of each queue varies, resulting in a highly dependent random process. In classical work for learning in repeated games, the learners evaluate the outcome of their strategy in each step; in our context, this means that queues estimate their success probability at each server. Earlier work by the authors [in EC'20] shows that with no-regret learners, the system needs twice the capacity that would be required in the coordinated setting to ensure queue lengths remain stable despite the selfish behavior of the queues. In this paper, we demonstrate that this myopic way of evaluating outcomes is suboptimal: if more patient queues choose strategies that selfishly maximize their long-run success rate, stability can be ensured with just e/(e−1) ≈ 1.58 times extra capacity, strictly better than what is possible assuming the no-regret property. As these systems induce highly dependent random processes, our analysis draws heavily on techniques from the theory of stochastic processes to establish various game-theoretic properties of these systems. Though these systems are random even under fixed stationary policies by the queues, we show using careful probabilistic arguments that surprisingly, under such fixed policies, these systems have essentially deterministic and explicit asymptotic behavior.
We show that the growth rate of a set can be written as the ratio of a submodular and modular function, and use the resulting explicit description to show that the subsets of queues with largest growth rate are closed under union and non-disjoint intersections, which we use in turn to prove the claimed sharp bicriteria result for the equilibria of the resulting system. Our equilibrium analysis relies on a novel deformation argument towards a more analyzable solution that is quite different from classical price of anarchy bounds. While the intermediate points in this deformation will not be Nash, the structure will ensure the relevant constraints and incentives similarly hold to establish monotonicity along this continuous path.
CCS Concepts: • Theory of computation → Quality of equilibria; Algorithmic game theory.
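The queuing dynamics described in the abstract above (each queue sends one packet per step, a server attempts the oldest packet it receives, and unserved packets return to their queues) can be illustrated with a minimal Python sketch of a single round. This is only an illustration under stated assumptions, not the authors' formal model: the function name `simulate_round`, the Bernoulli arrival and service rates, the ordering of arrivals before sends within a step, and the tie-breaking by queue index are all hypothetical choices made here.

```python
import random
from collections import deque


def simulate_round(queues, choices, service_rates, arrival_rates, t, rng):
    """Advance the system by one time step (illustrative sketch only).

    queues:        list of deques holding packet timestamps, oldest at the left.
    choices:       choices[i] is the server index queue i sends to this step.
    service_rates: assumed per-server success probabilities.
    arrival_rates: assumed per-queue Bernoulli arrival probabilities.
    """
    # Assumption: new packets arrive first, stamped with the current step.
    for i, lam in enumerate(arrival_rates):
        if rng.random() < lam:
            queues[i].append(t)

    # Each nonempty queue offers its oldest packet to its chosen server.
    offers = {}
    for i, q in enumerate(queues):
        if q:
            offers.setdefault(choices[i], []).append((q[0], i))

    # Each server attempts only the oldest packet it received.
    # Assumption: ties in age are broken by the lower queue index.
    for j, sent in offers.items():
        _, winner = min(sent)
        if rng.random() < service_rates[j]:
            queues[winner].popleft()
        # Packets that were not served simply stay in their queues.
    return queues
```

Repeating this step with `choices` drawn from a learning rule or a fixed stationary policy gives the dependent random process the abstract refers to: the queue lengths at step `t` carry over into step `t + 1`.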
We study the aggregate welfare and individual regret guarantees of dynamic pacing algorithms in the context of repeated auctions with budgets. Such algorithms are commonly used as bidding agents in Internet advertising platforms, adaptively learning to shade bids in order to match a specified spend target. We show that when agents simultaneously apply a natural form of gradient-based pacing, the liquid welfare obtained over the course of the learning dynamics is at least half the optimal expected liquid welfare obtainable by any allocation rule. This result is robust to the correlation structure between agent valuations and holds for any core auction, a broad class of auctions that includes first-price, second-price, and generalized second-price auctions. Moreover, these results hold without requiring convergence of the dynamics, allowing us to circumvent known complexity-theoretic obstacles of finding equilibria. For individual guarantees, we further show such pacing algorithms enjoy dynamic regret bounds for individual value maximization, with respect to the sequence of budget-pacing bids, for any auction satisfying a monotone bang-for-buck property. This generalizes known regret guarantees for bidders facing stochastic bidding environments in two ways: it applies to a wider class of auctions than previously known, and it allows the environment to change over time.
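As a rough illustration of what "gradient-based pacing" toward a spend target can look like, the sketch below uses a commonly studied form: the agent bids its value shaded by a multiplier and then nudges the multiplier by the gap between its per-round target and its realized spend. The abstract does not spell out the update rule, so the specific shading `value / (1 + mu)`, the projection onto `[0, mu_max]`, and all names and parameters here are assumptions made for this example.

```python
def paced_bid(mu, value):
    """Shade the bid by the current pacing multiplier mu >= 0 (assumed form)."""
    return value / (1.0 + mu)


def update_multiplier(mu, spend, target_per_round, step_size, mu_max):
    """One gradient-style step on the pacing multiplier (illustrative sketch).

    Overspending relative to the target raises mu (more shading);
    underspending lowers it. The result is clipped to [0, mu_max].
    """
    mu_next = mu - step_size * (target_per_round - spend)
    return min(max(mu_next, 0.0), mu_max)
```

For instance, an agent with multiplier 1.0 and value 10.0 would bid 5.0 under this shading rule; if it then spends more than its per-round target, the next multiplier is larger and later bids are shaded further.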