We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as t → ∞) it pulls only the arm with the highest expected reward. While this goal is provably impossible for an isolated individual due to bounded memory, we show that, in social groups, it can be achieved easily with the aid of social persuasion (i.e., communication) as long as the communication networks/graphs satisfy some mild conditions. In this work, we model and analyze a type of learning dynamics that is well observed in social groups. Specifically, under the learning dynamics of interest, an individual sequentially decides which arm to pull next based not only on its private reward feedback but also on the suggestion provided by a randomly chosen neighbor. To deal with the interplay between the randomness in the rewards and in the social interaction, we employ the mean-field approximation method. Because the individuals in the network may not be exchangeable when the communication network is not a clique, we go beyond the classic mean-field techniques and apply a refined version of mean-field approximation:
• Using coupling, we show that, if the communication graph is connected and is either regular or has a doubly stochastic degree-weighted adjacency matrix, then with probability → 1 as the social group size N → ∞, every individual in the social group learns the best option.
• If the minimum degree of the graph diverges as N → ∞, then over an arbitrary but given finite time horizon, the sample paths describing the opinion evolutions of the individuals are asymptotically independent. In addition, the proportions of the population with different opinions converge to the unique solution of a system of ODEs. Interestingly, the obtained system of ODEs is invariant to the structure of the communication graph. In the solution of the obtained ODEs, the proportion of the population holding the correct opinion converges to 1 exponentially fast in time.
Notably, our results hold even if the communication graphs are highly sparse.
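To make the flavor of such dynamics concrete, here is a minimal toy simulation. It is not the model analyzed in the abstract: the imitation rule (try the arm suggested by a random neighbor and adopt it only after a successful pull), the complete graph, the Bernoulli reward means, and the name `simulate` are all assumptions chosen for illustration. Each agent stores only its current arm, which stands in for bounded memory.

```python
import random

def simulate(num_agents=200, means=(0.9, 0.3, 0.2), steps=40000, seed=0):
    """Toy imitation dynamics on a complete graph (illustrative assumption only)."""
    rng = random.Random(seed)
    # Bounded memory: each agent remembers nothing but its current arm.
    arms = [i % len(means) for i in range(num_agents)]
    for _ in range(steps):
        agent = rng.randrange(num_agents)
        # Complete graph with self-loops: any agent may be asked for a suggestion.
        neighbor = rng.randrange(num_agents)
        suggested = arms[neighbor]
        # Pull the suggested arm; the reward is Bernoulli with that arm's mean.
        if rng.random() < means[suggested]:
            arms[agent] = suggested  # adopt the suggestion only after a success
    best = max(range(len(means)), key=lambda k: means[k])
    return sum(a == best for a in arms) / num_agents

fraction_best = simulate()
print(f"fraction holding the best arm: {fraction_best:.3f}")
```

Under these assumed parameters the population concentrates on the arm with the highest mean, which is the qualitative behavior the abstract describes; the sketch says nothing about the graph conditions or convergence rates proved in the paper.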
We examine a canonical scenario where several wireless data sources generate sporadic delay-sensitive messages that need to be transmitted to a common access point. The access point operates in a time-slotted fashion, and can instruct the various sources in each slot with what probability to transmit a message, if they have any. When several sources transmit simultaneously, the access point can detect a collision, but is unable to infer the identities of the sources involved. While the access point can use the channel activity observations to obtain estimates of the queue states at the various sources, it does not have any explicit queue length information otherwise. We explore the achievable delay performance in a regime where the number of sources n grows large while the relative load remains fixed. We establish that, under any medium access algorithm without queue state information, the average delay must be at least of the order of n slots when the load exceeds some threshold λ* < 1. This demonstrates that bounded delay can only be achieved if a positive fraction of the system capacity is sacrificed. Furthermore, we introduce a scalable Two-Phase algorithm which achieves a delay upper bounded uniformly in n when the load is below e^{-1}, and a delay of the order of n slots when the load is between e^{-1} and 1. Additionally, this algorithm provides robustness against correlated source activity.
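The constant e^{-1} also appears in the classical analysis of slotted random access, which may help in reading the thresholds above. The sketch below is not the Two-Phase algorithm; it assumes all n sources are backlogged and transmit independently with the same probability p, and the function name `success_probability` is hypothetical. A slot is successful when exactly one source transmits.

```python
import math

def success_probability(n, p):
    """Probability that exactly one of n backlogged sources transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)

# With p = 1/n (the maximizing choice), the per-slot success rate tends to 1/e.
for n in (2, 10, 100, 1000):
    print(n, round(success_probability(n, 1 / n), 4))
print("limit 1/e =", round(math.exp(-1), 4))
```

Under these assumptions the fraction of useful slots approaches e^{-1} as n grows, so a scheme that relies purely on randomized transmission without knowing which sources are backlogged cannot carry more than that load.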
We consider a large distributed service system consisting of n homogeneous servers with infinite-capacity FIFO queues. Jobs arrive as a Poisson process of rate λn/k_n (for some positive constant λ and integer k_n). Each incoming job consists of k_n identical tasks that can be executed in parallel, and that can be encoded into at least k_n "replicas" of the same size (by introducing redundancy) so that the job is considered to be completed when any k_n replicas associated with it finish their service. Moreover, we assume that servers can experience random slowdowns in their processing rate so that the service time of a replica is the product of its size and a random slowdown. First, we assume that the server slowdowns are shifted exponential and independent of the replica sizes. In this setting we show that the delay of a typical job is asymptotically minimized (as n → ∞) when the number of replicas per task is a constant that only depends on the arrival rate λ, and on the expected slowdown of servers. Second, we introduce a new model for the server slowdowns in which larger tasks experience less variable slowdowns than smaller tasks. In this setting we show that, under the class of policies where all replicas start their service at the same time, the delay of a typical job is asymptotically minimized (as n → ∞) when the number of replicas per task is made to depend on the actual size of the tasks being replicated, with smaller tasks being replicated more than larger tasks.
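A small calculation shows why replication helps under shifted exponential slowdowns. The sketch below is only an illustration: it assumes r replicas of one task start simultaneously on idle servers, with i.i.d. slowdowns equal to a constant shift plus an exponential with a given rate. The function names and parameter values are hypothetical, and queueing and the extra load created by replicas are ignored.

```python
import random

def expected_min_slowdown(shift, rate, replicas):
    """Mean of the minimum of `replicas` i.i.d. shifted-exponential slowdowns."""
    # The minimum of r Exp(rate) variables is Exp(r * rate); the shift is unchanged.
    return shift + 1.0 / (replicas * rate)

def monte_carlo_min(shift, rate, replicas, samples=100_000, seed=0):
    """Monte Carlo estimate of the same mean, as a sanity check on the formula."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += min(shift + rng.expovariate(rate) for _ in range(replicas))
    return total / samples

for r in (1, 2, 4, 8):
    print(r, expected_min_slowdown(1.0, 2.0, r), round(monte_carlo_min(1.0, 2.0, r), 3))
```

The random part of the slowdown shrinks like 1/r while the shift does not, so the gain from each additional replica diminishes even though every replica adds load to the system. That tension is consistent with the abstract's finding that a constant number of replicas per task is optimal in the first setting.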