We study a multi-objective multi-armed bandit problem in a dynamic environment. A decision-maker sequentially selects an arm from a given set. Each selected arm produces a reward vector in which every element follows a piecewise-stationary Bernoulli distribution. The agent aims to choose arms from the Pareto-optimal set in order to minimize its regret. We propose a Pareto generic upper confidence bound (UCB)-based algorithm with change detection to solve this problem. By developing the essential inequalities for multi-dimensional spaces, we establish that our proposal guarantees a regret bound on the order of γ_T log(T/γ_T) when the number of breakpoints γ_T is known. Without this assumption, the regret bound of our algorithm is γ_T log(T). Finally, we formulate an energy-efficient waveform design problem in an integrated communication and sensing system as a toy example. Numerical experiments on the toy example and on synthetic and real-world datasets demonstrate the efficiency of our policy compared to existing methods.
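To make the notion of a Pareto-optimal set of arms concrete, the following minimal sketch filters arms by Pareto dominance over their mean reward vectors. It is an illustration only, not the proposed algorithm: the function names and the example mean vectors are hypothetical, and the UCB indices and change detection described in the abstract are not modeled.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if u is at least as good as v in every objective
    and strictly better in at least one (higher reward is better)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_optimal_arms(means: List[Sequence[float]]) -> List[int]:
    """Return the indices of arms whose mean reward vector is not
    dominated by the mean reward vector of any other arm."""
    return [
        i
        for i, u in enumerate(means)
        if not any(dominates(v, u) for j, v in enumerate(means) if j != i)
    ]


# Hypothetical two-objective Bernoulli means for four arms.
means = [(0.8, 0.2), (0.5, 0.5), (0.2, 0.8), (0.4, 0.4)]
front = pareto_optimal_arms(means)
print(front)  # arm 3 is dominated by arm 1 and is excluded
```

In a piecewise-stationary environment the mean vectors, and therefore this set, can change at each breakpoint, which is why a change-detection step is needed alongside the confidence bounds.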
There is growing interest in applying distributed machine learning to edge computing, forming federated edge learning. Compared with conventional distributed machine learning in a datacenter, federated edge learning faces non-independent and identically distributed (non-i.i.d.) and heterogeneous data, and communication between edge workers, possibly across distant locations and over unstable wireless networks, is more costly than their local computation. In this work, we propose a distributed Newton-type algorithm (DONE) with a fast convergence rate for communication-efficient federated edge learning. First, with strongly convex and smooth loss functions, we show that DONE can produce the Newton direction approximately in a distributed manner by using the classical Richardson iteration on each edge worker. Second, we prove that DONE has linear-quadratic convergence and analyze its computation and communication complexities. Finally, experimental results with non-i.i.d. and heterogeneous data show that DONE attains the same performance as Newton's method. Notably, DONE requires considerably fewer communication iterations than the distributed gradient descent algorithm and outperforms DANE, a state-of-the-art method, in the case of non-quadratic loss functions.
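The classical Richardson iteration mentioned above approximates the Newton direction d, which solves H d = -g, without inverting the Hessian H. The sketch below shows the iteration on a single worker with a small dense Hessian. It is an illustration under stated assumptions, not the DONE algorithm itself: the function name, the example matrix, the step size `alpha`, and the iteration count are hypothetical, and the aggregation across edge workers is not shown.

```python
from typing import List


def richardson_newton_direction(
    H: List[List[float]], g: List[float], alpha: float, num_iters: int
) -> List[float]:
    """Approximate the solution d of H d = -g with the Richardson iteration
    d <- d - alpha * (H d + g), starting from d = 0.

    For a symmetric positive definite H, the iteration converges when
    0 < alpha < 2 / lambda_max(H).
    """
    n = len(g)
    d = [0.0] * n
    for _ in range(num_iters):
        # Residual of the linear system at the current iterate: H d + g.
        residual = [sum(H[i][j] * d[j] for j in range(n)) + g[i] for i in range(n)]
        d = [d[i] - alpha * residual[i] for i in range(n)]
    return d


# Hypothetical strongly convex quadratic: symmetric positive definite Hessian.
H = [[4.0, 1.0], [1.0, 3.0]]
g = [1.0, 2.0]

# lambda_max(H) is about 4.62, so alpha = 0.25 satisfies the step-size condition.
d = richardson_newton_direction(H, g, alpha=0.25, num_iters=50)
print(d)  # close to the exact Newton direction [-1/11, -7/11]
```

Each iteration needs only a Hessian-vector product, which is the kind of local computation that can be traded for fewer communication rounds.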