We study an interesting variant of the stochastic multi-armed bandit problem, which we call the Fair-MAB problem, where, in addition to the objective of maximizing the sum of expected rewards, the algorithm also needs to ensure that at any time, each arm is pulled at least a pre-specified fraction of times. We investigate the interplay between learning and fairness in terms of a pre-specified vector denoting the fractions of guaranteed pulls. We define a fairness-aware regret, which we call r-Regret, that takes into account the above fairness constraints and extends the conventional notion of regret in a natural way. Our primary contribution is to obtain a complete characterization of a class of Fair-MAB algorithms via two parameters: the unfairness tolerance and the learning algorithm used as a black-box. For this class of algorithms, we provide a fairness guarantee that holds uniformly over time, irrespective of the choice of the learning algorithm. Further, when the learning algorithm is UCB1, we show that our algorithm achieves constant r-Regret for a large enough time horizon. Finally, we analyze the cost of fairness in terms of the conventional notion of regret. We conclude by experimentally validating our theoretical results.
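To make the two parameters concrete (the unfairness tolerance and the black-box learner), the following Python sketch shows one way a fairness wrapper around UCB1 could look. It is an illustration under stated assumptions, not the authors' algorithm. The rule "pull the arm with the largest deficit r_i(t-1) - N_i(t-1) whenever that deficit exceeds alpha", the Bernoulli rewards, and the names `fair_ucb`, `ucb1_choice`, and `worst_deficit` are all hypothetical choices made for this example.

```python
import math
import random


def ucb1_choice(counts, sums, t):
    """Standard UCB1 index rule: try every arm once, then maximise mean + bonus."""
    for i, n in enumerate(counts):
        if n == 0:
            return i
    return max(
        range(len(counts)),
        key=lambda i: sums[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i]),
    )


def fair_ucb(means, r, alpha, horizon, seed=0):
    """Simulate a fairness-wrapped UCB1 on Bernoulli arms (assumed reward model).

    means   : true success probability of each arm (simulation only)
    r       : guaranteed fraction of pulls per arm (assumed to sum to less than 1)
    alpha   : unfairness tolerance (assumed non-negative)
    Returns the pull counts and the largest deficit r[i]*t - N_i(t) observed.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    worst_deficit = 0.0
    for t in range(1, horizon + 1):
        # How far each arm lags behind its guaranteed share before this round.
        deficits = [r[i] * (t - 1) - counts[i] for i in range(k)]
        lagging = max(range(k), key=lambda i: deficits[i])
        if deficits[lagging] > alpha:
            arm = lagging                      # fairness pull
        else:
            arm = ucb1_choice(counts, sums, t)  # learning pull
        reward = 1.0 if rng.random() < means[arm] else 0.0
        # Assumption: the learner sees the reward of every pull, fair or not.
        counts[arm] += 1
        sums[arm] += reward
        worst_deficit = max(
            worst_deficit, max(r[i] * t - counts[i] for i in range(k))
        )
    return counts, worst_deficit


if __name__ == "__main__":
    counts, worst = fair_ucb([0.9, 0.5, 0.1], [0.25, 0.25, 0.25], 2.0, 2000)
    print(counts, worst)
```

Here `worst_deficit` tracks the largest value of r_i·t − N_i(t) seen over the run, which is the quantity a time-uniform fairness guarantee is concerned with. Swapping `ucb1_choice` for another selection rule leaves the fairness check untouched.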
We show an exponential separation between two well-studied models of algebraic computation, namely, read-once oblivious algebraic branching programs (ROABPs) and multilinear depth-three circuits. In particular, we show the following:
(1) There exists an explicit $n$-variate polynomial computable by linear sized multilinear depth-three circuits (with only two product gates) such that every ROABP computing it requires $2^{\Omega(n)}$ size.
(2) Any multilinear depth-three circuit computing $\mathrm{IMM}_{n,d}$ (the iterated matrix multiplication polynomial formed by multiplying $d$, $n \times n$ symbolic matrices) has $n^{\Omega(d)}$ size. $\mathrm{IMM}_{n,d}$ can be easily computed by a $\mathrm{poly}(n,d)$ sized ROABP.
(3) Further, the proof of (2) yields an exponential separation between multilinear depth-four and multilinear depth-three circuits: There is an explicit $n$-variate, degree $d$ polynomial computable by a $\mathrm{poly}(n)$ sized multilinear depth-four circuit such that any multilinear depth-three circuit computing it has size $n^{\Omega(d)}$. This improves upon the quasi-polynomial separation of Reference [36] between these two models.
The hard polynomial in (1) is constructed using a novel application of expander graphs in conjunction with the evaluation dimension measure [15, 33, 34, 36], while (2) is proved via a new adaptation of the dimension of the partial derivatives measure of Reference [32]. Our lower bounds hold over any field.
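The remark that the iterated matrix multiplication polynomial has a small layered computation can be illustrated numerically. The Python sketch below assumes the polynomial is the (1,1) entry of the product of the d matrices, a common convention that the abstract does not spell out. It evaluates that entry at a point in two ways: a left-to-right row-vector product costing roughly d·n² multiplications, and a brute-force sum over all n^(d-1) monomials. The function names `imm_eval` and `imm_bruteforce` are hypothetical, and the code only evaluates the polynomial; it does not construct an ROABP or a circuit.

```python
from itertools import product


def imm_eval(mats):
    """(1,1) entry of mats[0] @ mats[1] @ ... via repeated row-vector products."""
    n = len(mats[0])
    row = list(mats[0][0])  # first row of the first matrix
    for M in mats[1:]:
        row = [sum(row[i] * M[i][j] for i in range(n)) for j in range(n)]
    return row[0]


def imm_bruteforce(mats):
    """Same entry as an explicit sum over all index paths 0 -> ... -> 0."""
    n, d = len(mats[0]), len(mats)
    total = 0
    for middle in product(range(n), repeat=d - 1):
        path = (0,) + middle + (0,)
        term = 1
        for k, M in enumerate(mats):
            term *= M[path[k]][path[k + 1]]
        total += term
    return total


if __name__ == "__main__":
    mats = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    print(imm_eval(mats), imm_bruteforce(mats))
```

The two functions agree on every input, but the number of terms in the brute-force sum grows exponentially in d while the layered product grows only linearly in d.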
An algebraic branching program (ABP) $A$ can be modelled as a product expression $X_1 \cdot X_2 \cdots X_d$, where $X_1$ and $X_d$ are $1 \times w$ and $w \times 1$ matrices respectively, and every other $X_k$ is a $w \times w$ matrix; the entries of these matrices are linear forms in $m$ variables over a field $\mathbb{F}$ (which we assume to be either $\mathbb{Q}$ or a field of characteristic $\mathrm{poly}(m)$). The polynomial computed by $A$ is the entry of the $1 \times 1$ matrix obtained from the product $\prod_{k=1}^{d} X_k$. We say $A$ is a full rank ABP if the $w^2(d-2) + 2w$ linear forms occurring in the matrices $X_1, X_2, \ldots, X_d$ are $\mathbb{F}$-linearly independent. Our main result is a randomized reconstruction algorithm for full rank ABPs: Given blackbox access to an $m$-variate polynomial $f$ of degree at most $m$, the algorithm outputs a full rank ABP computing $f$ if such an ABP exists, or outputs 'no full rank ABP exists' (with high probability). The running time of the algorithm is polynomial in $m$ and $\beta$, where $\beta$ is the bit length of the coefficients of $f$. The algorithm works even if $X_k$ is a $w_{k-1} \times w_k$ matrix (with $w_0 = w_d = 1$), and $\mathbf{w} = (w_1, \ldots, w_{d-1})$ is unknown.
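The definitions in this abstract can be made concrete with a small Python sketch. It assumes the field is the rationals and represents each linear form by its list of m coefficients. It shows how an ABP given as a list of matrices of linear forms is evaluated at a point, and how the full rank condition can be checked by exact Gaussian elimination on the stacked coefficient vectors. The names `evaluate_abp`, `rank_over_q`, and `is_full_rank` are hypothetical, and this is not the reconstruction algorithm, which works from blackbox access only.

```python
from fractions import Fraction


def evaluate_abp(layers, point):
    """layers[k][i][j] is the coefficient list of entry (i, j) of X_{k+1}."""
    def value(form):
        return sum(c * x for c, x in zip(form, point))

    row = [value(f) for f in layers[0][0]]  # X_1 is a 1 x w row
    for X in layers[1:]:
        cols = len(X[0])
        row = [sum(row[i] * value(X[i][j]) for i in range(len(X)))
               for j in range(cols)]
    return row[0]  # the single entry of the final 1 x 1 product


def rank_over_q(vectors):
    """Exact rank over the rationals by row reduction with Fractions."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def is_full_rank(layers):
    """True when all linear forms in all matrices are linearly independent."""
    forms = [f for X in layers for matrix_row in X for f in matrix_row]
    return rank_over_q(forms) == len(forms)


def unit(i, m):
    """Coefficient vector of the single variable x_i among m variables."""
    return [1 if j == i else 0 for j in range(m)]


if __name__ == "__main__":
    # w = 2, d = 3: w^2 (d - 2) + 2w = 8 forms, each a distinct variable.
    abp = [
        [[unit(0, 8), unit(1, 8)]],
        [[unit(2, 8), unit(3, 8)], [unit(4, 8), unit(5, 8)]],
        [[unit(6, 8)], [unit(7, 8)]],
    ]
    print(evaluate_abp(abp, [1] * 8), is_full_rank(abp))
```

In the example, every entry is its own variable, so the eight forms are independent and the ABP is full rank; reusing a variable in two entries would make the check fail.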