Extra Samples can Reduce the Communication for Independence Testing

Sahasranand, K. R.; Tyagi, Himanshu

doi:10.1109/isit.2018.8437584

Cited by 14 publications

(21 citation statements)

References 20 publications

“…Lower bounds on the accuracy of learning procedures with limited memory and communication have been explored in several settings, including mean estimation, sparse regression, learning parities, detecting correlations, and independence testing (Shamir, 2014; Duchi et al, 2014; Garg et al, 2014; Steinhardt and Duchi, 2015; Braverman et al, 2016; Steinhardt et al, 2016; Acharya et al, 2018a,b; Raz, 2018; Han et al, 2018; Sahasranand and Tyagi, 2018; Dagan and Shamir, 2018; Dagan et al, 2019). In particular, the results of Steinhardt and Duchi (2015) and Braverman et al (2016) imply that optimal algorithms for distributed sparse regression need communication much larger than the sparsity level under various assumptions on the number of machines and communication protocol.…”

Section: Related Work

mentioning

confidence: 99%

Distributed Learning with Sublinear Communication

Acharya, De, Foster, et al. 2019

Preprint

In distributed statistical learning, N samples are split across m machines and a learner wishes to use minimal communication to learn as well as if the examples were on a single machine. This model has received substantial interest in machine learning due to its scalability and potential for parallel speedup. However, in high-dimensional settings, where the number of examples is smaller than the number of features ("dimension"), the speedup afforded by distributed learning may be overshadowed by the cost of communicating a single example. This paper investigates the following question: When is it possible to learn a d-dimensional model in the distributed setting with total communication sublinear in d? Starting with a negative result, we observe that for learning ℓ₁-bounded or sparse linear models, no algorithm can obtain optimal error until communication is linear in dimension. Our main result is that, by slightly relaxing the standard boundedness assumptions for linear models, we can obtain distributed algorithms that enjoy optimal error with communication logarithmic in dimension. This result is based on a family of algorithms that combine mirror descent with randomized sparsification/quantization of iterates, and extends to the general stochastic convex optimization model.
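As a rough illustration of what "randomized sparsification" of an iterate can look like, the sketch below draws k coordinates with probability proportional to their magnitude and reweights them so that the result is an unbiased estimate of the original vector. This is a generic sampling construction written for illustration, not the algorithm from the paper; the function name `sparsify` and its parameters are hypothetical.

```python
import random


def sparsify(vector, k, rng):
    """Return an unbiased random estimate of `vector` with at most k nonzeros.

    Assumption: coordinates are drawn with replacement, with probability
    proportional to |v_i|, and each draw contributes ||v||_1 * sign(v_i) / k.
    """
    l1_norm = sum(abs(v) for v in vector)
    estimate = [0.0] * len(vector)
    if l1_norm == 0:
        return estimate  # the zero vector is already sparse
    weights = [abs(v) for v in vector]
    for i in rng.choices(range(len(vector)), weights=weights, k=k):
        sign = 1.0 if vector[i] > 0 else -1.0
        estimate[i] += l1_norm * sign / k
    return estimate


if __name__ == "__main__":
    rng = random.Random(0)
    print(sparsify([0.5, -0.25, 0.0, 0.25], k=3, rng=rng))
```

Under these assumptions, a machine would only need to send the k sampled indices (about log2(d) bits each) plus the norm, rather than all d coordinates.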

“…The problem of hypothesis testing with communication constraints was introduced by Berger in [1] and was addressed by several authors where the vast majority of works deals with achievability schemes, e.g., [2], [3], [4], [5], [6], [7]. Several extensions were proposed e.g.…”

Section: Introduction

mentioning

confidence: 99%

Error Exponents in Distributed Hypothesis Testing of Correlations

Hadar, Liu, Polyanskiy, et al. 2019

2019 IEEE International Symposium on Information Theory (ISIT)

We study a distributed hypothesis testing problem where two parties observe i.i.d. samples from two ρ-correlated standard normal random variables X and Y. The party that observes the X-samples can communicate R bits per sample to the second party, that observes the Y-samples, in order to test between two correlation values. We investigate the best possible type-II error subject to a fixed type-I error, and derive an upper (impossibility) bound on the associated type-II error exponent. Our techniques include representing the conditional Y-samples as a trajectory of the Ornstein-Uhlenbeck process, and bounding the associated KL divergence using the subadditivity of the Wasserstein distance and the Gaussian Talagrand inequality.

“…This, too, will result in a scheme that requires O(1/τ²) bits of communication. However, we noted in [1], where we study the communication complexity of one-dimensional independence testing, that our proposed scheme uses communication that is a constant factor lower than this baseline scheme.…”

mentioning

confidence: 98%

“…We invoke this result to show that roughly d/τ² bits of communication are needed even when interactive communication is allowed, rendering our proposed one-way communication protocol optimal among interactive protocols. We note that this bound is slightly weaker for one-way communication than the … (Footnote 1: As will be seen below, our proposed test uses a "median trick" to convert the one-dimensional test to a d-dimensional test. In our simulation, even the probabilities of correctness for the one-dimensional test are boosted to the desired levels by repeating the tests and using a similar "median trick".)…”

mentioning

confidence: 98%
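The "median trick" mentioned in the statement above is a standard way to boost the success probability of a randomized test: repeat it on fresh data and report the median of the binary outcomes. The sketch below shows only this generic boosting step, not the specific tests from the cited paper; `boosted_test` and `base_test` are hypothetical names, and the base test is assumed to return 0 or 1 and to be correct with probability above 1/2.

```python
import random
from statistics import median


def boosted_test(base_test, repetitions):
    """Run `base_test` `repetitions` times and return the median outcome.

    Assumption: `base_test()` returns 0 or 1 and each call uses fresh,
    independent data. An odd number of repetitions avoids ties.
    """
    if repetitions % 2 == 0:
        raise ValueError("use an odd number of repetitions to avoid ties")
    outcomes = [base_test() for _ in range(repetitions)]
    return int(median(outcomes))


if __name__ == "__main__":
    rng = random.Random(1)

    def noisy_test():
        # Hypothetical base test whose correct answer is 1, right 70% of the time.
        return 1 if rng.random() < 0.7 else 0

    print(boosted_test(noisy_test, 201))
```

For binary outcomes the median coincides with a majority vote, so the error probability decays exponentially in the number of repetitions.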

“…The error exponent for the conditional independence testing problem is studied in [12], where both upper and lower bounds for it are obtained. Recently, and subsequent to the publication of the initial version [1] of this paper, related problems were considered in various works. In [13], an improved upper bound on the Stein exponent for testing between two known positive Gaussian correlations is provided.…”

mentioning

confidence: 99%

Communication Complexity of Distributed High Dimensional Correlation Testing

Sahasranand, Tyagi 2020

Preprint

Self Cite

Two parties observe independent copies of a d-dimensional vector and a scalar. They seek to test if their data is correlated or not, namely they seek to test if the norm ‖ρ‖₂ of the correlation vector ρ between their observations exceeds τ or is 0. To that end, they communicate interactively and declare the output of the test. We show that roughly d/τ² bits of communication are sufficient and necessary for resolving the distributed correlation testing problem above. Furthermore, we establish a lower bound of roughly d²/τ² bits for communication needed for distributed correlation estimation, rendering the estimate-and-test approach suboptimal in communication required for distributed correlation testing. For the one-dimensional case with one-way communication, our bounds are tight even in the constant and provide a precise dependence of communication complexity on the probabilities of error of two types.

I. INTRODUCTION

Parties P₁ and P₂ observe jointly Gaussian random variables Xⁿ and Yⁿ, respectively, comprising independent and identically distributed (i.i.d.) samples. They communicate with each other to determine if their observations are correlated, i.e., to test if ‖ρ‖₂ ≥ τ or ‖ρ‖₂ = 0. For a given probability of error requirement and an arbitrarily large n, what is the minimum communication needed between the parties? Note that we have chosen the distribution to be Gaussian just for convenience. Since we allow the number of samples to be arbitrarily large, even when X and Y are not Gaussian, we can replace subsets of samples with their sample means and use the central limit theorem (Berry-Esseen approximation) to do similar calculations as those presented in this paper. Indeed, all the results of this paper extend to the case when Xₜ and Yₜ are distributed uniformly over {−1, 1}ᵈ and {−1, 1}, respectively, and …
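To make the one-dimensional setup concrete, here is a simple one-way baseline in which P₁ sends one bit per sample, the sign of its observation, and P₂ compares the fraction of sign agreements with a threshold. It relies on the standard fact that for ρ-correlated standard Gaussians the signs agree with probability 1/2 + arcsin(ρ)/π. This is only an illustrative sketch and not the scheme proposed in the paper; `sample_pair`, `sign_test`, and the midpoint threshold are assumptions made for the example.

```python
import math
import random


def sample_pair(rho, rng):
    """Draw one pair (x, y) of standard Gaussians with correlation rho."""
    x = rng.gauss(0.0, 1.0)
    z = rng.gauss(0.0, 1.0)
    return x, rho * x + math.sqrt(1.0 - rho * rho) * z


def sign_test(xs, ys, tau):
    """Return True if the data look correlated (|rho| >= tau), else False.

    Assumption: P1 sends sign(x) for each sample (one bit each); P2 counts
    agreements with sign(y) and thresholds halfway between the two hypotheses.
    """
    bits = [x >= 0 for x in xs]  # the message from P1 to P2
    agree = sum(b == (y >= 0) for b, y in zip(bits, ys)) / len(ys)
    margin = math.asin(tau) / (2 * math.pi)
    return abs(agree - 0.5) > margin


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = [sample_pair(0.3, rng) for _ in range(20000)]
    xs, ys = zip(*pairs)
    print(sign_test(xs, ys, tau=0.3))
```

In this sketch the communication equals the number of samples used, so it does not exploit extra samples to reduce communication.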
