Two parties observe independent copies of a d-dimensional vector and a scalar. They seek to test whether their data is correlated; namely, they seek to test whether the norm $\|\rho\|_2$ of the correlation vector $\rho$ between their observations exceeds $\tau$ or equals 0. To that end, they communicate interactively and declare the output of the test. We show that roughly order $d/\tau^2$ bits of communication are sufficient and necessary for resolving the distributed correlation testing problem above. Furthermore, we establish a lower bound of roughly $d^2/\tau^2$ bits on the communication needed for distributed correlation estimation, rendering the estimate-and-test approach suboptimal in the communication required for distributed correlation testing. For the one-dimensional case with one-way communication, our bounds are tight even in the constant and provide a precise dependence of the communication complexity on the probabilities of error of the two types.
I. INTRODUCTION

Parties $P_1$ and $P_2$ observe jointly Gaussian random variables $X^n$ and $Y^n$, respectively, comprising independent and identically distributed (i.i.d.) samples. They communicate with each other to determine whether their observations are correlated, i.e., to test whether $\|\rho\|_2 \ge \tau$ or $\|\rho\|_2 = 0$. For a given probability of error requirement and an arbitrarily large n, what is the minimum communication needed between the parties? Note that we have chosen the distribution to be Gaussian just for convenience. Since we allow the number of samples to be arbitrarily large, even when X and Y are not Gaussian, we can replace subsets of samples with their sample means and use the central limit theorem (Berry-Esseen approximation) to carry out calculations similar to those presented in this paper. Indeed, all the results of this paper extend to the case when $X_t$ and $Y_t$ are distributed uniformly over $\{-1, 1\}^d$ and $\{-1, 1\}$, respectively, and