We consider the problem of recovering a signal $x^* \in \mathbb{R}^n$ from magnitude-only measurements $y_i = |\langle a_i, x^* \rangle|$ for $i \in \{1, 2, \ldots, m\}$. This is a stylized version of the classical phase retrieval problem, and is a fundamental challenge in nano- and bio-imaging systems, astronomical imaging, and speech processing. It is well known that the above problem is ill-posed, and therefore some additional assumptions on the signal and/or the measurements are necessary.

In this paper, we consider the case where the underlying signal $x^*$ is $s$-sparse. For this case, we develop a novel recovery algorithm that we call Compressive Phase Retrieval with Alternating Minimization, or CoPRAM. Our algorithm is simple and can be obtained via a natural combination of the classical alternating minimization approach for phase retrieval with the CoSaMP algorithm for sparse recovery. Despite its simplicity, we prove that our algorithm achieves a sample complexity of $O(s^2 \log n)$ with Gaussian measurements $a_i$, which matches the best known existing results; moreover, it also demonstrates linear convergence in theory and practice. An appealing feature of our algorithm is that it requires no extra tuning parameters other than the signal sparsity level $s$. Moreover, we show that our algorithm is robust to noise.

The quadratic dependence of the sample complexity on the sparsity level is sub-optimal, and we demonstrate how to alleviate this via additional assumptions beyond sparsity. First, we study the (practically) relevant case where the sorted coefficients of the underlying sparse signal exhibit a power-law decay. In this scenario, we show that the CoPRAM algorithm achieves a sample complexity of $O(s \log n)$, which is close to the information-theoretic limit.

We then consider the case where the underlying signal $x^*$ arises from structured sparsity models. We specifically examine the case of block-sparse signals with uniform block size $b$ and block sparsity $k = s/b$.
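The basic CoPRAM loop described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the iteration counts, the simplified spectral initialization, and the textbook CoSaMP inner solver are our own choices.

```python
import numpy as np

def cosamp(A, b, s, iters=10):
    """Textbook CoSaMP: recover an s-sparse x from linear measurements b ~ A x."""
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        proxy = A.T @ (b - A @ x)                      # correlate residual with columns
        omega = np.argsort(-np.abs(proxy))[:2 * s]     # 2s largest correlations
        T = np.union1d(omega, np.flatnonzero(x))       # merge with current support
        z = np.zeros(n)
        z[T] = np.linalg.lstsq(A[:, T], b, rcond=None)[0]  # least squares on T
        keep = np.argsort(-np.abs(z))[:s]              # prune back to s terms
        x = np.zeros(n)
        x[keep] = z[keep]
    return x

def copram(A, y, s, iters=50):
    """Alternate between a measurement-sign estimate and a sparse linear solve."""
    m, n = A.shape
    # crude spectral initialization: guess the support from per-coordinate
    # weighted power, then take the top eigenvector of the restricted matrix
    power = (y ** 2) @ (A ** 2) / m
    S = np.argsort(-power)[:s]
    M = (A[:, S].T * y ** 2) @ A[:, S] / m
    _, V = np.linalg.eigh(M)                # eigenvalues in ascending order
    x = np.zeros(n)
    x[S] = V[:, -1] * np.sqrt(np.mean(y ** 2))
    for _ in range(iters):
        p = np.sign(A @ x)                  # current guess of the measurement signs
        x = cosamp(A, p * y, s)             # sparse recovery with the signs fixed
    return x
```

Note the inherent global sign ambiguity of magnitude-only measurements: any such method can recover $x^*$ only up to sign, so the recovery error is measured as $\min(\|\hat{x} - x^*\|, \|\hat{x} + x^*\|)$.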
For this problem, we design a recovery algorithm that we call Block CoPRAM that further reduces the sample complexity to $O(ks \log n)$. For sufficiently large block lengths $b = \Theta(s)$, this bound equates to $O(s \log n)$. To our knowledge, our approach constitutes the first family of linearly convergent algorithms for signal recovery from magnitude-only Gaussian measurements that exhibit a sub-quadratic dependence on the signal sparsity level.
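The structural ingredient that distinguishes the block-sparse variant is a projection onto the $k$ most energetic blocks, used in place of top-$s$ coefficient pruning. A minimal sketch of that projection (the function name and interface are ours, for illustration only):

```python
import numpy as np

def block_hard_threshold(x, b, k):
    """Keep the k length-b blocks of x with the largest energy; zero the rest."""
    blocks = x.reshape(-1, b)              # assumes b divides len(x)
    energy = (blocks ** 2).sum(axis=1)     # per-block energy
    keep = np.argsort(-energy)[:k]         # indices of the k largest blocks
    out = np.zeros_like(blocks)
    out[keep] = blocks[keep]
    return out.ravel()
```

Intuitively, selecting $k$ blocks instead of $s$ individual coordinates shrinks the number of candidate supports from roughly $\binom{n}{s}$ to roughly $\binom{n/b}{k}$, which is the standard model-based compressed sensing argument behind the reduced sample complexity.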