In this paper, we consider the problem of power-efficient distributed estimation of a localized event in large-scale Wireless Sensor Networks (WSNs). To increase the power efficiency of these networks, we jointly optimize the selection of a subset of active sensors and the routing structure, so that the quality of estimation at a given querying node is the best possible subject to an imposed limit on the total communication cost. We first formulate this task as an optimization problem and show that it is NP-hard. We then design two algorithms that jointly optimize the sensor selection and the routing structure: a fixed-tree relaxation-based algorithm and a novel, very efficient iterative distributed algorithm. We also provide a lower bound for our optimization problem and show that the performance of our iterative distributed algorithm is close to this bound. Although there is no guarantee that the gap between this lower bound and the optimal solution of the main problem is always small, our numerical experiments indicate that this gap is very small in many cases. An important result of our work is that, because of the interplay between communication cost and the gain in estimation accuracy obtained by fusing measurements from different sensors, the traditional Shortest Path Tree (SPT) routing structure, widely used in practice, is no longer optimal; that is, our routing structures provide a better trade-off between overall communication cost and estimation accuracy. Compared to more conventional sensor selection and fixed routing algorithms, our proposed joint sensor selection and routing algorithms yield significant energy savings.