Population annealing is a powerful sequential Monte Carlo algorithm designed to study the equilibrium behavior of general systems in statistical physics through massive parallelism. In addition to the remarkable scaling capabilities of the method, it allows for measurements to be enhanced by weighted averaging [J. Machta, Phys. Rev. E 82, 026704 (2010)], admitting to reduce both systematic and statistical errors based on independently repeated simulations. We give a self-contained introduction to population annealing with weighted averaging, generalize the method to a wide range of observables such as the specific heat and magnetic susceptibility and rigorously prove that the resulting estimators for finite systems are asymptotically unbiased for essentially arbitrary target distributions. Numerical results based on more than 10 7 independent population annealing runs of the two-dimensional Ising ferromagnet and the Edwards-Anderson Ising spin glass are presented in depth. In the latter case, we also discuss efficient ways of measuring spin overlaps in population annealing simulations.