Recent works have shown that the contact process running on top of highly heterogeneous random networks is described by the heterogeneous mean-field theory. However, some important aspects, such as the transition point and the strong corrections to finite-size scaling observed in simulations, are not quantitatively reproduced by this theory. We develop a heterogeneous pair-approximation for the contact process, the simplest mean-field approach that takes dynamical correlations into account. The transition points obtained in this theory are in very good agreement with simulations. The proximity to a simple homogeneous pair-approximation is elucidated, showing that the transition point in successive homogeneous cluster approximations moves away from the simulation results. We show that the critical exponents of the heterogeneous pair-approximation in the infinite-size limit are the same as those of the one-vertex theory. However, excellent matches with simulations, for a wide range of network sizes, are obtained when the sub-leading finite-size corrections given by the new theory are explicitly taken into account. The present approach can be adapted to dynamical processes on networks in general, providing a profitable strategy to analytically assess and fine-tune theoretical corrections.