A fundamental challenge in wireless heterogeneous networks (HetNets) is to effectively utilize the limited transmission and storage resources in the presence of increasing deployment density and backhaul capacity constraints. To alleviate bottlenecks and reduce resource consumption, we design optimal caching and power control algorithms for multi-hop wireless HetNets. We formulate a joint optimization framework to minimize the average transmission delay as a function of the caching variables and the signal-to-interference-plus-noise ratios (SINRs), which are determined by the transmission powers, while explicitly accounting for backhaul connection costs and power constraints. Using convex relaxation and rounding, we obtain a reduced-complexity formulation (RCF) of the joint optimization problem, which can provide a constant-factor approximation to the globally optimal solution. We then solve RCF in two ways: 1) alternating optimization of the power and caching variables by leveraging biconvexity, and 2) joint optimization of power control and caching. We characterize the necessary Karush-Kuhn-Tucker (KKT) conditions for an optimal solution to RCF, and use strict quasi-convexity to show that the KKT points are Pareto optimal for RCF. We then devise a subgradient projection algorithm that jointly updates the caching and power variables, and show that, under appropriate conditions and for general SINR regimes, the algorithm converges at a linear rate to a local minimum of RCF. We support our analytical findings with results from extensive numerical experiments.
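To make the joint update concrete, the following is a minimal sketch of a projected (sub)gradient iteration on a toy stand-in for the relaxed problem. It is not the RCF itself. Everything in it is an illustrative assumption: a single hop, an interference-free rate log2(1 + g p / noise) in place of the SINR-based rate, a delay of the form demand × (1 − y) / rate, the budget-and-box constraint sets, and all function names and numerical values. The toy objective is differentiable, so the subgradient reduces to the gradient, and the rounding step is omitted.

```python
import math

def project(v, lower, upper, budget):
    """Euclidean projection onto {x : lower <= x_i <= upper, sum(x) <= budget}.

    Assumes len(v) * lower <= budget, so the set is non-empty.
    """
    def clip(shift):
        return [min(max(vi - shift, lower), upper) for vi in v]

    x = clip(0.0)
    if sum(x) <= budget:
        return x
    # Otherwise bisect on a common shift until the budget is met.
    lo, hi = 0.0, max(v) - lower  # at shift hi every entry sits at `lower`
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sum(clip(mid)) > budget:
            lo = mid
        else:
            hi = mid
    return clip(hi)  # hi is always feasible

def rates(p, gain, noise):
    # Toy interference-free rate (assumption; the paper uses SINR).
    return [math.log2(1.0 + g * pi / noise) for g, pi in zip(gain, p)]

def delay(y, p, demand, gain, noise):
    # Toy average delay: a cache miss (1 - y_i) costs 1 / rate_i.
    r = rates(p, gain, noise)
    return sum(d * (1.0 - yi) / ri for d, yi, ri in zip(demand, y, r))

def gradients(y, p, demand, gain, noise):
    r = rates(p, gain, noise)
    grad_y = [-d / ri for d, ri in zip(demand, r)]
    grad_p = [
        -d * (1.0 - yi) / ri ** 2
        * (g / noise) / ((1.0 + g * pi / noise) * math.log(2.0))
        for d, yi, ri, g, pi in zip(demand, y, r, gain, p)
    ]
    return grad_y, grad_p

def joint_projected_descent(demand, gain, noise, cache_size,
                            p_min, p_max, p_budget, steps=200, step0=0.1):
    n = len(demand)
    y = [cache_size / n] * n   # relaxed caching variables in [0, 1]
    p = [p_budget / n] * n     # transmission powers
    best = (delay(y, p, demand, gain, noise), y, p)
    for k in range(steps):
        gy, gp = gradients(y, p, demand, gain, noise)
        step = step0 / math.sqrt(k + 1)  # diminishing step size
        # Joint update: both blocks move in the same iteration.
        y = project([yi - step * g for yi, g in zip(y, gy)], 0.0, 1.0, cache_size)
        p = project([pi - step * g for pi, g in zip(p, gp)], p_min, p_max, p_budget)
        value = delay(y, p, demand, gain, noise)
        if value < best[0]:
            best = (value, y, p)
    return best

# Hypothetical toy instance: three files/links, cache of size one.
demand, gain, noise = [0.5, 0.3, 0.2], [1.0, 0.5, 2.0], 0.1
initial_delay = delay([1 / 3] * 3, [0.5] * 3, demand, gain, noise)
best_delay, y_best, p_best = joint_projected_descent(
    demand, gain, noise, cache_size=1.0, p_min=0.05, p_max=1.0, p_budget=1.5)
```

An alternating variant would instead hold p fixed while updating y and then hold y fixed while updating p. The toy objective is convex in each block separately, which is the biconvexity this relies on.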