The efficient operation of cellular networks requires careful tuning of configuration parameters, such as transmit power and antenna tilt, to balance interference against the capacity needed by the connected UEs. As manual tuning of these parameters is typically infeasible, several automated coverage and capacity optimization methods have been proposed. However, most existing solutions either rely on poorly scalable black-box optimization methods or consider interference management alone, neglecting the possibility of cell congestion. In this work, we instead propose a differentiable framework for cellular network optimization, centered around the end-user throughput, that enables load-aware tuning of network parameters through gradient descent. We approach the problem from a data-driven perspective and include dedicated model subcomponents derived from monitoring data, which allow the framework to be calibrated to site-specific traffic patterns and KPI measurements. We validate our approach for joint transmit power optimization in a real-world network layout with approximately 150 cells in two frequency bands. In our evaluation, the gradient descent-based optimization reliably reduces the outage ratio for different levels of demand, while the black-box baseline struggles to explore the large search space. Our results further reveal substantial differences between the proposed load-aware objective and commonly used SINR-based objectives: the latter consistently yield unbalanced network configurations with severely congested cells. In contrast, the proposed end-user throughput objective promotes a balanced network configuration that provides adequate resources to the connected UEs while also limiting interference.