Compared with tracking airborne targets, tracking ground targets in urban terrain brings a new set of challenges. Target mobility is constrained by road networks, and measurement quality is degraded by dense clutter, multipath, and limited line-of-sight. We investigate the integration of detection, signal processing, tracking, and scheduling by exploiting three distinct levels of diversity: (1) spatial diversity, through the use of coordinated multistatic radars; (2) waveform diversity, by adaptively scheduling the transmitted radar waveform according to the scene conditions; and (3) motion model diversity, by using a bank of parallel filters, each matched to a different maneuvering model. Specifically, at each scan, the waveform that yields the minimum determinant of the one-step-ahead error covariance matrix is transmitted; the received signal is then matched-filtered, and quadratic curve fitting is applied to extract range and azimuth measurements, which are input to the LMIPDA-VSIMM algorithm for data association and filtering. Monte Carlo simulations demonstrate the effectiveness of the proposed system in a realistic urban scenario. A more traditional open-loop system, in which waveforms are scheduled in a round-robin fashion and no other modes of diversity are available, serves as the baseline for comparison. Simulation results show that our closed-loop system significantly outperforms the baseline, with both fewer lost tracks and a smaller volume of the estimation uncertainty ellipse. The interdisciplinary nature of this work highlights the challenges involved in designing a closed-loop active sensing platform for next-generation urban tracking systems.
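The waveform scheduling rule stated above (transmit the waveform with the minimum determinant of the one-step-ahead error covariance matrix) can be illustrated with a minimal sketch. This is not the implementation evaluated in this work: it assumes a single linear-Gaussian Kalman filter with a Cartesian position measurement instead of the range/azimuth measurements and the LMIPDA-VSIMM tracker, and it assumes each candidate waveform is summarized by a hypothetical measurement noise covariance `R`. The function names, the motion model, and all numerical values are illustrative assumptions.

```python
import numpy as np


def one_step_ahead_cov(P, F, Q, H, R):
    """Error covariance expected after the next scan (assumed linear-Gaussian model)."""
    P_pred = F @ P @ F.T + Q                 # time update
    S = H @ P_pred @ H.T + R                 # innovation covariance for this waveform
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    return (np.eye(P.shape[0]) - K @ H) @ P_pred


def select_waveform(P, F, Q, H, R_candidates):
    """Return the index of the candidate with the smallest covariance determinant."""
    dets = [np.linalg.det(one_step_ahead_cov(P, F, Q, H, R)) for R in R_candidates]
    return int(np.argmin(dets)), dets


# Hypothetical setup: state [x, vx, y, vy], nearly-constant-velocity motion.
dt = 1.0
F = np.array([[1.0, dt, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, dt],
              [0.0, 0.0, 0.0, 1.0]])
Q = 0.01 * np.eye(4)
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
P = np.eye(4)

# One assumed measurement noise covariance per candidate waveform.
R_candidates = [np.diag([25.0, 1.0]), np.diag([1.0, 9.0]), np.diag([4.0, 4.0])]

best, dets = select_waveform(P, F, Q, H, R_candidates)
print("selected waveform index:", best)
print("determinants:", dets)
```

In a closed loop, `select_waveform` would be called once per scan with the current track covariance, whereas the open-loop baseline would simply cycle through the candidate indices in a round-robin fashion.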