Owing to its rapid deployment, high flexibility, and high accuracy, unmanned-aerial-vehicle (UAV)-based differential synthetic aperture radar (SAR) tomography (D-TomoSAR) is an attractive approach for urban risk monitoring. Given sufficiently long spatial and temporal baselines, it resolves elevation and velocity in addition to range and azimuth, enabling four-dimensional (4D) SAR imaging. For P-band UAV-SAR, a long spatial-temporal baseline is necessary to achieve sufficiently high resolution in the elevation and velocity dimensions. Although P-band UAV-SAR maintains temporal coherence, the extended spatial baseline still causes two problems: low spatial coherence and high sidelobes. To tackle these problems, we introduce a multi-master (MM) D-TomoSAR approach with three main contributions. First, the traditional D-TomoSAR signal model is extended to an MM model, which increases the average coherence coefficient and the number of baselines (NOB) and suppresses sidelobes. Second, a baseline distribution optimization procedure is proposed to equalize the spatial-temporal baseline distribution, yielding more uniform spectral sampling and lower sidelobes. Third, a clustering-based outlier elimination method is employed to ensure 4D imaging quality. The proposed method is validated through computer simulation and a P-band UAV-SAR experiment.