New aerial sensors and platforms (e.g., unmanned aerial vehicles (UAVs)) are capable of providing ultra-high resolution remote sensing data (less than a 30-cm ground sampling distance (GSD)). This type of data is an important source for interpreting sub-building level objects; however, it has not yet been explored. The large-scale differences of urban objects, the high spectral variability and the large perspective effect bring difficulties to the design of descriptive features. Therefore, features representing the spatial information of the objects are essential for dealing with the spectral ambiguity. In this paper, we proposed a dual morphology top-hat profile (DMTHP) using both morphology reconstruction and erosion with different granularities. Due to the high dimensional feature space, we have proposed an adaptive scale selection procedure to reduce the feature dimension according to the training samples. The DMTHP is extracted from both images and Digital Surface Models (DSM) to obtain complimentary information. The random forest classifier is used to classify the features hierarchically. Quantitative experimental results on aerial images with 9-cm and UAV images with 5-cm GSD are performed. Under our experiments, improvements of 10% and 2% in overall accuracy are obtained in comparison with the well-known differential morphological profile (DMP) feature, and superior performance is observed over other tested features. Large format data with 20,000 × 20,000 pixels are used to perform a qualitative experiment using the proposed method, which shows its promising potential. The experiments also demonstrate that the DSM information has greatly enhanced the classification accuracy. In the best case in our experiment, it gives rise to a classification accuracy from 63.93% (spectral information only) to 94.48% (the proposed method).