In this paper, we propose a Stacked Autoencoder (SAE)-driven, Semi-Supervised Learning (SSL)-based Deep Neural Network (DNN) to extract buildings from relatively low-cost satellite near-infrared images. The novelty of our scheme is that we train the deep model with only an extremely small portion of labeled data, constituting less than 0.08% of the total data. This significantly reduces the manual effort needed for annotation, and thus the time required to create a reliable labeled dataset. Instead of relying on manual labels, we apply novel semi-supervised techniques to estimate soft labels (targets) for the vast amount of existing unlabeled data, and we then use these soft estimates to improve model training. Overall, four SSL schemes are employed: the Anchor Graph, the Safe Semi-Supervised Regression (SAFER), the Squared-loss Mutual Information Regularization (SMIR), and an equal-importance Weighted Average of the three (WeiAve). To retain only the most meaningful information in the input data, both labeled and unlabeled, we also employ an SAE trained in an unsupervised manner. This way, we handle noise in the input signals that is attributed to dimensionality redundancy, without sacrificing meaningful information. Experimental results on the benchmark dataset of Vaihingen city in Germany indicate that our approach outperforms all state-of-the-art methods in the field that use the same type of color orthoimages, despite the fact that a limited dataset is used (at least 10 times less data than other approaches). Our performance is also close to that achieved with highly expensive and much more precise input information, such as that derived from Light Detection and Ranging (LiDAR) sensors. In addition, the proposed approach can be easily extended to handle any number of classes, including buildings, vegetation, and ground.
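The equal-importance fusion step (WeiAve) described above can be illustrated with a minimal sketch. It assumes each SSL scheme has already produced a soft-label matrix of shape (samples, classes); the function name `weighted_average_soft_labels`, the toy values, and the array layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def weighted_average_soft_labels(estimates, weights=None):
    """Fuse soft labels from several SSL schemes into one estimate.

    estimates: list of arrays, each of shape (n_samples, n_classes).
    weights:   optional per-scheme weights; equal importance if omitted
               (an assumption matching the "equal importance" wording).
    """
    # Stack to shape (n_schemes, n_samples, n_classes).
    stacked = np.stack([np.asarray(e, dtype=float) for e in estimates])
    if weights is None:
        weights = np.full(stacked.shape[0], 1.0 / stacked.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    # Weighted sum over the scheme axis.
    return np.tensordot(weights, stacked, axes=1)


# Toy soft labels for two unlabeled samples and two classes
# (building, non-building); the values are made up for illustration.
anchor_graph = np.array([[0.9, 0.1], [0.2, 0.8]])
safer = np.array([[0.6, 0.4], [0.4, 0.6]])
smir = np.array([[0.6, 0.4], [0.3, 0.7]])

fused = weighted_average_soft_labels([anchor_graph, safer, smir])
```

With equal weights, each fused row is simply the mean of the three schemes' rows, so a row remains a valid probability distribution whenever the inputs are.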
ABSTRACT: This study aims to automatically detect building points: (a) from a LIDAR point cloud, using simple filtering techniques that enhance the geometric properties of each point, and (b) from a point cloud extracted by applying dense image matching to high-resolution colour-infrared (CIR) digital aerial imagery using the semi-global matching (SGM) stereo method. In the first step, the vegetation is removed. For the LIDAR point cloud, two different methods are implemented and evaluated, using first the normals and then the roughness values: (1) the proposed scan line smooth filtering followed by thresholding, and (2) bilateral filtering followed by thresholding. For the CIR point cloud, a variation of the normalized difference vegetation index (NDVI) is computed for the same purpose. Afterwards, the bare earth is extracted using a morphological operator and removed from the rest of the scene, so that only the building points remain. The buildings extracted with each approach in an urban area in northern Greece are evaluated using an existing orthoimage as reference; the results are also compared with the corresponding classified buildings extracted by two commercial software packages. Finally, to verify the utility and functionality of the extracted building points that achieved the best accuracy, 3D models at Level of Detail 1 (LoD 1) and a 3D building change detection process are demonstrated on a sub-region of the overall scene.
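As a rough illustration of NDVI-based vegetation removal for CIR data, the sketch below computes the standard index, NDVI = (NIR - Red) / (NIR + Red), and thresholds it. The abstract states that a variation of NDVI is used but does not define it, so the standard formula, the threshold of 0.3, and the function names here are assumptions for illustration only.

```python
import numpy as np


def ndvi(nir, red, eps=1e-9):
    """Standard NDVI per point; eps avoids division by zero on dark points."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)


def vegetation_mask(nir, red, threshold=0.3):
    """True where a point is treated as vegetation (threshold is hypothetical)."""
    return ndvi(nir, red) > threshold


# Toy per-point band values: one vegetated point and two non-vegetated ones.
nir = np.array([200.0, 120.0, 90.0])
red = np.array([50.0, 110.0, 100.0])

mask = vegetation_mask(nir, red)
non_vegetation_indices = np.flatnonzero(~mask)  # points kept for later steps
```

Points flagged by the mask would be dropped before the bare-earth extraction step; in practice the threshold would need tuning to the sensor and scene.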