Automatic building extraction and delineation from high-resolution satellite imagery is an important but challenging task, owing to the extremely large diversity of building appearances. Multiple high-resolution remote sensing data sources are now available, allowing different kinds of information to be integrated in order to improve the extraction accuracy of building outlines. Many algorithms perform building footprint extraction using spectral-based or appearance-based criteria derived from single or fused data sources, but their features are usually hand-crafted, which limits their accuracy. Recently developed fully convolutional networks (FCNs), which resemble conventional convolutional neural networks (CNNs) except that the last fully connected layer is replaced by a further convolutional layer with a large receptive field, have quickly become the state of the art for image recognition tasks, as they make dense pixel-wise classification of input images possible. Building on these advantages, i.e., the automatic extraction of relevant features and dense classification of images, we propose an end-to-end fully convolutional network that effectively combines the spectral and height information from different data sources and automatically generates a full-resolution binary building mask. Our architecture (Fused-FCN4s) consists of three parallel networks merged at a late stage, which helps propagate fine-grained information from earlier layers to higher levels in order to produce an output with more accurate building outlines. The inputs to the proposed Fused-FCN4s are three-band (RGB), panchromatic (PAN), and normalized digital surface model (nDSM) images. Experimental results demonstrate that the fusion of several networks achieves excellent results on complex data. Moreover, the developed model was successfully applied to different cities to demonstrate its generalization capacity.
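The late-fusion idea can be illustrated with a minimal sketch: feature maps from three hypothetical streams (RGB, PAN, and nDSM branches) are concatenated along the channel axis and combined by a per-pixel linear map, i.e., a 1x1 convolution. The function, shapes, and weights below are illustrative assumptions, not the published Fused-FCN4s implementation.

```python
import numpy as np

def late_fusion(rgb_feat, pan_feat, ndsm_feat, weights):
    """Fuse feature maps from three parallel streams at a late stage.

    Each stream output has shape (C, H, W); concatenation along the
    channel axis followed by a 1x1 convolution (a per-pixel linear map
    over channels) lets a network learn how to combine spectral and
    height cues into a joint representation.
    """
    fused = np.concatenate([rgb_feat, pan_feat, ndsm_feat], axis=0)  # (3C, H, W)
    c, h, w = fused.shape
    # A 1x1 convolution is a matrix multiply over the channel dimension.
    return (weights @ fused.reshape(c, h * w)).reshape(-1, h, w)

rgb = np.ones((2, 4, 4))
pan = np.ones((2, 4, 4))
ndsm = np.ones((2, 4, 4))
w = np.full((1, 6), 1.0 / 6.0)   # hypothetical learned 1x1 kernel
out = late_fusion(rgb, pan, ndsm, w)
print(out.shape)  # (1, 4, 4)
```

With uniform inputs and averaging weights, the fused map is again uniform, which makes the per-pixel linear combination easy to verify by hand.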
Building detection and footprint extraction are in high demand for many remote sensing applications. Although most previous works have shown promising results, the automatic extraction of building footprints remains a nontrivial topic, especially in complex urban areas. Recently developed extensions of the CNN framework have made it possible to perform dense pixel-wise classification of input images. Based on these abilities, we propose a methodology that automatically generates a full-resolution binary building mask from a digital surface model (DSM) using a fully convolutional network (FCN) architecture. The advantage of using depth information is that it provides geometrical silhouettes, allows a better separation of buildings from the background, and is invariant to illumination and color variations. The proposed framework has two main steps. First, the FCN is trained on a large set of patches, with the normalized DSM (nDSM) as input and the available ground-truth building mask as target output. Second, the predictions generated by the FCN serve as unary terms for a fully connected conditional random field (FCRF), which enables us to create the final binary building mask. A series of experiments demonstrates that our methodology extracts accurate building footprints that closely match the original building shapes. The quantitative and qualitative analyses show significant improvements of the results compared with the multi-layer fully connected network from our previous work.
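The nDSM input used in the first step is commonly obtained by subtracting a digital terrain model (DTM) from the DSM, leaving only object heights above ground. The clipping threshold and rescaling below are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

def normalized_dsm(dsm, dtm, max_height=30.0):
    """Compute a normalized DSM (object heights above ground).

    Subtracting the terrain model removes topography, so buildings and
    other elevated objects stand out against a zero-height ground plane.
    Heights are clipped to [0, max_height] and rescaled to [0, 1] so the
    values are in a convenient range for network inputs.
    """
    ndsm = np.clip(dsm - dtm, 0.0, max_height)
    return ndsm / max_height

dsm = np.array([[102.0, 130.0], [101.0, 100.0]])  # surface heights (m)
dtm = np.array([[100.0, 100.0], [100.0, 100.0]])  # bare-earth heights (m)
print(normalized_dsm(dsm, dtm))  # a 30 m building pixel saturates to 1.0
```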
A digital surface model (DSM) provides the geometry and structure of an urban environment, with buildings being its most prominent objects. Built-up areas change over time due to the rapid expansion of cities: new buildings are constructed, existing ones are extended, and old buildings are torn down. As a result, 3D surface models can improve the understanding and explanation of complex urban scenarios. They are useful in numerous remote sensing applications, including 3D reconstruction and city modeling, planning, visualization, disaster management, navigation, and decision-making. DSMs are typically derived from various acquisition techniques, such as photogrammetry, laser scanning, or synthetic aperture radar (SAR). DSMs generated from very-high-resolution optical stereo satellite imagery often suffer from mismatches, missing values, or blunders, resulting in coarse representations of building shapes. To overcome these problems, we propose a method for generating 3D surface models with building shapes refined to level of detail (LoD) 2 from stereo half-meter-resolution satellite DSMs using deep learning techniques. Specifically, we train a conditional generative adversarial network (cGAN) with an objective function based on least-squares residuals to generate an accurate LoD2-like DSM with enhanced 3D object shapes directly from the noisy stereo DSM input. In addition, to achieve building shapes close to LoD2, we introduce a new approach for generating an artificial DSM with accurate and realistic building geometries from City Geography Markup Language (CityGML) data, on which we then train the proposed cGAN architecture. The experimental results demonstrate the strong potential of creating large-scale remote sensing elevation models in which the buildings exhibit better-quality shapes and roof forms than those obtained from the matching process alone.
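The least-squares objective mentioned above corresponds to the LSGAN formulation, in which the usual sigmoid cross-entropy adversarial loss is replaced by squared residuals. A minimal numpy sketch follows, where score arrays stand in for discriminator outputs; no claim is made about the paper's exact target labels or loss weighting.

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    # Discriminator: push scores on real samples to 1 and on fakes to 0.
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    # Generator: push discriminator scores on generated samples to 1.
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

# A perfect discriminator (real -> 1, fake -> 0) incurs zero loss:
print(lsgan_d_loss(np.ones(4), np.zeros(4)))  # 0.0
```

Because the penalty grows quadratically with the distance from the target score, least-squares residuals provide non-vanishing gradients even for samples the discriminator classifies confidently, which is the usual motivation for this objective.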
Moreover, the developed model is successfully applied to a city unseen during training to demonstrate its generalization capacity. The improvement of building shapes, including the recovery of disturbed boundaries and the robust reconstruction of precise rooftop geometries, is in demand. Remote sensing technology provides several ways to measure 3D urban morphology. Conventional ground surveying, stereo airborne or satellite photogrammetry, interferometric synthetic aperture radar (InSAR), and light detection and ranging (LIDAR) are the main data sources used to obtain high-resolution elevation information [1]. The main advantage of digital surface models (DSMs) generated from ground surveying and LIDAR is their good quality and detailed object representation. However, their production is costly and time-consuming, and covers relatively small areas compared with spaceborne remote sensing imagery [2]. SAR imagery is operational in all seasons and under different weather conditions. Nevertheless, its side-looking sensor principle is not well suited for building recognition a...
Recent technical developments have made it possible to supply large-scale satellite image coverage, which poses the challenge of analyzing this imagery efficiently. One important task in applications such as urban planning and reconstruction is the automatic extraction of building footprints. The integration of different kinds of information, now achievable thanks to the availability of high-resolution remote sensing data sources, makes it possible to improve the quality of the extracted building outlines. Recently, deep neural networks have been extended from image-level to pixel-level labelling, allowing dense prediction of semantic labels. Based on these advances, we propose an end-to-end U-shaped neural network that efficiently merges depth and spectral information within two parallel networks combined at a late stage for binary building mask generation. Moreover, as satellites usually provide high-resolution panchromatic images but only low-resolution multi-spectral images, we tackle this issue with a residual neural network block. It fuses the images of different spatial resolution at an early stage, before passing the fused information to the U-Net stream responsible for processing the spectral information. In a parallel stream, a stereo digital surface model (DSM) is also processed by a U-Net. Additionally, we demonstrate that our method generalizes to cities that are not included in the training data.
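The early-stage fusion of the high-resolution panchromatic band with the low-resolution multi-spectral bands can be sketched as follows, assuming a resolution ratio of four and simple nearest-neighbour upsampling; the published residual block would instead learn this fusion from data.

```python
import numpy as np

def early_fusion(pan, ms, scale=4):
    """Bring low-resolution multispectral bands onto the panchromatic
    grid and stack them with the pan band.

    Nearest-neighbour upsampling (pixel replication) is used here only
    for illustration; the stacked tensor is what a subsequent residual
    block would consume to learn the actual fusion.
    """
    ms_up = ms.repeat(scale, axis=1).repeat(scale, axis=2)  # (B, H, W)
    return np.concatenate([pan[np.newaxis], ms_up], axis=0)  # (B+1, H, W)

pan = np.zeros((8, 8))    # high-resolution panchromatic band
ms = np.zeros((4, 2, 2))  # four low-resolution spectral bands
print(early_fusion(pan, ms).shape)  # (5, 8, 8)
```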
Various deep learning applications benefit from multi-task learning with multiple regression and classification objectives by exploiting the similarities between the individual tasks. This can result in improved learning efficiency and prediction accuracy for the task-specific models compared with separately trained models. In this paper, we examine such influences for important remote sensing applications, namely elevation model generation and semantic segmentation from stereo half-meter-resolution satellite digital surface models (DSMs). Specifically, we aim to generate good-quality DSMs with complete and accurate level of detail (LoD)2-like building forms, and to assign an object class label to each pixel in the DSMs. For the label assignment task, we select the roof type classification problem, distinguishing between flat, non-flat, and background pixels. To realize these tasks, we train a conditional generative adversarial network (cGAN) with an objective function based on least-squares residuals and an auxiliary term based on normal vectors for further roof surface refinement. In addition, we investigate recently published deep learning architectures for both tasks and develop a final end-to-end network combining the models which, used separately, provide the best results for their individual tasks. Semantic segmentation assigns each pixel the class to which the object belongs; among such tasks, building footprint extraction and roof type classification are among the most challenging but important problems. It is common to use DSMs as input data for classification tasks regarding buildings [6,7], as depth information provides geometrical silhouettes and allows a better understanding of building forms.
Although many attempts have already been made at accurate pixel-wise classification [8,9], it remains a challenging task in practice due to the wide variety of building appearances. In most cases, each task, e.g., depth image generation and pixel-wise image classification, is tackled independently, although the tasks are closely connected. Solving multiple tasks jointly can enhance the performance of each individual task as well as reduce computation time. This observation leads to the advantages of multi-task (MT) learning. The approach of simultaneously improving the generalization performance of multiple outputs from a single input has been applied to numerous machine learning techniques. As a promising concept for convolutional neural networks (CNNs), MT learning has been successfully applied to a variety of problems, such as joint classification and semantic segmentation [10] or classification and object detection [11]. Because the individual tasks may conflict, MT learning is cast as the optimization of an MT loss that minimizes a linear combination of the contributing single-task loss functions. In this work, we aim to produce good-quality LoD2-like DSMs with realistic building geometries together with dense pixel-wise rooftop classification masks defining multiple classes, like ground, flat, and ...
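The linear combination of single-task losses described above is simply L = sum_i w_i * L_i. A minimal sketch follows; the weights here are arbitrary illustrative values, since the abstracts do not state how the actual weights were chosen.

```python
def multi_task_loss(losses, weights):
    """Linear combination of single-task losses.

    In practice the weights are hyperparameters (or learned, e.g. via
    task-uncertainty weighting); here they are fixed for illustration.
    """
    return sum(w * l for w, l in zip(weights, losses))

# e.g. a height-regression loss of 0.8 and a roof-type classification
# loss of 1.2, each weighted by 0.5:
print(multi_task_loss([0.8, 1.2], [0.5, 0.5]))  # 1.0
```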