Uwe Stilla scite author profile

We present an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries. Semantic segmentation is a fundamental remote sensing task, and most state-of-the-art methods rely on DCNNs as their workhorse. A major reason for their success is that deep networks learn to accumulate contextual information over very large receptive fields. However, this success comes at a cost, since the associated loss of effective spatial resolution washes out high-frequency details and leads to blurry object boundaries. Here, we propose to counter this effect by combining semantic segmentation with semantically informed edge detection, thus making class boundaries explicit in the model. First, we construct a comparatively simple, memory-efficient model by adding boundary detection to the segnet encoder-decoder architecture. Second, we also include boundary detection in fcn-type models and set up a high-end classifier ensemble. We show that boundary detection significantly improves semantic segmentation with CNNs in an end-to-end training scheme. Our best model achieves > 90% overall accuracy on the ISPRS Vaihingen benchmark.

show abstract

Deep Learning Earth Observation Classification Using ImageNet Pretrained Networks

Marmanis

Datcu

Esch

et al. 2016

IEEE Geosci. Remote Sensing Lett.

565

277

View full text Add to dashboard Cite

Deep learning methods such as convolutional neural networks (CNNs) can deliver highly accurate classification results when provided with large enough data sets and respective labels. However, using CNNs along with limited labeled data can be problematic, as this leads to extensive overfitting. In this letter, we propose a novel method by considering a pretrained CNN designed for tackling an entirely different classification problem, namely, the ImageNet challenge, and exploit it to extract an initial set of representations. The derived representations are then transferred into a supervised CNN classifier, along with their class labels, effectively training the system. Through this two-stage framework, we successfully deal with the limited-data problem in an end-to-end processing scheme. Comparative results over the UC Merced Land Use benchmark prove that our method significantly outperforms the previously best stated results, improving the overall accuracy from 83.1% up to 92.4%. Apart from statistical improvements, our method introduces a novel feature fusion algorithm that effectively tackles the large data dimensionality by using a simple and computationally efficient approach.

show abstract

Analysis of full waveform LIDAR data for the classification of deciduous and coniferous trees

Reitberger

Krzystek

Stilla³

2008

International Journal of Remote Sensing

269

228

View full text Add to dashboard Cite

Semantic Segmentation of Aerial Images With an Ensemble of CNNS

Marmanis

Wegner

Galliani

et al. 2016

ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.

200

143

View full text Add to dashboard Cite

ABSTRACT:This paper describes a deep learning approach to semantic segmentation of very high resolution (aerial) images. Deep neural architectures hold the promise of end-to-end learning from raw images, making heuristic feature design obsolete. Over the last decade this idea has seen a revival, and in recent years deep convolutional neural networks (CNNs) have emerged as the method of choice for a range of image interpretation tasks like visual recognition and object detection. Still, standard CNNs do not lend themselves to per-pixel semantic segmentation, mainly because one of their fundamental principles is to gradually aggregate information over larger and larger image regions, making it hard to disentangle contributions from different pixels. Very recently two extensions of the CNN framework have made it possible to trace the semantic information back to a precise pixel position: deconvolutional network layers undo the spatial downsampling, and Fully Convolution Networks (FCNs) modify the fully connected classification layers of the network in such a way that the location of individual activations remains explicit. We design a FCN which takes as input intensity and range data and, with the help of aggressive deconvolution and recycling of early network layers, converts them into a pixelwise classification at full resolution. We discuss design choices and intricacies of such a network, and demonstrate that an ensemble of several networks achieves excellent results on challenging data such as the ISPRS semantic labeling benchmark, using only the raw data as input.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Uwe Stilla

3D segmentation of single trees exploiting full waveform LIDAR data

Classification with an edge: Improving semantic image segmentation with boundary detection

Deep Learning Earth Observation Classification Using ImageNet Pretrained Networks

Analysis of full waveform LIDAR data for the classification of deciduous and coniferous trees

Semantic Segmentation of Aerial Images With an Ensemble of CNNS

Contact Info

Product

Resources

About