Learning to semantically segment high-resolution remote sensing images

Nogueira, Keiller; Mura, Mauro Dalla; Chanussot, Jocelyn; Schwartz, William Robson; Santos, Jefersson A. dos

doi:10.1109/icpr.2016.7900187

Cited by 30 publications

(54 citation statements)

References 31 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…Also, for these two datasets, we considered as baselines: (i) Fully Convolutional Networks (FCN) [17]. In this case, the pixelwise architectures proposed by [41] were converted into fully convolutional network and exploited as baseline. (ii) Deconvolutional networks [18], [19].…”

Section: B Baselinesmentioning

confidence: 99%

“…For the Coffee [41] and the GRSS Data Fusion [42] datasets, we employed the same protocol of [41]. Specifically, for the former dataset, we conducted a five-fold cross-validation to assess the performance of the proposed algorithm.…”

Section: Experimental Protocolmentioning

confidence: 99%

“…Although the presented approach is proposed to select automatically the best patch size, in training time, avoiding lots of experiments to adjust such size (as done in several works [8], [26], [41]), in this section, the patch size range is analyzed in order to examine the robustness of the method.…”

Section: Range Analysismentioning

confidence: 99%

See 2 more Smart Citations

Dynamic Multicontext Segmentation of Remote Sensing Images Based on Convolutional Networks

Nogueira

Mura

Chanussot

et al. 2019

IEEE Trans. Geosci. Remote Sensing

Self Cite

127

View full text Add to dashboard Cite

Semantic segmentation requires methods capable of learning high-level features while dealing with large volume of data. Towards such goal, Convolutional Networks can learn specific and adaptable features based on the data. However, these networks are not capable of processing a whole remote sensing image, given its huge size. To overcome such limitation, the image is processed using fixed size patches. The definition of the input patch size is usually performed empirically (evaluating several sizes) or imposed (by network constraint). Both strategies suffer from drawbacks and could not lead to the best patch size. To alleviate this problem, several works exploited multi-context information by combining networks or layers. This process increases the number of parameters resulting in a more difficult model to train. In this work, we propose a novel technique to perform semantic segmentation of remote sensing images that exploits a multi-context paradigm without increasing the number of parameters while defining, in training time, the best patch size. The main idea is to train a dilated network with distinct patch sizes, allowing it to capture multi-context characteristics from heterogeneous contexts. While processing these varying patches, the network provides a score for each patch size, helping in the definition of the best size for the current scenario. A systematic evaluation of the proposed algorithm is conducted using four high-resolution remote sensing datasets with very distinct properties. Our results show that the proposed algorithm provides improvements in pixelwise classification accuracy when compared to state-of-the-art methods.

show abstract

Section: B Baselinesmentioning

confidence: 99%

Section: Experimental Protocolmentioning

confidence: 99%

See 1 more Smart Citation

Dynamic Multicontext Segmentation of Remote Sensing Images Based on Convolutional Networks

Nogueira

Mura

Chanussot

et al. 2019

IEEE Trans. Geosci. Remote Sensing

Self Cite

127

View full text Add to dashboard Cite

show abstract

“…Based on CNNs, many patch-classification methods are proposed to perform semantic labeling (Mnih, 2013;Mostajabi et al, 2015;Paisitkriangkrai et al, 2016;Nogueira et al, 2016;Alshehhi et al, 2017;Zhang et al, 2017). These methods determine a pixel's label by using CNNs to classify a small patch around the target pixel.…”

Section: Introductionmentioning

confidence: 99%

Semantic labeling in very high resolution images via a self-cascaded convolutional neural network

Liu

Fan

Wang

et al. 2018

ISPRS Journal of Photogrammetry and Remote Sensing

253

131

View full text Add to dashboard Cite

Semantic labeling for very high resolution (VHR) images in urban areas, is of significant importance in a wide range of remote sensing applications. However, many confusing manmade objects and intricate fine-structured objects make it very difficult to obtain both coherent and accurate labeling results. For this challenging task, we propose a novel deep model with convolutional neural networks (CNNs), i.e., an end-toend self-cascaded network (ScasNet). Specifically, for confusing manmade objects, ScasNet improves the labeling coherence with sequential global-to-local contexts aggregation. Technically, multi-scale contexts are captured on the output of a CNN encoder, and then they are successively aggregated in a self-cascaded manner. Meanwhile, for fine-structured objects, ScasNet boosts the labeling accuracy with a coarse-to-fine refinement strategy. It progressively refines the target objects using the low-level features learned by CNN's shallow layers. In addition, to correct the latent fitting residual caused by multi-feature fusion inside Scas-Net, a dedicated residual correction scheme is proposed. It greatly improves the effectiveness of ScasNet. Extensive experimental results on three public datasets, including two challenging benchmarks, show that ScasNet achieves the state-of-the-art performance.

show abstract

“…The process of resizing the input data would not only cause the loss of information but would also make it difficult to obtain a direct pixel-to-pixel classification result. For the traditional CNN, we obtained one result for one patch [36], which represented the classification result for one pixel surrounded by that patch, rather than a result at the same size as the patch and each location represented the type of corresponding pixel in the input panchromatic data. Therefore, we revised our residual net in the main line and placed feature maps from the convolution layers directly to the classifier, instead of flatting the feature maps into 1-D structures in advance.…”

Section: Main Linementioning

confidence: 99%

GAN-Assisted Two-Stream Neural Network for High-Resolution Remote Sensing Image Classification

Tao

Zhong

et al. 2017

Remote Sensing

View full text Add to dashboard Cite

Abstract:Using deep learning to improve the capabilities of high-resolution satellite images has emerged recently as an important topic in automatic classification. Deep networks track hierarchical high-level features to identify objects; however, enhancing the classification accuracy from low-level features is often disregarded. We therefore proposed a two-stream deep-learning neural network strategy, with a main stream utilizing fine spatial-resolution panchromatic images to retain low-level information under a supervised residual network structure. An auxiliary line employed an unsupervised net to extract high-level abstract and discriminative features from multispectral images to supplement the spectral information in the main stream. Various feature extraction types from the neural network were selected and jointed in the novel net, as the combined high-and low-level features could provide a superior solution to image classification. In traditional convolutional neural networks, increased network depth might not influence the network performance perceptibly; however, we introduced a residual neural network to develop the expressive ability of the deeper net, increasing the role of net depth in feature extraction. To enhance feature robustness, we proposed a novel consolidation part in feature extraction. An adversarial net improved the feature extraction capabilities and aided digging the inherent and discriminative features from data, with increased extraction efficacy. Tests on satellite images indicated the high overall accuracy of our novel net, verifying that net depth or number of convolution kernels affected the classification capability. Various comparative tests proved the structural rationality for our two-stream structure.

show abstract

Learning to semantically segment high-resolution remote sensing images

Cited by 30 publications

References 31 publications

Dynamic Multicontext Segmentation of Remote Sensing Images Based on Convolutional Networks

Dynamic Multicontext Segmentation of Remote Sensing Images Based on Convolutional Networks

Semantic labeling in very high resolution images via a self-cascaded convolutional neural network

GAN-Assisted Two-Stream Neural Network for High-Resolution Remote Sensing Image Classification

Contact Info

Product

Resources

About