Creating 3D models of the static environment is an important task for the advancement of driver assistance systems and autonomous driving. In this work, a static reference map is created from a Mobile Mapping “light detection and ranging” (LiDAR) dataset. The data was obtained in 14 measurement runs from March to October 2017 in Hannover and consists in total of about 15 billion points. The point cloud data are first segmented by region growing and then processed by a random forest classification, which divides the segments into the five static classes (“facade”, “pole”, “fence”, “traffic sign”, and “vegetation”) and three dynamic classes (“vehicle”, “bicycle”, “person”) with an overall accuracy of 94%. All static objects are entered into a voxel grid, to compare different measurement epochs directly. In the next step, the classified voxels are combined with the result of a visibility analysis. Therefore, we use a ray tracing algorithm to detect traversed voxels and differentiate between empty space and occlusion. Each voxel is classified as suitable for the static reference map or not by its object class and its occupation state during different epochs. Thereby, we avoid to eliminate static voxels which were occluded in some of the measurement runs (e.g. parts of a building occluded by a tree). However, segments that are only temporarily present and connected to static objects, such as scaffolds or awnings on buildings, are not included in the reference map. Overall, the combination of the classification with the subsequent entry of the classes into a voxel grid provides good and useful results that can be updated by including new measurement data.
Abstract. Fully convolutional neural networks (FCN) are successfully used for pixel-wise land cover classification - the task of identifying the physical material of the Earth’s surface for every pixel in an image. The acquisition of large training datasets is challenging, especially in remote sensing, but necessary for a FCN to perform well. One way to circumvent manual labelling is the usage of existing databases, which usually contain a certain amount of label noise when combined with another data source. As a first part of this work, we investigate the impact of training data on a FCN. We experiment with different amounts of training data, varying w.r.t. the covered area, the available acquisition dates and the amount of label noise. We conclude that the more data is used for training, the better is the generalization performance of the model, and the FCN is able to mitigate the effect of label noise to a high degree. Another challenge is the imbalanced class distribution in most real-world datasets, which can cause the classifier to focus on the majority classes, leading to poor classification performance for minority classes. To tackle this problem, in this paper, we use the cosine similarity loss to force feature vectors of the same class to be close to each other in feature space. Our experiments show that the cosine loss helps to obtain more similar feature vectors, but the similarity of the cluster centers also increases.
Abstract. With the availability of large amounts of satellite image time series (SITS), the identification of different materials of the Earth’s surface is possible with a high temporal resolution. One of the basic tasks is the pixel-wise classification of land cover, i.e. the task of identifying the physical material of the Earth’s surface in an image. Fully convolutional neural networks (FCN) are successfully used for this task. In this paper, we investigate different FCN variants, using different methods for the computation of spatial, spectral, and temporal features. We investigate the impact of 3D convolutions in the spatial-temporal as well as in the spatial-spectral dimensions in comparison to 2D convolutions in the spatial dimensions only. Additionally, we introduce a new method to generate multitemporal input patches by using time intervals instead of fixed acquisition dates. We then choose the image that is closest in time to the middle of the corresponding time interval, which makes our approach more flexible with respect to the requirements for the acquisition of new data. Using these multi-temporal input patches, generated from Sentinel-2 images, we improve the classification of land cover by 4% in the mean F1-score and 1.3% in the overall accuracy compared to a classification using mono-temporal input patches. Furthermore, the usage of 3D convolutions instead of 2D convolutions improves the classification performance by a small amount of 0.4% in the mean F1-score and 1.2% in the overall accuracy.
Abstract. Pixel-wise classification of remote sensing imagery is highly interesting for tasks like land cover classification or change detection. The acquisition of large training data sets for these tasks is challenging, but necessary to obtain good results with deep learning algorithms such as convolutional neural networks (CNN). In this paper we present a method for the automatic generation of a large amount of training data by combining satellite imagery with reference data from an available geospatial database. Due to this combination of different data sources the resulting training data contain a certain amount of incorrect labels. We evaluate the influence of this so called label noise regarding the time difference between acquisition of the two data sources, the amount of training data and the class structure. We combine Sentinel-2 images with reference data from a geospatial database provided by the German Land Survey Office of Lower Saxony (LGLN). With different training sets we train a fully convolutional neural network (FCN) and classify four land cover classes (Building, Agriculture, Forest, Water). Our results show that the errors in the training samples do not have a large influence on the resulting classifiers. This is probably due to the fact that the noise is randomly distributed and thus, neighbours of incorrect samples are predominantly correct. As expected, a larger amount of training data improves the results, especially for the less well represented classes. Other influences are different illuminations conditions and seasonal effects during data acquisition. To better adapt the classifier to these different conditions they should also be included in the training data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.