Arbitrary-Oriented Vehicle Detection in Aerial Imagery with Single Convolutional Neural Networks

Tang, Tianyu; Zhou, Shilin; Deng, Zhipeng; Lei, Lin; Zou, Huanxin

doi:10.3390/rs9111170

Cited by 121 publications

(61 citation statements)

References 32 publications

(55 reference statements)


“…Tianyu Tang et al in their article [22] they propose using a convolutional neural network for the direct generation of randomly oriented detection results. Their approach, called Oriented_SSD (Single Shot MultiBox Detector, SSD), uses a set of default blocks with different scales at each location on the object map to create bounding detection blocks.…”

Section: Literature Review (mentioning)

confidence: 99%

The Capacity of the Road Network: Data Collection and Statistical Analysis of Traffic Characteristics

Shepelev, Aliukov, Nikolskaya et al. 2020

Energies


The possibilities of collecting the necessary information using multi-touch cameras and ways to improve road traffic data collection are considered. An increase in the number of vehicles leads to traffic jams, which in turn leads to an increase in travel time, additional fuel consumption and other negative consequences. To solve this problem, it is necessary to have a reliable information collection system and apply modern effective methods of processing the collected information. The technology considered in the article allows taking into account pedestrians crossing the intersection. The purpose of this article is to determine the most important traffic characteristics that affect the traffic capacity of the intersection, in other words, the actual number of passing cars. Throughput is taken as a dependent variable. Based on the results of the regression analysis, a model was developed to predict the intersection throughput taking into account the most important traffic characteristics. In addition, the model is based on the fuzzy logic method and was built using the Fuzzy TECH 5.81d Professional Edition computer program.


“…High-resolution remote sensing images have been increasingly popular and widely used in many geoscience applications, including automatic mapping of land use or land cover types, and automatic detection or extraction of small objects such as vehicles, ships, trees, roads, buildings, etc. [1][2][3][4][5][6]. As one of these geoscience applications, the automatic extraction of building footprints from high-resolution imagery is beneficial for urban planning, disaster management, and environmental management [7][8][9][10].…”

Section: Introduction (mentioning)

confidence: 99%

“…(3) The Inria dataset [41] (used in References [36,37]) contains aerial images covering 10 regions in the USA and Austria (at 30 cm resolution, with RGB bands). (4) The WHU (Wuhan University) building dataset [42] (used in Reference [38]) includes an aerial dataset containing 8189 image patches (at 30 cm resolution, with RGB bands, each with a size of 512 × 512 pixels) and a satellite dataset containing 17,388 image patches (at 270 cm resolution, with the same bands and size as the aerial dataset). (5) The AIRS (Aerial Imagery for Roof Segmentation) dataset [43] contains aerial images covering the area of Christchurch city in New Zealand (at 7.5 cm resolution, with RGB bands).…”

Section: Introduction (mentioning)

confidence: 99%

Semantic Segmentation-Based Building Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data

Fang

et al. 2019

Remote Sensing



Automatic extraction of building footprints from high-resolution satellite imagery has become an important and challenging research issue receiving greater attention. Many recent studies have explored different deep learning-based semantic segmentation methods for improving the accuracy of building extraction. Although they record substantial land cover and land use information (e.g., buildings, roads, water, etc.), public geographic information system (GIS) map datasets have rarely been utilized to improve building extraction results in existing studies. In this research, we propose a U-Net-based semantic segmentation method for the extraction of building footprints from high-resolution multispectral satellite images using the SpaceNet building dataset provided in the DeepGlobe Satellite Challenge of IEEE Conference on Computer Vision and Pattern Recognition 2018 (CVPR 2018). We explore the potential of multiple public GIS map datasets (OpenStreetMap, Google Maps, and MapWorld) through integration with the WorldView-3 satellite datasets in four cities (Las Vegas, Paris, Shanghai, and Khartoum). Several strategies are designed and combined with the U-Net–based semantic segmentation model, including data augmentation, post-processing, and integration of the GIS map data and satellite images. The proposed method achieves a total F1-score of 0.704, which is an improvement of 1.1% to 12.5% compared with the top three solutions in the SpaceNet Building Detection Competition and 3.0% to 9.2% compared with the standard U-Net–based method. Moreover, the effect of each proposed strategy and the possible reasons for the building footprint extraction results are analyzed substantially considering the actual situation of the four cities.


“…To improve the computation efficiency and the effect of small object detection, Chen et al [29] incorporated the semantic segmentation and global activation information into the SSD framework for object detection in RSIs. The other works examining ODRSIs based on one-stage methods include Tang et al [30], Tayara et al [31], and Chen et al [32].…”

Section: Introduction (mentioning)

confidence: 99%

Object Detection in Remote Sensing Images Based on Improved Bounding Box Regression and Multi-Level Features Fusion

et al. 2020


The objective of detection in remote sensing images is to determine the location and category of all targets in these images. The anchor based methods are the most prevalent deep learning based methods, and still have some problems that need to be addressed. First, the existing metric (i.e., intersection over union (IoU)) could not measure the distance between two bounding boxes when they are nonoverlapping. Second, the existing bounding box regression loss could not directly optimize the metric in the training process. Third, the existing methods which adopt a hierarchical deep network only choose a single level feature layer for the feature extraction of region proposals, meaning they do not take full advantage of multi-level features. To resolve the above problems, a novel object detection method for remote sensing images based on improved bounding box regression and multi-level features fusion is proposed in this paper. First, a new metric named generalized IoU is applied, which can quantify the distance between two bounding boxes, regardless of whether they are overlapping or not. Second, a novel bounding box regression loss is proposed, which can not only optimize the new metric (i.e., generalized IoU) directly but also overcome the problem that existing bounding box regression loss based on the new metric cannot adaptively change the gradient based on the metric value. Finally, a multi-level features fusion module is proposed and incorporated into the existing hierarchical deep network, which can make full use of the multi-level features for each region proposal. The quantitative comparisons between the proposed method and baseline method on the large scale dataset DIOR demonstrate that incorporating the proposed bounding box regression loss, multi-level features fusion module, and a combination of both into the baseline method can obtain an absolute gain of 0.7%, 1.4%, and 2.2% or so in terms of mAP, respectively. Comparing this with the state-of-the-art methods demonstrates that the proposed method has achieved a state-of-the-art performance. The curves of average precision with different thresholds show that the advantage of the proposed method is more evident when the threshold of generalized IoU (or IoU) is relatively high, which means that the proposed method can improve the precision of object localization. Similar conclusions can be obtained on a NWPU VHR-10 dataset.
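
The abstract's point that generalized IoU still separates boxes that do not overlap can be illustrated with a minimal sketch. It uses the commonly cited definition, GIoU = IoU - (C - U) / C, where U is the union area and C is the area of the smallest axis-aligned box enclosing both inputs. The function name `giou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration; this is not the regression loss proposed in the paper, only the metric it builds on.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (non-zero area).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Area of the smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = inter / union
    return iou - (enclose - union) / enclose


if __name__ == "__main__":
    # Plain IoU is 0 for both pairs below; GIoU decreases as the gap grows.
    print(giou((0, 0, 1, 1), (2, 0, 3, 1)))
    print(giou((0, 0, 1, 1), (5, 0, 6, 1)))
```

For identical boxes the value is 1, and for disjoint boxes it becomes negative and approaches -1 as the boxes move apart, which is the property the abstract contrasts with plain IoU.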

