2022
DOI: 10.3390/s22197624

A Residual-Inception U-Net (RIU-Net) Approach and Comparisons with U-Shaped CNN and Transformer Models for Building Segmentation from High-Resolution Satellite Images

Abstract: Building segmentation is crucial for applications extending from map production to urban planning. Nowadays, it is still a challenge due to CNNs’ inability to model global context and Transformers’ high memory need. In this study, 10 CNN and Transformer models were generated, and comparisons were realized. Alongside our proposed Residual-Inception U-Net (RIU-Net), U-Net, Residual U-Net, and Attention Residual U-Net, four CNN architectures (Inception, Inception-ResNet, Xception, and MobileNet) were implemented …

Citations: cited by 18 publications (8 citation statements)
References: 60 publications
“…The quality of model training was determining using the intersection over union (IoU) in relation to the intersection area and union area [18].…”
Section: Accuracy Assessment (mentioning)
confidence: 99%
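
For readers unfamiliar with the metric named in the statement above, here is a minimal sketch of intersection over union (IoU) for two binary masks: the count of pixels marked in both masks divided by the count marked in either. The function name `iou`, the nested-list mask representation, and the choice to return 1.0 for two empty masks are illustrative assumptions, not details taken from the cited papers.

```python
from typing import Sequence


def iou(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks of the same shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # pixel is building in both masks
            if p or t:
                union += 1  # pixel is building in at least one mask
    # Assumed convention: two empty masks count as perfect agreement.
    return intersection / union if union else 1.0


# Hypothetical 2x3 masks: 2 shared pixels, 4 pixels in the union.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(iou(pred, truth))  # 0.5
```

A value of 1.0 means the predicted and reference masks coincide exactly; 0.0 means they share no pixels.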
“…Other works, such as [18, 19, 20, 21, 22, 23, 24, 25], implemented what is known as Inception-like blocks, which use deep conventional networks, and the residual connections introduced in [26] to extract the outputs from each layer and concatenated them for the output, as shown in the example in Figure 4, mimicking the operation using the Inception layer depth-wise by allowing the extracted feature at multiple receptive fields to be processed at the output layer. However, while this approach can be suitable for certain applications such as classification, a drop in the spatial features accumulates as we move deeper, diminishing the spatial accuracy of the larger LRF values, as illustrated in Figure 5, where it can be seen that a bias towards features at the centre starts to increase, impairing the capability of the layer to accurately position where the feature is located and decreasing its efficiency in applications such as object detection.…”
Section: Width-based Layer Design (Inception and Inception-like Appro... (mentioning)
confidence: 99%
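
The "multiple receptive fields" in the statement above can be made concrete with the standard receptive-field recurrence for stacked convolutions: each layer adds (kernel size - 1) times the product of the preceding strides. The sketch below is an illustration only; the function name `receptive_fields` and the three-layer 3x3 stack are assumptions, not the configuration used in any of the cited works.

```python
from typing import List, Optional, Sequence


def receptive_fields(kernel_sizes: Sequence[int],
                     strides: Optional[Sequence[int]] = None) -> List[int]:
    """Receptive field (in input pixels) seen at the output of each stacked layer."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between adjacent outputs of the current layer
    sizes = []
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
        sizes.append(rf)
    return sizes


# Hypothetical block: three stacked 3x3 convolutions with stride 1.
# Tapping each layer's output yields features at three receptive fields.
print(receptive_fields([3, 3, 3]))  # [3, 5, 7]
```

Concatenating the per-layer outputs of such a stack therefore exposes 3x3, 5x5 and 7x7 views of the input to the block's output, which is the depth-wise analogue of the parallel kernels in an Inception layer.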
“…Fully Convolutional Networks (FCNs) are a class of deep learning models without classification level used to identify local regions and pixel level categories in images [55]. The emergence of FCNs marks the beginning of the transition from convolutional neural networks (CNN) to fully connected layers.…”
Section: Fcn Model (mentioning)
confidence: 99%