U-Net Ensemble for Semantic and Height Estimation Using Coarse-Map Initialization

Kunwar, Saket

doi:10.1109/igarss.2019.8899861

Cited by 14 publications

(10 citation statements)

References 4 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…(2) To improve prediction of taller building heights, we propose a novel strategy for fast augmentations to synthetically increase the heights of objects by inverting geocentric pose vector fields. (3) We outperform state of the art for height prediction [38,29,27,19,43] and geocentric pose [6] and demonstrate accurate predictions even for orthorectified images that violate our affine assumptions. (4) We present the first demonstration of supervising this task without lidar, using only geometry derived from images that can be produced anywhere on Earth.…”

Section: Introductionmentioning

confidence: 87%

“…Other features such as trees and buildings have known distributions of physically plausible heights. Kunwar [19] and Zheng et al [43] leveraged semantic cues as priors for height prediction to win the 2019 Data Fusion Contest (DFC19) singleview semantic 3D challenge track [20]. Srivistava et al [38] proposed to learn semantics and height jointly with a multi-task deep network.…”

Section: Monocular Height Predictionmentioning

confidence: 99%

“…For fair compari- son with [27,29,38,6], we retrained and tested our models and [6] using the custom DFC19 train and test sets used in [27]. For comparison with [19,43,6], we tested against the DFC19 test set used for [20]. Results are reported in Tab.…”

Section: State Of the Art For Geocentric Posementioning

confidence: 99%

See 2 more Smart Citations

Single View Geocentric Pose in the Wild

Christie¹,

Foster²,

Hagstrom³

et al. 2021

Preprint

View full text Add to dashboard Cite

show abstract

Section: Introductionmentioning

confidence: 87%

Section: Monocular Height Predictionmentioning

confidence: 99%

See 1 more Smart Citation

Single View Geocentric Pose in the Wild

Christie¹,

Foster²,

Hagstrom³

et al. 2021

Preprint

View full text Add to dashboard Cite

show abstract

“…Besides, Conditional generative adversarial network (cGAN) [15] was proposed to frame height estimation as an image translation task. Kunwar et al [16] exploited semantic labels as priors to enhance the performance of height estimation on the large-scale Urban Semantic 3D (US3D) dataset [17]. Xiong et al [18] designed and constructed a large-scale benchmark dataset for cross-dataset transfer learning on the height estimation task, which includes a large-scale synthetic dataset and several real-world datasets.…”

Section: Monocular Height Estimationmentioning

confidence: 99%

“…Objects of different semantic types usually have distinct height attributes. Thus, the geometric information in height maps should have high correlation with the semantic information [16]. Inspired by this, we choose to examine the learned interior activations to find human-understandable representations.…”

Section: Unit-level Interpretation Of Mhe Modelsmentioning

confidence: 99%

Disentangled Latent Transformer for Interpretable Monocular Height Estimation

Xiong¹,

Chen²,

Shi³

et al. 2022

Preprint

View full text Add to dashboard Cite

Monocular height estimation (MHE) from remote sensing imagery has high potential in generating 3D city models efficiently for a quick response to natural disasters. Most existing works pursue higher performance. However, there is little research exploring the interpretability of MHE networks. In this paper, we target at exploring how deep neural networks predict height from a single monocular image. Towards a comprehensive understanding of MHE networks, we propose to interpret them from multiple levels: 1) Neurons: unit-level dissection. Exploring the semantic and height selectivity of the learned internal deep representations; 2) Instances: object-level interpretation. Studying the effects of different semantic classes, scales and spatial contexts on height estimation; 3) Attribution: pixel-level analysis. Understanding which input pixels are important for the height estimation. Based on the multi-level interpretation, a disentangled latent Transformer network is proposed towards a more compact, reliable and explainable deep model for monocular height estimation. Furthermore, a novel unsupervised semantic segmentation task based on height estimation is first introduced in this work. Additionally, we also construct a new dataset for joint semantic segmentation and height estimation. Our work provides novel insights for both understanding and designing MHE models. The dataset and code are publicly available at https://github.com/ShadowXZT/DLT-Height-Estimation.pytorch.

show abstract