2022
DOI: 10.1109/tip.2022.3162964
Deep Hierarchical Vision Transformer for Hyperspectral and LiDAR Data Classification

Cited by 106 publications (17 citation statements)
References 45 publications
“…Xue et al in [37] proposed deep hierarchical ViT (DHViT) for the hyperspectral and light detection and ranging (LiDAR) data classification. The authors used the Trento, Houston 2013, and Houston 2018 datasets and obtained accuracies of 99.58%, 99.55%, and 96.40%, respectively.…”
Section: ViTs for Image Classification (mentioning)
confidence: 99%
“…The crux of multi-source data lies in devising strategies for the collaborative processing of multi-source features, enabling a harmonious representation across disparate datasets. Xue et al. proposed the network architecture DHViT [13], which combines hyperspectral images and LiDAR data and introduces a cross-attention feature fusion module to enhance collaborative classification performance. Additionally, MIFNet [14] presented a global dependence fusion module (GDFM) for feature fusion across diverse data sources.…”
Section: Introduction (mentioning)
confidence: 99%
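The cross-attention fusion idea mentioned above can be illustrated with a minimal sketch: queries come from one modality's token sequence (here, hyperspectral features) and attend over the keys/values of the other modality (LiDAR features), so each hyperspectral token is re-expressed as a weighted mix of LiDAR information. This is a generic single-head, NumPy-only illustration of scaled dot-product cross-attention, not the actual DHViT [13] implementation; the token counts, dimension, and function names are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(hsi_tokens, lidar_tokens):
    """Single-head cross-attention sketch: queries from the hyperspectral
    branch attend over keys/values from the LiDAR branch (both assumed to
    share the same feature dimension; real models add learned projections)."""
    d = hsi_tokens.shape[-1]
    scores = hsi_tokens @ lidar_tokens.T / np.sqrt(d)  # (n_hsi, n_lidar)
    weights = softmax(scores, axis=-1)                 # rows sum to 1
    return weights @ lidar_tokens                      # (n_hsi, d)

rng = np.random.default_rng(0)
hsi = rng.normal(size=(5, 16))    # 5 hyperspectral tokens, dim 16 (assumed sizes)
lidar = rng.normal(size=(7, 16))  # 7 LiDAR tokens, dim 16
fused = cross_attention(hsi, lidar)
print(fused.shape)  # (5, 16): one fused vector per hyperspectral token
```

In a full model the fused tokens would typically be combined with the original hyperspectral features (e.g. a residual connection) before classification; this sketch only shows the attention step itself.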
“…Subsequently, to further exploit high-level semantic features, convolutional neural networks (CNNs) were employed for multisource data classification [13]. Encoder-decoder networks [14], coupled CNNs [15], Gabor CNNs [16], cross attention [17], and Transformers [18] have been used to extract representative multisource features, and these methods have achieved promising performance.…”
Section: Introduction (mentioning)
confidence: 99%