Purpose
Vision transformer (ViT) detectors excel at processing natural images. However, when processing remote sensing images (RSIs), ViT methods generally exhibit inferior accuracy compared to approaches based on convolutional neural networks (CNNs). Recently, researchers have proposed various structural optimization strategies to enhance the performance of ViT detectors, but progress has been limited. We contend that the frequent scarcity of RSI samples is the primary cause of this problem and that model modifications alone cannot solve it.

Design/methodology/approach
To address this, we introduce a Faster R-CNN-based approach, termed QAGA-Net, which significantly enhances the performance of ViT detectors in RSI recognition. First, we propose a novel quantitative augmentation learning (QAL) strategy to address the sparse data distribution in RSIs. This strategy is integrated as the QAL module, a plug-and-play component active exclusively during the model's training phase. Second, we enhance the feature pyramid network (FPN) with two efficient modules: a global attention (GA) module to model long-range feature dependencies and enhance multi-scale information fusion, and an efficient pooling (EP) module to improve the model's ability to capture both high- and low-frequency information. Importantly, QAGA-Net has a compact model size and strikes a balance between computational efficiency and accuracy.

Findings
We verified the performance of QAGA-Net using two different efficient ViT models as the detector's backbone. Extensive experiments on the NWPU-10 and DIOR20 datasets demonstrate that QAGA-Net achieves superior accuracy compared to 23 other ViT- or CNN-based methods in the literature. Specifically, QAGA-Net improves mAP on the challenging DIOR20 dataset by 2.1% and 2.6% over the top-ranked CNN and ViT detectors, respectively.

Originality/value
This paper highlights the impact of sparse data distribution on ViT detection performance. To address this, we introduce a fundamentally data-driven approach: the QAL module. In addition, we introduce two efficient modules to enhance the performance of the FPN. More importantly, our strategy can readily be combined with other ViT detectors, as the proposed method requires no structural modifications to the ViT backbone.
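
For readers who want a concrete picture of the pipeline summarized above, the following is a minimal PyTorch-style sketch, not the authors' implementation: the module internals are illustrative assumptions (the names QALModule and GAEnhancedFPNLevel, the noise perturbation standing in for QAL, and the average-pooling stand-in for EP are all hypothetical), since the abstract does not specify these details.

```python
# Hypothetical sketch of the QAGA-Net wiring described in the abstract.
# All module internals are illustrative placeholders, not the paper's design.
import torch
import torch.nn as nn


class QALModule(nn.Module):
    """Plug-and-play augmentation module, active only during training (assumed form)."""

    def forward(self, x):
        if not self.training:                     # identity at inference time
            return x
        return x + 0.1 * torch.randn_like(x)      # placeholder perturbation


class GAEnhancedFPNLevel(nn.Module):
    """One FPN level refined by an assumed global-attention (GA) and pooling (EP) step."""

    def __init__(self, channels, num_heads=4):
        super().__init__()
        # GA: model long-range dependencies across all spatial positions
        self.global_attention = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # EP: placeholder for the efficient pooling module
        self.efficient_pool = nn.AvgPool2d(kernel_size=3, stride=1, padding=1)

    def forward(self, feat):                      # feat: (B, C, H, W)
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)  # (B, H*W, C) token sequence
        attended, _ = self.global_attention(tokens, tokens, tokens)
        feat = attended.transpose(1, 2).reshape(b, c, h, w)
        return self.efficient_pool(feat)


if __name__ == "__main__":
    qal, level = QALModule(), GAEnhancedFPNLevel(256)
    feat = torch.randn(2, 256, 32, 32)            # one backbone/FPN feature map
    out = level(qal(feat))                        # same shape: (2, 256, 32, 32)
    print(out.shape)
```

In a full detector, such blocks would sit between the ViT backbone and the Faster R-CNN detection head, with the QAL module bypassed at inference, consistent with its training-only, plug-and-play role described above.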