IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium 2019
DOI: 10.1109/igarss.2019.8900503

Multi-Scale Cropping Mechanism for Remote Sensing Image Captioning

Citations: cited by 38 publications (12 citation statements)
References: 8 publications
“…Besides tailored attention mechanisms, other previous studies on remote sensing image captioning have explored alternative paths for improving the results of standard encoder-decoder neural models. These include studies (a) exploring multi-scale feature representations [15], [16]; (b) using novel loss functions [17] or training procedures based on reinforcement learning [18], improving on the standard cross-entropy loss [17]; (c) extending and combining the set of reference captions, associated to each image, through summarization [19] or retrieval [20] approaches; or (d) using decoder components based on the Transformer architecture [18].…”
Section: Related Work (mentioning)
confidence: 99%
“…Also, because the training labels which are texts are different from the features obtained from the images, language model techniques are required to analyze the form, meaning, and context of a sequence of words. This becomes even more complex as keywords are required to be identified for emphasizing the action or scene being described [117].…”
Section: Image Captioning (mentioning)
confidence: 99%
“…Recently, to realize the end-to-end identification of remote sensing objects and their spatial relationships, a multi-scale remote sensing image interpretation network based on a FCN, a U-Net and a LSTM is proposed [4]. In addition, Zhang et al [49] propose a training mechanism of multi-scale cropping for remote sensing image captioning based on an encoder-decoder model to improve the generalization of feature representation. Although image captioning has been applied to the semantic understanding of remote sensing images in recent years, there are still many problems to be solved.…”
Section: Remote Sensing Image Captioning (mentioning)
confidence: 99%