2020
DOI: 10.1109/jstars.2020.3013818

Toward Remote Sensing Image Retrieval Under a Deep Image Captioning Perspective

Abstract: The performance of remote sensing image retrieval (RSIR) systems depends on the capability of the extracted features in characterizing the semantic content of images. Existing RSIR systems describe images by visual descriptors that model the primitives (such as different land-cover classes) present in the images. However, the visual descriptors may not be sufficient to describe the high-level complex content of RS images (e.g., attributes and relationships among different land-cover classes). To address this i…

Cited by 58 publications (21 citation statements)
References 41 publications
“…Although the retrieval method of generating caption solves human resources annotation, it may still leave some retrieval shortcomings. On the one hand, two-stage retrieval mode makes it difficult to avoid the loss of abundant information in the middle stage [13]. On the other hand, coarse caption generated by machine may not be used as a good representation of the RS image [14].…”
mentioning
confidence: 99%
“…The second module is devoted to semantic analysis. The semantic analysis module's primary function is to provide a higher-level knowledge of the given scenery and to annotate the given image with various class labels [14,15]. The following module is for intelligent feature extraction.…”
Section: State Of the Art
mentioning
confidence: 99%
“…This method is capable of generating proper captions, but it is not able to create semantically correct and image-specific captions. In contrast to this, the novel image caption generation approach produces image caption from the visual contents using language models and then examines the visual content of an image [6]. In comparison with the above two classes, novel caption generation could create proper captions for the provided image, which is semantically correct when compared to earlier methods [7].…”
Section: Introduction
mentioning
confidence: 99%