2023
DOI: 10.3390/electronics12071547

DFEN: Dual Feature Enhancement Network for Remote Sensing Image Caption

Abstract: Remote sensing image captioning describes ground objects and the semantic relationships between them. Existing remote sensing image captioning algorithms do not extract enough ground object information from remote sensing images, which results in inaccurate captions. This paper therefore proposes a codec-based Dual Feature Enhancement Network (“DFEN”) that enhances ground object information at both the image and text levels. We build the Image-Enhancement module at the image level using the multi…

Cited by 7 publications (1 citation statement)
References 27 publications
“…Phrase comprehension (PC) is a fundamental task in the multi-modal learning community and serves as the basis for many downstream tasks, including image captioning [1,2], visual question answering [3,4], etc. The purpose of PC is to locate a specific entity in an image according to a given linguistic query.…”
Section: Introduction (mentioning)
confidence: 99%