2022
DOI: 10.1109/access.2022.3151874
|View full text |Cite
|
Sign up to set email alerts
|

Using Neural Encoder-Decoder Models With Continuous Outputs for Remote Sensing Image Captioning

Abstract: This research was supported through COST Action Multi3Generation Ref. CA18231, and also through Fundação para a Ciência e Tecnologia (FCT), namely through the FCT project grant with reference PTDC/CCI-CIF/32607/2017 (MIMU) and through the Ph.D. scholarship with reference 2020.06106.BD, as well as through the INESC-ID multi-annual funding from the PIDDAC programme with reference UIDB/50021/2020. We also gratefully acknowledge the support of NVIDIA Corporation, with the donation of the two Titan Xp GPUs used in … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1

Citation Types

0
4
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
6
2

Relationship

1
7

Authors

Journals

citations
Cited by 17 publications
(4 citation statements)
references
References 36 publications
0
4
0
Order By: Relevance
“…Language integration in RS has showcased impressive capabilities across various tasks, including image captioning [2,[17][18][19][20][21][22][23][24][25][26][27][28], VQA [3,[29][30][31][32], and text-image retrieval [4]. A comprehensive review of NLP applications in RS can be found at [1].…”
Section: Nlp In Remote Sensingmentioning
confidence: 99%
See 1 more Smart Citation
“…Language integration in RS has showcased impressive capabilities across various tasks, including image captioning [2,[17][18][19][20][21][22][23][24][25][26][27][28], VQA [3,[29][30][31][32], and text-image retrieval [4]. A comprehensive review of NLP applications in RS can be found at [1].…”
Section: Nlp In Remote Sensingmentioning
confidence: 99%
“…Other advancements in image captioning include models that summarize multiple captions into one during training [21]. Ramos et al [22] used continuous word vector representations in the decoder instead of discrete representations. Hoxha et al [23] employed a decoder based on multiple Support Vector Machines (SVMs) to alleviate overfitting.…”
Section: Nlp In Remote Sensingmentioning
confidence: 99%
“…In terms of the other branch, we have that the use of subsymbolic AI approaches such as deep neural networks, to solve geospatial problems, is also a common component of GeoAI research. Although some existing deep learning architectures for tasks, such as image classification, image segmentation, question answering, modeling language and vision, or entity recognition, can be readily used for GeoAI tasks such as classification and object detection in remote sensing images (Bastani et al, 2022; Camps‐Valls et al, 2021), land use classification (Camps‐Valls et al, 2021), geographic question answering and question answering over Earth observation products (Coelho et al, 2021; Silva et al, 2022), remote sensing image captioning (Ramos & Martins, 2022), or place name recognition and resolution (Cardoso et al, 2021; Kulkarni et al, 2021; Liu et al, 2022), some unique challenges emerge which require special model designs, training objectives, or data pre‐processing techniques, for instance by incorporating spatial principles and spatial inductive biases. We call this kind of practices spatially explicit machine learning (Janowicz et al, 2020; Li et al, 2021; Mai, Janowicz, Yan, et al, 2020; Mai, Jiang, et al, 2022; Yan et al, 2017, 2019; Zhu, Janowicz, Cai, & Mai, 2022).…”
Section: Symbolic and Subsymbolic Geoaimentioning
confidence: 99%
“…An RSI captioning challenge has attracted a lot of attention [4]. The captioning work must be used for a variety of beneficial potential applications, including image retrieval [5][6]. More semantic details about an RSI may be available through the automatic caption generation.…”
Section: Introductionmentioning
confidence: 99%