2023
DOI: 10.1109/jstars.2023.3261361

Visual Question Generation From Remote Sensing Images

Abstract: Visual question generation (VQG) is a fundamental task in vision-language understanding that aims to generate relevant questions about the given input image. In this paper, we propose a paragraph-based VQG approach for generating intelligent questions in natural language about remote sensing (RS) images. Specifically, our proposed framework consists of two transformer-based vision and language models. First, we employ a swin-transformer encoder to generate a multi-scale representative visual feature from the i…

citations
Cited by 12 publications
(6 citation statements)
references
References 52 publications
“…From what we know, ref. [26] is the only work in the literature exploring generative-based visual question generation for RS images. In [26] the authors propose a transformer-based architecture to generate different plausible questions for RS images.…”
Section: RS Visual Question Generation (RSVQG); type: mentioning
confidence: 99%
“…[26] is the only work in the literature exploring generative-based visual question generation for RS images. In [26] the authors propose a transformer-based architecture to generate different plausible questions for RS images. They use questions from the dataset proposed in [15] to train the model in a fully supervised fashion.…”
Section: RS Visual Question Generation (RSVQG); type: mentioning
confidence: 99%
“…Since ChatGPT was released, the scientific community has been using it, and articles have been published on it [107] on several topics [106, 110-120], as well as ethical issues that have arisen very recently [121-126]. To date, there are few studies that demonstrate the usefulness of GPT or derived tools (e.g., Visual ChatGPT) in the field of RS and satellite image classification [127-130]. A useful tool made available by the world community via the web (e.g., GitHub or several Google extensions) is the possibility of being able to use prompts (i.e., texts explaining to ChatGPT what to do) that are already pre-compiled so as to (i) save time and (ii) prevent the system from being trained wrongly or giving wrong answers.…”
Section: OpenAI ChatGPT-3; type: mentioning
confidence: 99%
“…Yuan et al [232] introduce a self-paced curriculum learning approach for VQA in remote sensing. Models like BERT [231,235], CLIP [233], and GPT [237] have been widely applied, and addressing the open-set problem of VQA in remote sensing is also a focus of research [234]. The VQA task related to change detection is an emerging research direction [236].…”
Section: Understanding Task; type: mentioning
confidence: 99%