2023
DOI: 10.1109/access.2022.3232508

Image Caption Generation Using Contextual Information Fusion With Bi-LSTM-s

Abstract: The image caption generation algorithm necessitates the expression of image content using accurate natural language. Given the existing encoder-decoder algorithm structure, the decoder solely generates words one by one in a front-to-back order and is unable to analyze integral contextual information. This paper employs a Bi-LSTM (Bi-directional Long Short-Term Memory) structure, which not only draws on past information but also captures subsequent information, resulting in the prediction of image content subje…
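
The bidirectional idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the image encoder and the caption decoder are omitted, the weights are random, and all names (`make_lstm_params`, `lstm_step`, `run_lstm`, `run_bilstm`) are hypothetical. It only shows how a forward pass and a backward pass over the same sequence are concatenated so that every position sees both earlier and later inputs.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm_params(input_size, hidden_size, rng):
    """Random weights and zero biases for the four LSTM gates (illustrative only)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {name: (matrix(hidden_size, input_size + hidden_size), [0.0] * hidden_size)
            for name in ("i", "f", "g", "o")}


def lstm_step(params, x, h, c):
    """One standard LSTM step: returns the new hidden state and cell state."""
    z = x + h  # concatenate the input vector with the previous hidden state

    def gate(name, activation):
        weights, biases = params[name]
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(weights, biases)]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    g = gate("g", math.tanh)  # candidate cell values
    o = gate("o", sigmoid)    # output gate
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, sequence, hidden_size):
    """Run a unidirectional LSTM and collect the hidden state at every step."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    states = []
    for x in sequence:
        h, c = lstm_step(params, x, h, c)
        states.append(h)
    return states


def run_bilstm(fwd_params, bwd_params, sequence, hidden_size):
    """Concatenate forward states with backward states realigned to input order."""
    forward = run_lstm(fwd_params, sequence, hidden_size)
    backward = run_lstm(bwd_params, sequence[::-1], hidden_size)[::-1]
    return [f + b for f, b in zip(forward, backward)]


# Toy data: 4 time steps of 3-dimensional inputs (assumed sizes).
rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE = 3, 2
fwd = make_lstm_params(INPUT_SIZE, HIDDEN_SIZE, rng)
bwd = make_lstm_params(INPUT_SIZE, HIDDEN_SIZE, rng)
sequence = [[rng.uniform(-1, 1) for _ in range(INPUT_SIZE)] for _ in range(4)]
states = run_bilstm(fwd, bwd, sequence, HIDDEN_SIZE)
```

In this sketch, the first `HIDDEN_SIZE` entries of each state depend only on inputs up to that position, while the remaining entries depend on that position and everything after it. A unidirectional decoder has access only to the first half.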

Citations: Cited by 20 publications (5 citation statements)
References: References 44 publications
“…Recognizing aerial images is an indispensable application in deep neural networks [37,38,39,40,41]. We proposed a novel LR aerial photo categorization pipeline, wherein deep perceptual features are extracted and refined by propagating the prior knowledge of HR aerial photos into LR ones.…”
Section: Discussion (mentioning)
confidence: 99%
“…Recognizing aerial images is an indispensable application in remote sensing [21][22][23][24][25]. We proposed a novel cross-resolution-enhanced high-resolution aerial photo categorization pipeline, wherein deep perceptual features are extracted and refined by propagating the prior knowledge of low-resolution aerial photos into high-resolution ones.…”
Section: Discussion (mentioning)
confidence: 99%
“…Traditional LSTM models, however, are only capable of capturing sequential information in one direction. In music, the selection of each musical note depends not only on preceding musical segments but also exhibits significant associations with subsequent musical segments [10]. Consequently, relying solely on unidirectional LSTM proves challenging in generating high-quality music compositions.…”
Section: Bi-LSTM (mentioning)
confidence: 99%