2021
DOI: 10.1007/978-981-16-2543-5_10
|View full text |Cite
|
Sign up to set email alerts
|

Image Caption Generation Using Neural Network Models and LSTM Hierarchical Structure

Abstract: The caption generation is nothing but the generation of textual information from images. For this, objects from images are extracted and classified among predefined classes. The logical objects from the image are extracted and transformed into natural sentences. The recognizing process requires an iterative task that incorporates image recognition as well as machine vision. The process must define relations among objects, persons, and animals and create the textual description of these relations. The paper is … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1

Citation Types

0
1
0

Year Published

2022
2022
2023
2023

Publication Types

Select...
4

Relationship

0
4

Authors

Journals

citations
Cited by 4 publications
(1 citation statement)
references
References 15 publications
0
1
0
Order By: Relevance
“…It needs to contain as many scenes as possible, preferably constructed by computer vision experts. Finally, applying the fine‐grained scene graph generation model to other computer vision and multi‐media tasks [WS22, DJX*21], such as content‐based image search, image captioning, visual question answering, and multi‐modal knowledge graph construction. The finegrained scene graph generation model can provide a better representation for these scene understanding‐related tasks and can significantly improve the model performance of these tasks.…”
Section: Discussionmentioning
confidence: 99%
“…It needs to contain as many scenes as possible, preferably constructed by computer vision experts. Finally, applying the fine‐grained scene graph generation model to other computer vision and multi‐media tasks [WS22, DJX*21], such as content‐based image search, image captioning, visual question answering, and multi‐modal knowledge graph construction. The finegrained scene graph generation model can provide a better representation for these scene understanding‐related tasks and can significantly improve the model performance of these tasks.…”
Section: Discussionmentioning
confidence: 99%