2023
DOI: 10.3390/app13127115

EDET: Entity Descriptor Encoder of Transformer for Multi-Modal Knowledge Graph in Scene Parsing

Abstract: In scene parsing, the model is required to be able to process complex multi-modal data such as images and contexts in real scenes, and discover their implicit connections from objects existing in the scene. As a storage method that contains entity information and the relationship between entities, a knowledge graph can well express objects and the semantic relationship between objects in the scene. In this paper, a new multi-phase process was proposed to solve scene parsing tasks; first, a knowledge graph was …

Citations: cited by 1 publication (1 citation statement)
References: 25 publications
“…However, the region features a lack of relationship information. Some researchers [21][22][23] adopt the scene graph [24] to express objects and the semantic relationship between objects. Scene graphs bring richer information, but richer information contains more redundancy information, and the scene graph must be processed by graph convolution.…”
Section: Image Captioning (mentioning)
confidence: 99%