Proceedings of the 1st Workshop on Multilingual Representation Learning 2021
DOI: 10.18653/v1/2021.mrl-1.13
VisualSem: a high-quality knowledge graph for vision and language

Abstract: An exciting frontier in natural language understanding (NLU) and generation (NLG) calls for (vision-and-) language models that can efficiently access external structured knowledge repositories. However, many existing knowledge bases only cover limited domains, or suffer from noisy data, and most of all are typically hard to integrate into neural language pipelines. To fill this gap, we release VisualSem: a high-quality knowledge graph (KG) which includes nodes with multilingual glosses, multiple illustrative …
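A VisualSem node as described above bundles multilingual glosses with illustrative images. The sketch below shows one plausible in-memory representation; the class name, field names, node identifier, and example values are illustrative assumptions, not the released data format:

```python
from dataclasses import dataclass, field

@dataclass
class KGNode:
    """Hypothetical container for a multimodal KG node:
    glosses keyed by language code, plus illustrative images."""
    node_id: str
    glosses: dict = field(default_factory=dict)   # language code -> list of glosses
    image_paths: list = field(default_factory=list)  # paths to illustrative images

# Toy node with a placeholder identifier (not a real VisualSem ID).
node = KGNode("node_0001")
node.glosses["en"] = ["A domesticated carnivorous mammal."]
node.image_paths.append("images/example_001.jpg")
print(len(node.glosses["en"]))  # → 1
```

Keeping glosses keyed by language code makes it straightforward to retrieve node descriptions in whichever language a downstream multilingual model needs.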

Cited by 19 publications (6 citation statements); references 31 publications.
“…Besides, by merging the relations with similar semantics into the unified commonsense relation set, these relations have a large semantic distance from each other and are stable for the task of visual explicit relational inference. Moreover, commonsense KGs are also widely applied for cross-modal structured knowledge retrieval in multi-modal tasks, for example, SGG [40], Visual Commonsense Reasoning [41], Image Captioning [42], and Multimodal Retrieval [43].…”
Section: Knowledge Graph
confidence: 99%
“…We plan to explicitly model the link between named entities and the images that depict them, treating both as nodes of the knowledge graph, for example by exploiting the structured data of Wikidata and Wikimedia Commons. A few works explore similar approaches, but they are limited to small knowledge graphs and to an intrinsic evaluation of the representations through completion of the same graph (Xie et al., 2017; Pezeshkpour et al., 2018; Wilcke et al., 2020; Alberts et al., 2021).…”
Section: Links Between Entities
“…We use the multimodal knowledge graph VisualSem (Alberts et al., 2020) to perform text entity extraction and grounding to the KG for the target complex word. For entity extraction, CLIP textual embeddings (Radford et al., 2021) were used as defined in the original paper.…”
Section: Knowledge Graph
confidence: 99%
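The grounding step quoted above amounts to nearest-neighbor retrieval in a shared text-embedding space. The sketch below shows that retrieval with cosine similarity; random vectors stand in for real CLIP text embeddings, and the function name, dimensionality, and toy data are illustrative assumptions rather than the cited paper's implementation:

```python
import numpy as np

def ground_to_kg(query_vec, node_vecs):
    """Return the index of the KG node whose embedding has the highest
    cosine similarity with the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    n = node_vecs / np.linalg.norm(node_vecs, axis=1, keepdims=True)
    return int(np.argmax(n @ q))

# Toy stand-ins for CLIP text embeddings of KG node glosses and a query word.
rng = np.random.default_rng(0)
nodes = rng.normal(size=(5, 8))
query = nodes[2] + 0.01 * rng.normal(size=8)  # query lies near node 2
print(ground_to_kg(query, nodes))  # → 2
```

In practice the query and node vectors would come from the same CLIP text encoder, so their cosine similarities are directly comparable.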