Proceedings of the 29th ACM International Conference on Multimedia 2021
DOI: 10.1145/3474085.3475586

Graph Neural Networks for Knowledge Enhanced Visual Representation of Paintings

Abstract: We propose ArtSAGENet, a novel multimodal architecture that integrates Graph Neural Networks (GNNs) and Convolutional Neural Networks (CNNs), to jointly learn visual and semantic-based artistic representations. First, we illustrate the significant advantages of multi-task learning for fine art analysis and argue that it is conceptually a much more appropriate setting in the fine art domain than the single-task alternatives. We further demonstrate that several GNN architectures can outperform strong CNN baselin…

Citations: cited by 9 publications (4 citation statements)
References: 52 publications (53 reference statements)
“…Their method relies on generating synthetic images generated with Stable Diffusion and computing CLIP [24] image embeddings to obtain content and style embeddings of paintings in content and style spaces. Efthymiou et al [8] proposed the multimodal architecture, which consists of Graph Neural Networks and Convolutional Neural Networks. These networks were jointly trained on visual and semantic artistic representations.…”
Section: Art Retrieval (citation type: mentioning)
confidence: 99%
“…The usability of feature vectors learned in the context of image classification to serve as descriptors for image retrieval has already been investigated [28,[30][31][32]. Even leveraging the softmax layer activations for image retrieval seems to be possible [33].…”
Section: Auxiliary Losses (citation type: mentioning)
confidence: 99%
“…In this sense, we find revealing and academically documented works such as Graph Neural Networks for Knowledge Enhanced Visual Representation of Paintings (Efthymiou et al, 2021), an example of multimodal architecture that integrates Graph Neural Networks (GNNs) and Convolutional Neural Networks (CNNs), to weave a framework of meanings between visual and semantic artistic…”
Section: Introduction (citation type: mentioning)
confidence: 98%