2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr46437.2021.00101
Learning Graph Embeddings for Compositional Zero-shot Learning

Abstract: In compositional zero-shot learning, the goal is to recognize unseen compositions (e.g. old dog) of visual primitives observed in the training set: states (e.g. old, cute) and objects (e.g. car, dog). This is challenging because, for example, the same state can alter the visual appearance of a dog drastically differently from that of a car. As a solution, we propose a novel graph formulation called Compositional Graph Embedding (CGE) that jointly learns image features, compositional classifiers, and latent representations of visu…
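The abstract describes propagating information over a graph that connects states, objects, and their compositions, with composition nodes acting as classifiers for image features. A minimal sketch of that idea with one graph-convolution layer is below; the toy vocabulary, embedding sizes, and random inputs are illustrative assumptions, not the paper's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary: 2 states, 2 objects, and their 4 compositions.
states = ["old", "cute"]
objects = ["dog", "car"]
compositions = [(s, o) for s in states for o in objects]
n, d = len(states) + len(objects) + len(compositions), 8  # 8 nodes, dim 8

# Adjacency: each composition node connects to its state and object nodes.
A = np.eye(n)  # self-loops
for i, (s, o) in enumerate(compositions):
    c = len(states) + len(objects) + i
    for p in (states.index(s), len(states) + objects.index(o)):
        A[c, p] = A[p, c] = 1.0

# Symmetrically normalized adjacency, as in a standard GCN layer.
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

# One GCN layer: propagate word-embedding-like node inputs through the graph.
X = rng.standard_normal((n, d))        # stand-in for word embeddings
W = rng.standard_normal((d, d)) * 0.1  # learnable weight (random here)
H = np.maximum(A_hat @ X @ W, 0.0)     # ReLU(A_hat X W)

# Composition node embeddings act as classifiers for image features:
img = rng.standard_normal(d)           # stand-in for a CNN image feature
scores = H[len(states) + len(objects):] @ img  # one score per composition
pred = compositions[int(np.argmax(scores))]
print(pred)  # predicted (state, object) pair for the toy image feature
```

In the full method these node embeddings and the image backbone would be trained end-to-end so that an image's feature scores highest against its correct composition node.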


Cited by 102 publications (71 citation statements)
References 45 publications
“…MIT-States instead contains images collected through an older search engine with limited human annotation, leading to significant label noise [35]. To address the limitations of these two datasets, in our previous work [5] we introduced a split built on top of the Stanford GQA dataset [59], i.e. the Compositional GQA (C-GQA) dataset.…”
Section: Methods
confidence: 99%
“…We compare with four state-of-the-art methods: Attribute as Operators (AOP) [2], which treats objects as vectors and states as matrices that modify them; LabelEmbed+ (LE+) [1], [2], which trains a classifier merging state and object embeddings with an MLP; Task-Modular Neural Networks (TMN) [3], which modify the classifier through a gating function receiving the queried state-object composition as input; and SymNet [4], which learns object embeddings exhibiting symmetry under different state-based transformations. We also compare Co-CGE with our previous works, CGE [5] and CompCos [10]. We train each model with its default hyperparameters, reporting the closed- and open-world results of the models with the best AUC on the validation set.…”
Section: Methods
confidence: 99%
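Among the baselines quoted above, AOP's "objects as vectors, states as matrices" formulation is the most self-contained to illustrate. A toy sketch follows; the embedding dimension, the random matrices/vectors, and the synthetic "image" feature are all illustrative assumptions rather than the trained models compared in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6  # illustrative embedding dimension

# Attribute-as-Operator idea: objects are vectors, states are linear operators.
object_vecs = {"dog": rng.standard_normal(d), "car": rng.standard_normal(d)}
state_mats = {"old": rng.standard_normal((d, d)),
              "cute": rng.standard_normal((d, d))}

def compose(state, obj):
    """A state acts as a matrix transforming the object vector."""
    return state_mats[state] @ object_vecs[obj]

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Score a (noisy, synthetic) image feature against every composed embedding.
img = compose("old", "dog") + 0.05 * rng.standard_normal(d)
scores = {(s, o): cos(img, compose(s, o))
          for s in state_mats for o in object_vecs}
best = max(scores, key=scores.get)
```

The same scoring loop works for unseen compositions, since any state matrix can be applied to any object vector; that compositional generalization is the point of the formulation.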