2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00305
Layout-Graph Reasoning for Fashion Landmark Detection

Abstract: Detecting dense landmarks for diverse clothes, as a fundamental technique for clothes analysis, has attracted increasing research attention due to its huge application potential. However, due to the lack of modeling underlying semantic layout constraints among landmarks, prior works often detect ambiguous and structure-inconsistent landmarks of multiple overlapped clothes in one person. In this paper, we propose to seamlessly enforce structural layout relationships among landmarks on the intermediate represent…
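As a rough illustration of the idea in the abstract (not the authors' implementation; the module name, soft-clustering scheme and tensor shapes below are assumptions), a layout-graph style reasoning layer can be sketched as a map-to-node clustering, a graph convolution over layout nodes, and a node-to-map projection back onto the convolutional feature map:

# Minimal, illustrative sketch of a "layout-graph reasoning" style layer, based only on the
# high-level description above: cluster feature-map pixels into graph nodes, reason over the
# node graph with one graph-convolution step, then project the refined nodes back onto the
# feature map. Names, shapes and the soft-assignment scheme are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LayoutGraphReasoningLayer(nn.Module):
    def __init__(self, in_channels, num_nodes, node_dim):
        super().__init__()
        # Soft assignment of each pixel to one of `num_nodes` layout nodes (map -> node).
        self.assign = nn.Conv2d(in_channels, num_nodes, kernel_size=1)
        # Pixel features embedded into the node feature space.
        self.embed = nn.Conv2d(in_channels, node_dim, kernel_size=1)
        # Learned adjacency standing in for the clothes layout graph.
        self.adjacency = nn.Parameter(torch.eye(num_nodes))
        self.node_update = nn.Linear(node_dim, node_dim)
        # Project reasoned node features back to the original channel count (node -> map).
        self.back_proj = nn.Conv2d(node_dim, in_channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Soft cluster assignment: (B, K, H*W), each node's weights sum to 1 over pixels.
        q = self.assign(x).flatten(2).softmax(dim=-1)
        # Pixel embeddings: (B, H*W, D).
        z = self.embed(x).flatten(2).transpose(1, 2)
        # Graph nodes as assignment-weighted pixel features: (B, K, D).
        nodes = torch.bmm(q, z)
        # Graph reasoning: propagate node features along the normalized adjacency.
        adj = self.adjacency.softmax(dim=-1)
        nodes = F.relu(self.node_update(torch.einsum('kj,bjd->bkd', adj, nodes)))
        # Scatter reasoned node features back to pixels using the same soft assignment.
        z_out = torch.bmm(q.transpose(1, 2), nodes)          # (B, H*W, D)
        z_out = z_out.transpose(1, 2).reshape(b, -1, h, w)   # (B, D, H, W)
        # Residual enhancement of the input feature map.
        return x + self.back_proj(z_out)

In the paper the nodes follow a clothes layout hierarchy; here a freely learned adjacency stands in for it, which is the main simplification of this sketch.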

Cited by 40 publications (19 citation statements) | References 30 publications

Citation statements (ordered by relevance):
“…Recently, landmark detection on faces and bodies has attracted great attention from researchers, and methods in this area can be roughly divided into five branches: detection from global input to local regions [7,8], direct localization [9], auxiliary inputs [10,11,12], and feature refinement [13,14,15,16].…”
Section: Landmark Detection (mentioning)
Confidence: 99%
“…Class labels are auxiliary attributes that guide landmark localization on unlabeled data [10]; adversarial attacks on manipulated faces are used and the output landmarks are aggregated for robust landmark detection [11]; and style-aggregated images are generated using CycleGAN [12]. Feature maps or keypoints are aligned for occluded or missing objects: convolutional feature maps are enhanced using layout-graph reasoning layers [13], a low-rank learning module [14], or a fully convolutional network cascaded with a dilated one [15]; occluded keypoints are predicted using a graph encoder-decoder network [16] or corrected using geometric consistency between keypoint distributions [17].…”
Section: Landmark Detection (mentioning)
Confidence: 99%
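Both refinement styles quoted here (enhancing feature maps with graph layers, predicting occluded keypoints with a graph encoder-decoder) rest on the same primitive: per-landmark feature vectors treated as graph nodes and updated by message passing over a landmark adjacency. A minimal sketch of that primitive, with a made-up adjacency and sizes (none of it taken from the cited papers):

# Illustrative only: refine per-landmark feature vectors by propagating information along a
# predefined landmark-connectivity graph. The adjacency and dimensions are assumptions.
import torch
import torch.nn as nn


class LandmarkGraphRefiner(nn.Module):
    def __init__(self, num_landmarks, feat_dim, adjacency):
        super().__init__()
        # Row-normalized adjacency with self-loops: which landmarks exchange information.
        adj = adjacency + torch.eye(num_landmarks)
        self.register_buffer('adj', adj / adj.sum(dim=-1, keepdim=True))
        self.update = nn.Linear(feat_dim, feat_dim)

    def forward(self, node_feats):            # node_feats: (B, K, D)
        # One round of message passing: aggregate neighbors, transform, keep a residual.
        agg = torch.einsum('kj,bjd->bkd', self.adj, node_feats)
        return node_feats + torch.relu(self.update(agg))


# Example usage with 8 landmarks connected in a chain (purely hypothetical layout).
adj = torch.zeros(8, 8)
idx = torch.arange(7)
adj[idx, idx + 1] = 1
adj[idx + 1, idx] = 1
refiner = LandmarkGraphRefiner(8, feat_dim=64, adjacency=adj)
refined = refiner(torch.randn(2, 8, 64))      # -> (2, 8, 64)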
“…For example, DeepFashion2 [5] defined 25 landmarks for "short sleeve shirt" and 10 landmarks for "shorts". Recent advances [4,22,23,24] in fashion landmark detection relied heavily on large amounts of training data with manually annotated landmarks. For example, DeepFashion [4] first introduced this task, where an alignment network leveraged pseudo-labels and an auto-routing mechanism to extract features for different landmarks.…”
Section: Fashion Landmark Detection (mentioning)
Confidence: 99%
“…Relationships between objects in an image or a video have recently been studied in visual relationship reasoning, and graph convolutional networks [21]-[24] and conditional random fields may be the dominant modules for capturing this information. Specifically, parts of historical target exemplar regions form a spatial-temporal graph, and their features, together with contextual ones, are adaptively refined with a graph convolutional network for object tracking [21].…”
Section: B. Visual Relationship Reasoning (mentioning)
Confidence: 99%
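For context, the graph-convolution modules referenced in this passage typically build on the standard propagation rule of Kipf and Welling, H^(l+1) = σ( D̂^(−1/2) Â D̂^(−1/2) H^(l) W^(l) ), where Â = A + I is the adjacency with self-loops and D̂ its degree matrix; the cited trackers and reasoning modules differ mainly in how the adjacency A is constructed (spatial-temporal links between target parts, fully connected region graphs, or a clothes layout graph).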
“…A fully-connected graph is formed and reasoned over via graph convolutions to efficiently project global relationships between distant regions into an interaction space [23]. Layout-graph reasoning layers are designed and stacked to output fashion landmark heatmaps, and convolutional features are enhanced with operations including graph clustering and deconvolution [24]. A conditional random field is applied to jointly localize multiple landmarks by removing unimportant potentials and performing optimization [25].…”
Section: B. Visual Relationship Reasoning (mentioning)
Confidence: 99%
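To connect this description back to the sketch given after the abstract, a stack of such reasoning layers followed by an upsampling step and a heatmap head could look roughly like the following (reusing the illustrative LayoutGraphReasoningLayer defined there; the stack depth, channel counts and use of a transposed convolution are assumptions, not the cited architecture):

# Illustrative only: stack a few reasoning layers over backbone features and predict one
# heatmap per landmark, upsampled toward input resolution.
import torch
import torch.nn as nn


class LandmarkHeatmapHead(nn.Module):
    def __init__(self, in_channels=256, num_landmarks=8, num_layers=3):
        super().__init__()
        self.reasoning = nn.Sequential(*[
            LayoutGraphReasoningLayer(in_channels, num_nodes=9, node_dim=128)
            for _ in range(num_layers)
        ])
        # "Deconvolution" step: upsample the enhanced features, then predict heatmaps.
        self.upsample = nn.ConvTranspose2d(in_channels, in_channels // 2,
                                           kernel_size=4, stride=2, padding=1)
        self.to_heatmaps = nn.Conv2d(in_channels // 2, num_landmarks, kernel_size=1)

    def forward(self, feats):                  # feats: (B, C, H, W) backbone features
        x = self.reasoning(feats)
        x = torch.relu(self.upsample(x))       # (B, C/2, 2H, 2W)
        return self.to_heatmaps(x)             # (B, K, 2H, 2W), one heatmap per landmark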