2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00080

Thinking Outside the Pool: Active Training Image Creation for Relative Attributes

Abstract: Current wisdom suggests more labeled image data is always better, and obtaining labels is the bottleneck. Yet curating a pool of sufficiently diverse and informative images is itself a challenge. In particular, training image curation is problematic for fine-grained attributes, where the subtle visual differences of interest may be rare within traditional image sources. We propose an active image generation approach to address this issue. The main idea is to jointly learn the attribute ranking task while also …
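To give a sense of what "active image generation" for a relative-attribute ranker can look like, here is a generic, hypothetical toy sketch in PyTorch: a simple generator synthesizes candidate image pairs at chosen attribute strengths, the pairs the current ranker is least certain about are selected, labeled by an oracle, and used for retraining. All names (`ToyGenerator`, `ToyRanker`, `pairwise_loss`) and design choices are illustrative assumptions; the abstract above is truncated, and this is not the authors' actual procedure.

```python
# Generic, hypothetical sketch of "active training image creation" for a
# relative-attribute ranker.  NOT the procedure of the paper above -- just an
# illustration of the general idea: synthesize pairs, pick uncertain ones,
# label them, retrain.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyGenerator(nn.Module):
    """Maps (attribute strength, noise) to a tiny fake 'image' vector."""
    def __init__(self, img_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1 + 8, 32), nn.ReLU(), nn.Linear(32, img_dim))

    def forward(self, strength, noise):
        return self.net(torch.cat([strength, noise], dim=1))


class ToyRanker(nn.Module):
    """Maps an 'image' vector to a scalar attribute score."""
    def __init__(self, img_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(img_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):
        return self.net(x).squeeze(1)


def pairwise_loss(score_a, score_b, target):
    # RankNet-style loss on the score difference
    return F.binary_cross_entropy_with_logits(score_a - score_b, target)


gen, ranker = ToyGenerator(), ToyRanker()
opt = torch.optim.Adam(ranker.parameters(), lr=1e-3)

for round_idx in range(3):
    # 1) Sample many candidate pairs from the generator at chosen attribute strengths.
    strength = torch.rand(256, 2)                 # latent strengths for images (a, b)
    noise = torch.randn(256, 2, 8)
    img_a = gen(strength[:, :1], noise[:, 0])
    img_b = gen(strength[:, 1:], noise[:, 1])
    # 2) Keep the pairs the current ranker is least certain about (smallest margin).
    with torch.no_grad():
        margin = (ranker(img_a) - ranker(img_b)).abs()
    idx = margin.argsort()[:32]
    # 3) "Annotate" the selected pairs (the latent strengths stand in for a human oracle).
    target = (strength[idx, 0] > strength[idx, 1]).float()
    # 4) Train the ranker on the newly created pairs.
    loss = pairwise_loss(ranker(img_a[idx]), ranker(img_b[idx]), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(round_idx, float(loss))
```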

Cited by 17 publications (7 citation statements) | References 59 publications

Citation statements (ordered by relevance):
“…Interactive image search aims to incorporate user feedback as an interactive signal to navigate the visual search. In general, the user interaction can be given in various formats, including relative attribute [45,28,75], attribute [79,18,2], attribute-like modification text [66], natural language [16,17], spatial layout [37], and sketch [76,74,14]. As text is the most pervasive form of interaction between humans and computers in contemporary search engines, it naturally serves to convey concrete information that elaborates the user's intricate specifications for image search.…”
Section: Related Work
confidence: 99%
“…to refine or discover image items retrieved by the system [52,81,70,11,45,28,27,18,79,2,39,16,48,75,17]. Most of these interactions are delivered in the form of text, describing certain attributes [18,79,2] or relative attributes [45,28,75] to refine or modify a reference image. More recently, natural language feedback [17] has been introduced as a more flexible way to convey users' intentions for interactive image search.…”
Section: Introduction
confidence: 99%
“…After their work was published, an SVM-based method was developed that focuses more on the local similarity of images [27]. Beyond these successes, network-based methods [23], [24], [28], [29] have performed better for learning relative attributes, especially Siamese-structured networks, which use a pair of images as inputs. Souri et al. [23] developed a new Siamese-structured network that consists of convolutional feature extraction layers and ranking layers.…”
Section: Estimation of Portrait Attributes
confidence: 99%
“…Because of its simple network structure and its high accuracy based on the RankNet algorithm [30], their method is frequently referenced as a baseline. Yu and Grauman [28], [29] attempted to increase the training data by generating synthetic images based on attributes and to learn the discrimination using local similarity, in the same manner as the method of [27]. In another approach, Meng et al. [22] performed multi-task learning using a graph-based neural network.…”
Section: Estimation of Portrait Attributes
confidence: 99%
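To make the Siamese ranking setup described in the two statements above concrete, here is a minimal PyTorch sketch: a shared backbone maps each image of a pair to a scalar attribute score, and a RankNet-style cross-entropy on the score difference supervises which image exhibits the attribute more strongly. The module name `AttributeRanker`, layer sizes, and training snippet are illustrative assumptions, not the code of [23], [28], [29], or the RankNet paper [30].

```python
# Illustrative sketch (hypothetical names and layer sizes), not the cited papers' code:
# a Siamese relative-attribute ranker with a RankNet-style pairwise loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttributeRanker(nn.Module):
    """Shared backbone that maps an image to a scalar attribute score."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(                    # small conv feature extractor
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.score = nn.Linear(32, 1)                      # ranking layer -> scalar score

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.score(h).squeeze(1)


def ranknet_loss(score_a, score_b, target):
    """RankNet pairwise loss: P(a > b) = sigmoid(s_a - s_b).

    `target` is 1.0 if image a exhibits the attribute more than image b,
    0.0 if less, and 0.5 if the pair is labeled "equal".
    """
    return F.binary_cross_entropy_with_logits(score_a - score_b, target)


if __name__ == "__main__":
    model = AttributeRanker()
    img_a, img_b = torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64)
    target = torch.full((8,), 1.0)                         # "a is more <attribute> than b"
    loss = ranknet_loss(model(img_a), model(img_b), target)
    loss.backward()
    print(float(loss))
```

In the cited works the backbone is a full convolutional network and, per [28], [29], some training pairs are synthetically generated; the sketch only shows the pairwise ranking mechanics shared across them.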
“…The crowd counting model is computed by a joint regularization that learns the intrinsic (geometric) distribution structure of crowd patterns and imposes temporal smoothness on the scene's activity patterns. The shared attributes selected through the semantic relations [28] of visual attributes are used in zero-shot learning [29] and image retrieval [26]. In many learning tasks, the structure of the learning target space holds rich relationship information, namely the topological relations among the outputs of the learned function at different input data points.…”
Section: Semi-supervised Learning for Attribute Recognition
confidence: 99%