2018
DOI: 10.1109/tip.2018.2839522

Object-Location-Aware Hashing for Multi-Label Image Retrieval via Automatic Mask Learning

Abstract: Learning-based hashing is a leading approach of approximate nearest neighbor search for large-scale image retrieval. In this paper, we develop a deep supervised hashing method for multi-label image retrieval, in which we propose to learn a binary "mask" map that can identify the approximate locations of objects in an image, so that we use this binary "mask" map to obtain length-limited hash codes which mainly focus on an image's objects but ignore the background. The proposed deep architecture consists of four…
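The abstract is cut off before the architecture is described, so the following is only an illustrative sketch of the general idea it states: a binary mask selects object locations, and the hash code is computed from the masked features so that the background is ignored. The function names, the masked average pooling step, the fixed projection matrix, and the sign-based binarization are all assumptions made for illustration. They are not taken from the paper.

```python
def masked_average_pool(feature_map, mask):
    """Average the feature vectors at locations where mask == 1.

    feature_map: H x W x C nested lists of floats (hypothetical layout).
    mask:        H x W nested lists of 0/1 (1 = object, 0 = background).
    """
    channels = len(feature_map[0][0])
    pooled = [0.0] * channels
    count = 0
    for row_feats, row_mask in zip(feature_map, mask):
        for feat, keep in zip(row_feats, row_mask):
            if keep:
                count += 1
                for c in range(channels):
                    pooled[c] += feat[c]
    if count == 0:  # empty mask: fall back to a zero vector
        return pooled
    return [v / count for v in pooled]


def hash_code(pooled, projection):
    """Project the pooled vector and binarize by sign (assumed scheme)."""
    return [
        1 if sum(w * x for w, x in zip(row, pooled)) >= 0 else 0
        for row in projection
    ]


def hamming_distance(a, b):
    """Number of differing bits between two equal-length codes."""
    return sum(x != y for x, y in zip(a, b))


# Toy 2x2 feature map with 2 channels; the right column is "background".
feature_map = [[[1.0, -1.0], [9.0, 9.0]],
               [[3.0, -3.0], [9.0, 9.0]]]
mask = [[1, 0],
        [1, 0]]
projection = [[1, 0], [0, 1], [1, 1]]  # hypothetical 3-bit projection

pooled = masked_average_pool(feature_map, mask)  # [2.0, -2.0]
code = hash_code(pooled, projection)             # [1, 0, 1]
```

Retrieval would then rank database images by `hamming_distance` between their codes and the query code. In this toy example the background features (the 9.0 values) never reach the code because the mask zeroes out their locations.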

Cited by 37 publications (18 citation statements)
References 21 publications
“…Moreover, feature aggregation such as sum/average, max pooling, and embedding (e.g., BoW [51], VLAD [22], FV [35]) is also applied to improve the discriminativeness and compactness of the representations. In fact, representation learning is still an ongoing topic, and numerous methods are proposed in recent years which are conducted either supervised [2,5,14] or non-supervised [41,56], and directly concatenated [4,42] or hash embedded [17,30]. We will skip these methods here because we have skipped representation learning in our framework.…”
Section: Deep Feature Extraction
Citation type: mentioning
confidence: 99%
“…Recent techniques mainly focus on two parts: deep network architectures and training algorithms. The deep network architectures include single feedforward pass models [26], multiple feedforward pass models [27], attention based models [28], and deep hashing embedding based models [29]. While the training algorithms focus on classification based learning [30], metric based learning [31], and unsupervised-based learning [32].…”
Section: A. Image Retrieval
Citation type: mentioning
confidence: 99%
“…Lai et al [33] learned instance-aware representations for the image data organized in groups for multiple labels with the assistant of Spatial Pyramid Pooling (SPP) layer [67]. Huang et al [68] proposed to focus on the objects instead of the background in an image by learning a binary mask map to identify the location of the objects.…”
Section: Related Work
Citation type: mentioning
confidence: 99%