Key-Word-Aware Network for Referring Expression Image Segmentation (2018)
DOI: 10.1007/978-3-030-01231-1_3

Cited by 148 publications (108 citation statements) | References 29 publications
“…It is difficult to directly exploit F to produce the segmentation output. In recent years, the attention mechanism [9,22,23,26,28] has been shown to be a powerful technique that can capture important information from raw features in either linguistic or visual representation. Different from above works, we propose a cross-modal self-attention module to jointly exploit attentions over multimodal features.…”
Section: Cross-modal Self-attention
Confidence: 99%
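The statement above describes jointly attending over multimodal features rather than attending within each modality separately. A minimal NumPy sketch of that idea is below: visual-region features and word features are concatenated into one sequence and passed through scaled dot-product self-attention, so every element (region or word) can attend to every other. All names, dimensions, and the random projections are illustrative assumptions, not the cited papers' actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_self_attention(visual, words, d_k=8, seed=0):
    """Toy sketch (not the papers' exact method): fuse region and word
    features into a single sequence, then run self-attention over it so
    attention weights span both modalities jointly."""
    rng = np.random.default_rng(seed)
    seq = np.concatenate([visual, words], axis=0)   # (R + T, d) multimodal sequence
    d = seq.shape[1]
    # Hypothetical learned projections, drawn at random for the sketch
    Wq, Wk, Wv = (rng.standard_normal((d, d_k)) for _ in range(3))
    q, k, v = seq @ Wq, seq @ Wk, seq @ Wv
    attn = softmax(q @ k.T / np.sqrt(d_k))          # (R+T, R+T): region-word joint attention
    return attn @ v                                  # attended multimodal features

# 4 image regions and 3 words, embedded in a shared 16-d space (toy data)
regions = np.random.default_rng(1).standard_normal((4, 16))
words = np.random.default_rng(2).standard_normal((3, 16))
out = cross_modal_self_attention(regions, words)
print(out.shape)  # (7, 8): one attended vector per region or word
```

Because the attention matrix covers the concatenated sequence, a region can weight individual words and vice versa, which is the joint cross-modal behavior the quoted passage contrasts with attending over each modality in isolation.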
“…A popular approach (e.g. [10,15,22]) in this area is to * Zhi Liu and Yang Wang are the corresponding authors Figure 1. (Best viewed in color) Illustration of our cross-modal self-attention mechanism.…”
Section: Introduction
Confidence: 99%
“…Their work uses users' click-through data. A new paradigm of attention-based mechanisms for referring expressions in image segmentation [30] is proposed which contains a keyword-aware network and query attention model that demonstrates the relationships with various image regions for a given query. Inspired by the idea of attention models, we modify this mechanism for patch alignments within images via information scent in the following section.…”
Section: Query Suggestion in Image Search
Confidence: 99%
“…Hence, finding useful patches for query expansion in an image based on textual queries (or descriptions) is the primary focus of our work. Past work [11,30] used both the query and image for typical retrieval and segmentation tasks. In our task formulation, we rely only upon a given arbitrary text prefix rather than having the entire text query which is used to perform search based on the image and supported by a modified deep language model [12] to find the most relevant patch in the image.…”
Section: Introduction
Confidence: 99%
“…Interactions between two modalities can associate each word with each image region, and further highlight the features of the target object for accurate segmentation. Recent works proposed to model the interactions by introducing unidirectional [8,9] or bi-directional [10] attention mechanisms.…”
Section: Introduction
Confidence: 99%