2020
DOI: 10.1609/aaai.v34i07.7008

Motion-Attentive Transition for Zero-Shot Video Object Segmentation

Abstract: In this paper, we present a novel Motion-Attentive Transition Network (MATNet) for zero-shot video object segmentation, which provides a new way of leveraging motion information to reinforce spatio-temporal object representation. An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder, which transforms appearance features into motion-attentive representations at each convolutional stage. In this way, the encoder becomes deeply interleaved, allowing for c…
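
To make the idea concrete, here is a minimal NumPy sketch of an asymmetric motion-to-appearance attention step of the kind the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the function names (`soft_attention`, `motion_attentive_transition`), the projection vectors `w_a` and `w_m`, the affinity matrix `W`, and the random toy inputs are all hypothetical, and the exact formulation in the paper may differ.

```python
import numpy as np


def soft_attention(feat, w):
    """Spatial soft attention over a (C, H, W) feature map.

    `w` is a hypothetical (C,) projection that scores each location;
    the scores are normalised with a softmax over all H*W positions.
    """
    _, h, wd = feat.shape
    logits = np.tensordot(w, feat, axes=([0], [0])).reshape(-1)  # (H*W,)
    logits = logits - logits.max()  # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum()
    attn_map = attn.reshape(h, wd)
    # Scale by H*W so that a uniform attention map leaves features unchanged.
    return feat * (attn_map[None, :, :] * h * wd), attn_map


def motion_attentive_transition(appearance, motion, w_a, w_m, W):
    """Assumed form of an asymmetric transition: motion reweights appearance.

    appearance, motion: (C, H, W) feature maps from the two streams.
    W: hypothetical (C, C) weights of a bilinear affinity between streams.
    """
    c, h, wd = appearance.shape
    a_att, _ = soft_attention(appearance, w_a)
    m_att, _ = soft_attention(motion, w_m)
    A = a_att.reshape(c, -1)  # (C, N) with N = H*W
    M = m_att.reshape(c, -1)  # (C, N)

    # Affinity between every motion position (rows) and appearance position (cols).
    affinity = M.T @ W @ A  # (N, N)
    affinity = affinity - affinity.max(axis=0, keepdims=True)
    weights = np.exp(affinity)
    weights = weights / weights.sum(axis=0, keepdims=True)  # each column sums to 1

    # Gather motion features for each appearance location, then modulate appearance.
    transferred = M @ weights  # (C, N)
    return (A * transferred).reshape(c, h, wd)


# Toy inputs (random, for shape checking only).
rng = np.random.default_rng(0)
C, H, Wd = 4, 3, 5
appearance = rng.standard_normal((C, H, Wd))
motion = rng.standard_normal((C, H, Wd))
w_a = rng.standard_normal(C)
w_m = rng.standard_normal(C)
W = rng.standard_normal((C, C))

out = motion_attentive_transition(appearance, motion, w_a, w_m, W)
```

The sketch is asymmetric in that motion features reweight the appearance features while the motion stream itself is left unchanged; the output keeps the shape of the appearance input, so a step like this could in principle be applied at each stage of a two-stream encoder.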

Cited by 173 publications (141 citation statements)
References 38 publications
“…The contrast between different features is enhanced and useful features are more prominent. Many attention mechanisms such as MAT [22], IHSM&EFRM [23], CBAM [24], SE [25], and ECA [26] have been applied in various visual tasks. MAT module consists of a soft attention unit and an attention transition unit, which allows the transition of attentive motion features to enhance appearance learning at each convolution stage and enrich spatio-temporal object features.…”
Section: ECA Attention Mechanism
citation type: mentioning
confidence: 99%
“…Wang et al [36] focused on the relations between pixels across different images. Zhou et al [37] used the optical flow and attention mechanism to segment the video objects; image co-segmentation requires a set of images containing objects from the same category as a weak form of supervision. Rother et al [38] minimised an energy function to segment image pairs.…”
Section: Image Segmentation From Unlabeled Data
citation type: mentioning
confidence: 99%
“…The crowd counting tasks have attracted the interest of many researchers in the field of computer vision because crowd counting has a high value in a number of important practical issues such as group safety warning [8] [46] and the perception of the traffic domain [17] and so on.…”
Section: A Crowd Counting
citation type: mentioning
confidence: 99%