2020
DOI: 10.48550/arxiv.2007.08139
Preprint

Interactive Video Object Segmentation Using Global and Local Transfer Modules

Abstract: An interactive video object segmentation algorithm, which takes scribble annotations on query objects as input, is proposed in this paper. We develop a deep neural network, which consists of the annotation network (A-Net) and the transfer network (T-Net). First, given user scribbles on a frame, A-Net yields a segmentation result based on the encoder-decoder architecture. Second, T-Net transfers the segmentation result bidirectionally to the other frames, by employing the global and local transfer modules. The …

Cited by 1 publication (3 citation statements)
References 52 publications
“…Oh et al [38] measure a 44% increase in error rate when extra training data is omitted, indicating that methods with many parameters are data-hungry and underperform without the help of additional data.
Method            AUC J   AUC J&F   Extra data
Heo et al [39]    0.771   0.809
Heo et al [47]    0.704   -
Oh et al [38]     0.691   -
Miao et al [40]   0.749   -         -
Oh et al [48]     0.702   -         -
Oh et al [38]     0
In order to gain some insight into the data structure, we show the error rate of prediction in case of individual videos in Fig. 4 after the second and the eighth interaction steps.…”
Section: B Interactive VOS Results
Citation type: mentioning
confidence: 99%
“…A slight deficiency of their method is the lack of weight sharing between the two networks in the convolutional layers. Heo et al achieve superior results with their feature information transfer modules [39]. A drawback of their method is the need to use multiple additional segmentation datasets for their training process.…”
Section: Interactive Video Object Segmentation
Citation type: mentioning
confidence: 99%