2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.01023
Object Discovery in Videos as Foreground Motion Clustering

Abstract: We consider the problem of providing dense segmentation masks for object discovery in videos. We formulate the object discovery problem as foreground motion clustering, where the goal is to cluster foreground pixels in videos into different objects. We introduce a novel pixel-trajectory recurrent neural network that learns feature embeddings of foreground pixel trajectories linked across time. By clustering the pixel trajectories using the learned feature embeddings, our method establishes correspondences betw…
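The abstract's core idea — discover objects by clustering per-trajectory feature embeddings — can be illustrated with a minimal sketch. This is not the paper's pipeline: the real embeddings come from the pixel-trajectory recurrent network, whereas here we substitute synthetic embedding vectors and a plain k-means step to show the clustering stage in isolation.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for learned trajectory embeddings: two "objects",
# 50 foreground pixel trajectories each, embedded in an 8-D space
# around two well-separated centers.
emb_obj1 = rng.normal(loc=0.0, scale=0.1, size=(50, 8))
emb_obj2 = rng.normal(loc=2.0, scale=0.1, size=(50, 8))
embeddings = np.vstack([emb_obj1, emb_obj2])

# Object discovery then reduces to clustering the embeddings so that
# trajectories belonging to the same object share a cluster label.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)

# Each synthetic object's trajectories fall into a single cluster.
assert len(set(labels[:50].tolist())) == 1
assert len(set(labels[50:].tolist())) == 1
```

In the paper the number of objects is not fixed in advance as it is with k-means here; this sketch only shows how segmentation masks follow from grouping trajectory embeddings.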

Cited by 60 publications (47 citation statements)
References 50 publications
“…For a fair comparison, we disable this component, applying tracking directly to the output of our two-stream model that detects only moving objects in each frame. Table 11 shows that our method significantly reduces the number of false positive segmentations compared to both [45,5] as evidenced by the improvement in ∆ Obj of nearly 65% (from 4 to 1.4), while performing competitively with [45] on F-measure.…”
Section: FBMS Moving Only
confidence: 93%
“…As shown in Table 5, the proposed model performs the best in recall. In terms of recall and F-measure, it outperforms CCG [18], OBV [19], and STB [50] by over 16.5% and 6.4%, respectively. The qualitative results are shown in Figure 6.…”
Section: Comparison With Prior Work
confidence: 97%
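The recall and F-measure quoted in comparisons like the one above are the standard segmentation metrics: the F-measure is the harmonic mean of precision and recall. A minimal sketch (the helper name is ours, not from any of the cited papers):

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g. precision 0.8, recall 0.6 -> 2 * 0.48 / 1.4 ≈ 0.686
score = f_measure(0.8, 0.6)
```

Because it is a harmonic mean, the F-measure is dominated by the weaker of the two values, which is why a method can lead on recall yet remain merely competitive on F-measure.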
“…However, these methods cannot segment each moving object instance. More recent approaches have used optical-flow-based methods for the instance-level moving object segmentation, including a hierarchical motion segmentation system that combines geometric knowledge with a modern CNN for appearance modeling [18], a novel pixel-trajectory recurrent neural network to cluster foreground pixels in videos into different objects [19], a two-stream architecture to separately process motion and appearance [20], a new submodular optimization process to achieve trajectory clustering [50], and a statistical-inference-based method for the combination of motion and semantic cues [21]. In comparison, we propose a two-level nested U-structure deep network with octave convolution to segment each moving object instance while reducing the spatial redundancy and memory cost in the CNN.…”
Section: Motion Segmentation
confidence: 99%