2020
DOI: 10.1007/978-3-030-58621-8_10

STEm-Seg: Spatio-Temporal Embeddings for Instance Segmentation in Videos

Citations: cited by 142 publications (177 citation statements)
References: 76 publications
“…Such scenarios usually require the algorithm to be more robust. As the test set evaluation server is closed, we followed most previous works [3, 10, 11, 13, 14] and evaluated our method on the validation set. The evaluation metrics are Average Precision (AP), calculated based on multiple intersection-over-union (IoU) thresholds, and Average Recall (AR), defined as the maximum recall given some fixed number of segmented instances per video.…”
Section: Results (mentioning)
confidence: 99%
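
The statement above describes AP as being computed over multiple IoU thresholds. As a rough illustration only, the sketch below computes IoU between two per-video mask volumes and checks it against a set of thresholds. The function names, the (T, H, W) mask layout, and the COCO-style threshold range 0.50 to 0.95 in steps of 0.05 are assumptions made for this example; they are not taken from the cited papers or from any official evaluation code.

```python
import numpy as np

# Assumed COCO-style thresholds (0.50, 0.55, ..., 0.95); not stated in the source.
IOU_THRESHOLDS = np.linspace(0.5, 0.95, 10)


def video_mask_iou(pred, gt):
    """IoU between two boolean mask volumes of shape (T, H, W).

    Intersection and union are accumulated over all frames, so an instance
    must be segmented consistently across the video to score well.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    # Two empty masks have no overlap to measure; treat that case as IoU 0.
    return float(inter) / float(union) if union > 0 else 0.0


def matches_per_threshold(iou):
    """For each threshold, report whether a prediction with this IoU counts as a match."""
    return [bool(iou >= t) for t in IOU_THRESHOLDS]


if __name__ == "__main__":
    pred = np.zeros((2, 2, 2), dtype=bool)
    gt = np.zeros((2, 2, 2), dtype=bool)
    pred[0, 0, :] = True  # two pixels in frame 0
    gt[0, :, 0] = True    # two pixels in frame 0, one shared with pred
    iou = video_mask_iou(pred, gt)
    print(iou, sum(matches_per_threshold(iou)))
```

A full AP computation would additionally rank predictions by confidence and integrate a precision-recall curve at each threshold; that part is omitted here.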
“…The gap in accuracy between our method and MaskProp [11] is mainly due to the fact that MaskProp combines multiple networks and post-processing strategies, which are time-consuming. STEm-Seg [13] and VisTR [14] both process the entire video sequence at the same time. Such methods usually impose limits on video resolution and length and are difficult to extend to online applications.…”
Section: Results (mentioning)
confidence: 99%
“…Our method also has other advantages compared to the other methods. Unlike object detection and tracking methods [12], [21], [31], which require prior knowledge of known objects (object proposals from pre-trained ImageNet models) to solve UVOS accurately, our method does not depend on any prior knowledge of the segmented objects (as we only use affinities to perform UVOS). Hence our model is capable of generalizing well to unseen object classes.…”
Section: Davis16 (mentioning)
confidence: 99%