2021
DOI: 10.48550/arxiv.2104.03214
Preprint
Self-Supervised Learning for Semi-Supervised Temporal Action Proposal

Abstract: Self-supervised learning has shown remarkable performance in utilizing unlabeled data for various video tasks. In this paper, we focus on applying the power of self-supervised methods to improve semi-supervised action proposal generation. Particularly, we design an effective Self-supervised Semi-supervised Temporal Action Proposal (SSTAP) framework. The SSTAP contains two crucial branches, i.e., a temporal-aware semi-supervised branch and a relation-aware self-supervised branch. The semi-supervised branch improves the…

Cited by 4 publications (7 citation statements)
References 47 publications
“…Channel-Separated Convolutional Network (CSN) [13] aims to reduce the parameters of 3D convolution, and to extract useful information by finding important channels simultaneously. It can efficiently learn feature representations. Note that this module is borrowed from our SSTAP [18].…”
Section: Channel-Separated Convolutional Network
Mentioning confidence: 99%
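The parameter saving that the statement above attributes to CSN can be illustrated with a back-of-the-envelope count. The sketch below compares a standard 3×3×3 3D convolution against a depthwise-plus-pointwise factorization in the channel-separated spirit; the channel counts and kernel size are illustrative assumptions, not values taken from the CSN paper.

```python
# Illustrative sketch (not the CSN implementation): compare parameter
# counts of a standard 3D convolution with a channel-separated
# factorization (depthwise k*k*k conv + pointwise 1x1x1 conv).

def conv3d_params(c_in, c_out, k=3):
    """Parameters of a standard 3D convolution (bias omitted)."""
    return c_in * c_out * k ** 3

def channel_separated_params(c_in, c_out, k=3):
    """Depthwise 3D conv (one k*k*k filter per input channel)
    followed by a 1x1x1 pointwise conv that mixes channels."""
    depthwise = c_in * k ** 3
    pointwise = c_in * c_out
    return depthwise + pointwise

if __name__ == "__main__":
    c = 256  # assumed channel width, for illustration only
    print(conv3d_params(c, c))            # 1,769,472 parameters
    print(channel_separated_params(c, c)) # 72,448 parameters
```

With these assumed sizes the factorized form uses roughly 24× fewer parameters, which is the kind of reduction the citing paper alludes to.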
“…Temporal shift operation for action recognition was first applied in TSM [9], and then applied as a kind of perturbation in SSTAP [18] for semi-supervised learning. Here we reuse the perturbation as the feature augmentation.…”
Section: Data Augmentation Module
Mentioning confidence: 99%
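The temporal shift perturbation mentioned above can be sketched as follows. This is a minimal illustrative version in the spirit of TSM's shift: one fraction of the channels is shifted forward in time, another backward, and vacated positions are zero-padded. The `(T, C)` layout and the `fold_div` parameter are assumptions for this sketch, not the exact SSTAP implementation.

```python
import numpy as np

def temporal_shift(feat, fold_div=4):
    """Shift channels along the temporal axis (illustrative sketch).

    feat: array of shape (T, C). The first C // fold_div channels are
    shifted one step forward in time, the next C // fold_div one step
    backward; remaining channels are left untouched. Vacated positions
    are zero-filled.
    """
    T, C = feat.shape
    fold = C // fold_div
    out = np.zeros_like(feat)
    out[1:, :fold] = feat[:-1, :fold]                  # forward shift
    out[:-1, fold:2 * fold] = feat[1:, fold:2 * fold]  # backward shift
    out[:, 2 * fold:] = feat[:, 2 * fold:]             # unshifted
    return out
```

Used as a feature augmentation, the same clip yields a slightly perturbed feature sequence on each pass, e.g. `temporal_shift(features)` on a `(T, C)` snippet-feature array.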
“…Following mainstream action proposal generation methods [16,17,15,25,10,5,2,26,19,20,23,24], we pre-extract features for each video. Specifically, for a video which contains l frames, the whole video can be divided into N clips uniformly.…”
Section: Extracting Features
Mentioning confidence: 99%
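The uniform division of l frames into N clips described above can be sketched as below; the boundary convention (rounded, contiguous, half-open ranges) is an assumption for illustration, not the exact scheme of any cited method.

```python
# Illustrative sketch: divide a video of l frames into N uniform,
# contiguous clips before feature extraction.

def uniform_clips(l, N):
    """Return N (start, end) half-open frame ranges covering l frames."""
    bounds = [round(i * l / N) for i in range(N + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(N)]
```

For example, `uniform_clips(100, 4)` partitions a 100-frame video into four 25-frame clips, each of which would then be fed to the feature extractor.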