2021
DOI: 10.1049/ell2.12283
|View full text |Cite
|
Sign up to set email alerts
|

Few‐shot action recognition using task‐adaptive parameters

Abstract: Few-shot action recognition aims to recognise unseen actions given a few examples. Thus, this letter proposes a model named meta relation network (Meta RN) to address such problem. This model contains two parts: a MetaNet and a relation network. Relation network is utilised to extract video features and classify actions. A second-order pooling followed by power normalization is used for feature enhancement, and target videos are finally classified by exploring nonlinear distance relations. The MetaNet module i… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
7
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
2

Relationship

1
1

Authors

Journals

citations
Cited by 2 publications
(7 citation statements)
references
References 6 publications
0
7
0
Order By: Relevance
“…Prototypical networks [33], when faced with large K values, average the features of the support set belonging to the same class; the average value serves as the prototype of the class, which is then matched with the query set. Prototypes not only weaken the impact of noise, but also increase the computational efficiency, making them the most widely used architecture for existing metric-based few-shot learning [11,15,17,23,35]. We adopt a prototypical network, i.e., a metric-based framework, with episodic training, but our focus is the challenging task of few-shot action recognition.…”
Section: Few-shot Learningmentioning
confidence: 99%
See 4 more Smart Citations
“…Prototypical networks [33], when faced with large K values, average the features of the support set belonging to the same class; the average value serves as the prototype of the class, which is then matched with the query set. Prototypes not only weaken the impact of noise, but also increase the computational efficiency, making them the most widely used architecture for existing metric-based few-shot learning [11,15,17,23,35]. We adopt a prototypical network, i.e., a metric-based framework, with episodic training, but our focus is the challenging task of few-shot action recognition.…”
Section: Few-shot Learningmentioning
confidence: 99%
“…Compared with 2D CNNs, 3D CNNs, such as C3D [38] and SlowFast [39], can more accurately capture information in time and space, recognizing actions and changes in videos. Currently, the mainstream few-shot action recognition methods [11,15,17,35] that adopt 3D CNNs are trained from scratch directly on the target dataset. The focus of our work is not on 2D or 3D CNNs, but on hierarchical task knowledge mining.…”
Section: Few-shot Action Recognitionmentioning
confidence: 99%
See 3 more Smart Citations