2022
DOI: 10.3390/s22124420
Few-Shot Text Classification with Global–Local Feature Information

Abstract: Meta-learning frameworks have been proposed to generalize machine learning models for domain adaptation without sufficient labeled data in computer vision. However, text classification with meta-learning is less investigated. In this paper, we propose SumFS to find global top-ranked sentences by extractive summarization and improve the local vocabulary category features. SumFS consists of three modules: (1) an unsupervised text summarizer that removes redundant information; (2) a weighting generator that associates…

Cited by 3 publications (1 citation statement); references 33 publications.
“…There are two intuitive alternatives to achieve this goal: i) Annotating additional texts for each input video, which appears to be time-consuming and expensive; ii) Constructing hand-crafted text prompts using the annotated action labels, which is intractable due to the inaccessible labels of the query video and high demands for professional domain knowledge (e.g., professional gymnastics). Besides, there are possible scenes that are difficult to annotate with action names manually and only contain non-descript task labels, e.g., tasks with numerical labels [63,52]. The aforementioned potential drawbacks seriously hinder the application of recent multimodal foundation models in the few-shot action recognition field.…”
Section: Introduction
confidence: 99%