2020
DOI: 10.1109/tip.2020.2967577
Modality Compensation Network: Cross-Modal Adaptation for Action Recognition

Abstract: With the prevalence of RGB-D cameras, multimodal video data have become more available for human action recognition. One main challenge for this task lies in how to effectively leverage their complementary information. In this work, we propose a Modality Compensation Network (MCN) to explore the relationships of different modalities, and boost the representations for human action recognition. We regard RGB/optical flow videos as source modalities, skeletons as auxiliary modality. Our goal is to extract more di…

Cited by 46 publications (29 citation statements). References 60 publications.
“…Specifically, the teacher network was trained with RGB videos, providing supervision information for the student network handling skeleton data. Song et al [397] proposed a Modality Compensation Network (MCN) taking advantage of the skeleton modality to compensate the feature learning of the RGB modality with adaptive representation learning, and a modality adaptation block with residual feature learning was designed to bridge information between modalities.…”
Section: Co-learning With Visual Modalities (mentioning)
Confidence: 99%
“…Due to the heterogeneous, cluttered and dynamic background network gets confused with the action classes. Several attention and skeleton modality based approaches have been employed to generate discriminative features [15] [14]. Recently some works focused on extracting highly discriminative features using Extreme Learning Machine (ELM) [66].…”
Section: B Discriminative Feature Learning (mentioning)
Confidence: 99%
“…Because of the noisy motion and appearance of unimportant information, close-fitted inter-class discrimination and extensive intra-class discrimination make this task quite challenging [8] [9]. Decreasing the feature discrimination between the intra-class and increasing the discrimination between inter-class features can be one of the effective solutions for the concerned issue [14].…”
Section: Introduction (mentioning)
Confidence: 99%