2022
DOI: 10.1109/tpami.2021.3116945
VideoDG: Generalizing Temporal Relations in Videos to Novel Domains

Cited by 20 publications (10 citation statements)
References 34 publications
“…Several techniques have been introduced to solve this problem with deep models (Muandet et al., 2013; Li et al., 2017, 2018a; Motiian et al., 2017), with important results for a variety of datasets and data types, but the area is significantly under-explored with respect to video datasets, due to the complexity of entangling spatial and temporal domain shifts. In Yao et al. (2019, 2021), the only recent prominent work in this area, the authors present the Adversarial Pyramid Network (APN), a network capturing the videos' local-, global-, and multi-layer cross-relation features. They also extend ADA, an adversarial data augmentation method from Volpi et al. (2018), to videos.…”
Section: Video Domain Generalization
confidence: 99%
“…Domain shift in action recognition. In [6,52], cross-domain datasets are introduced to study methods for video domain adaptation. Chen et al. [6] propose to align temporal and spatial features across the domains, whereas Yao et al. [52] propose to improve the generalizability of so-called local features instead of global features, and use a novel augmentation scheme. Strikingly, however, all experiments in [6,52] are based on features extracted frame-by-frame by a ResNet [21] and aggregated after the fact, which means that they in effect do not handle spatiotemporal features.…”
Section: Related Work
confidence: 99%