2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
DOI: 10.1109/cvpr46437.2021.01103

Home Action Genome: Cooperative Compositional Action Understanding

citations
Cited by 49 publications
(22 citation statements)
references
References 22 publications
“…[445] 2019 RGB 700 - 650,317 -
Kitchen20 [476] 2019 Au 20 - 800 -
MMAct [356] 2019 RGB,S,Ac,Gyr,etc. 37 20 36,764 4+Egocentric
Moments in Time [477] 2019 RGB 339 - ∼1,000,000 -
Wang et al [478] 2019 WiFi CSI 6 1 1,394 -
NTU RGB+D 120 [185] 2019 RGB,S,D,IR 120 106 114,480 155
ETRI-Activity3D [479] 2020 RGB,S,D 55 100 112,620 -
EV-Action [357] 2020 RGB,S,D,EMG 20 70 7,000 9
IKEA ASM [480] 2020 RGB,S,D 33 48 16,764 3
RareAct [481] 2020 RGB 122 - 905 -
BABEL [482] 2021 Mocap 252 - 13,220 -
HAA500 [483] 2021 RGB 500 - 10,000 -
HOMAGE [484] 2021 RGB,IR,Ac,Gyr,etc. 75 27 1,752 2∼5
MultiSports [485] 2021 RGB 66 - 37,701 -
UAV-Human [486] 2021 RGB,S,D,IR,etc.…”
Section: Datasets (mentioning)
confidence: 99%
“…Labeling work would be painful and error-prone for datasets not collected in controlled settings. Some of them were annotated by hand, like Fine-Gym, UAV-Human, HOMAGE [11,13,28], etc. Other datasets (e.g., ActivityNet, AVA, Babel [12,17,29]) were labeled through commercial crowd-sourcing platforms like Amazon Mechanical Turk (AMT) [30], with a charge for dataset creators.…”
Section: Positive Impacts (mentioning)
confidence: 99%
“…Human activity recognition is attracting increasing attention. There is a large number of publicly available datasets [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17]. It requires a lot of labor-intensive work to annotate the video datasets.…”
Section: Introduction (mentioning)
confidence: 99%
“…We recognize activities under domain shifts, caused by change of scenery, camera viewpoint or actor, with the aid of sound. 28,31,38,39,45,46,49]. For instance, both Gao et al [18] and Korbar et al [24] reduce the computational cost by previewing the audio track, while Lee et al [25] show that combining visual features with audio can better localize actions.…”
Section: Introduction (mentioning)
confidence: 99%