2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr46437.2021.01104

Learning Goals from Failure

Cited by 13 publications (11 citation statements)
References 31 publications
“…dataset of unintentional actions, which mostly includes human actions. Models for automated detection of action intention have been developed from this dataset by Epstein and Vondrick [42]. These and other examples are discussed in detail in Section 5.…”
Section: What Is Human Error? (mentioning, confidence: 99%)
“…Compared to a baseline model only trained on the Kinetics action recognition dataset [26], their model performed similarly when classifying intentional vs. unintentional actions when only using video speed as a predictive feature. Epstein and Vondrick [42] later built on this work by adding annotations to each of the videos with a decoder. Their 3D-CNN model outperformed a Kinetics-trained model.…”
Section: Unsupervised Learning (mentioning, confidence: 99%)
“…Previous works on unintentional action prediction or anomaly detection [11, 14, 28-30, 35, 40] do not address representation learning, but focus on predictions based on features pre-extracted from pre-trained networks. Epstein et al [9,10] instead proposed to learn features specifically for the tasks related to unintentional action prediction. For the unsupervised learning setting, Epstein et al [9] proposed three baselines: Video Speed, Video Sorting and Video Context.…”
Section: Self-supervised Learning for Unintentional Action Recognition (mentioning, confidence: 99%)
“…For the unsupervised learning setting, Epstein et al [9] proposed three baselines: Video Speed, Video Sorting and Video Context. In further work, Epstein et al [10] also considered fully-supervised learning, for which they combined learning on labeled examples using the standard cross-entropy loss with solving an unsupervised temporal consistency task. Two further works [37,41] addressed the fully-supervised setting for the Oops dataset.…”
Section: Self-supervised Learning for Unintentional Action Recognition (mentioning, confidence: 99%)
“…Our work is, to some degree, relevant to future prediction — a popular research area in computer vision. In this area, a wide spectrum of topics/tasks has been put forward, including forecasting future frames [42,70], future features [62,69], future actions [1,28,31,55,61], future human motions [14,22,41], future human trajectories [2], future goals [13], etc. Rather than studying future generation at the semantic-category or color-pixel level, event-level prediction was recently addressed in [34] and [49].…”
Section: Related Work (mentioning, confidence: 99%)