2024
DOI: 10.1109/tpami.2023.3327284

Temporal Action Segmentation: An Analysis of Modern Techniques

Guodong Ding,
Fadime Sener,
Angela Yao

Cited by 28 publications (6 citation statements)
References: 138 publications
“…It involves analyzing a video sequence's temporal structure to identify and mark the boundaries between different actions. The aim is to accurately segment and label each action or activity, allowing for a more detailed understanding and analysis of the sequence [18].…”
Section: Temporal Action Segmentation (TAS) | Citation type: mentioning
confidence: 99%
“…Practically, TAS is implemented using precomputed frame-wise features as input because it avoids the more significant computational load required for learning video features [18]. In cases where the movement of objects is dynamic and there may be dependencies between video frames, a temporal or sequential model that uses a learning model such as a Convolutional Network, RNN, and Transformer is needed.…”
Section: Temporal Action Segmentation (TAS) | Citation type: mentioning
confidence: 99%
“…We further proposed a multi-stage structure with multi-level supervised contrastive loss to learn a well-structured embedding space for enhanced activity segmentation and recognition performance [33]. In parallel, action segmentation in video data has garnered considerable interest in both academia and industry [34]. MS-TCN [35] is a classic method in video action segmentation tasks, which achieves impressive segmentation results using dilated convolutions with residual connections.…”
Section: Fully Supervised Activity Segmentation | Citation type: mentioning
confidence: 99%
“…Traditional machine learning algorithms, that were dominant in human activity recognition until a few years ago [8,9], required extensive domain knowledge and time investment for feature engineering. In contrast, deep learning methods circumvent manual feature extraction through the hierarchical learning of complex data representations, although they demand significant data and computational resources [10].…”
Section: Introduction | Citation type: mentioning
confidence: 99%