2022
DOI: 10.3390/s22218331

Pose Mask: A Model-Based Augmentation Method for 2D Pose Estimation in Classroom Scenes Using Surveillance Images

Abstract: Solid developments have been seen in deep-learning-based pose estimation, but few works have explored performance in dense crowds, such as a classroom scene; furthermore, no specific knowledge is considered in the design of image augmentation for pose estimation. A masked autoencoder was shown to have a non-negligible capability in image reconstruction, where the masking mechanism that randomly drops patches forces the model to build unknown pixels from known pixels. Inspired by this self-supervised learning m…
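
The random patch-dropping step that the abstract attributes to masked autoencoders can be illustrated with a minimal sketch. This is not the paper's Pose Mask method, whose details are cut off in the truncated abstract above; it only shows generic random patch masking. The function name `random_patch_mask`, the 16-pixel patch size, and the 0.75 mask ratio are assumptions chosen for illustration.

```python
import numpy as np


def random_patch_mask(image, patch_size=16, mask_ratio=0.75, seed=None):
    """Zero out a random subset of non-overlapping square patches.

    Hypothetical helper for illustration only; patch_size and mask_ratio
    are assumed values, not taken from the Pose Mask paper.
    Returns the masked copy and a boolean patch grid (True = dropped).
    """
    h, w = image.shape[:2]
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")

    grid_h, grid_w = h // patch_size, w // patch_size
    n_patches = grid_h * grid_w
    n_masked = int(round(n_patches * mask_ratio))

    # Pick which patches to drop, without replacement.
    rng = np.random.default_rng(seed)
    dropped = rng.choice(n_patches, size=n_masked, replace=False)
    patch_mask = np.zeros(n_patches, dtype=bool)
    patch_mask[dropped] = True
    patch_mask = patch_mask.reshape(grid_h, grid_w)

    # Expand the patch-level grid to a pixel-level mask.
    pixel_mask = np.repeat(np.repeat(patch_mask, patch_size, axis=0),
                           patch_size, axis=1)

    masked = image.copy()
    masked[pixel_mask] = 0  # dropped pixels are what a model would have to rebuild
    return masked, patch_mask


if __name__ == "__main__":
    img = np.ones((64, 64, 3), dtype=np.float32)
    masked_img, grid = random_patch_mask(img, seed=0)
    print(grid.astype(int))
```

A reconstruction model trained on such inputs sees only the surviving patches and is scored on how well it restores the dropped ones, which is the behaviour the abstract describes as building unknown pixels from known pixels.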

Cited by: 1 publication (2 citation statements)
References: 30 publications

“…Marlin et al [97] (2023.6) also employ MAE for learning facial features, with a focus on facial details and facial features. PoseMask [98] (2022.10) applies MAE for pose estimation in classroom scenarios, using heatmaps as reference masks to estimate poses in crowded or occluded scenes. Furthermore, Sheng et al [99] (2022.10) use MAE for feature extraction of gestures in Spatial-Temporal Motion Maps (STMM), improving gesture recognition accuracy.…”
Section: B Real-world and Unmodified Images
Citation type: mentioning
confidence: 99%
“…This means that even if the reconstructed results differ from the original, they still possess a certain level of coherence and can connect with contextual information. Therefore, MAE has significant applications in image generation [98], [104], [110], and also performs well in tasks that require temporal coherence [4], [135].…”
Section: B Applications
Citation type: mentioning
confidence: 99%