Invariant Representation Learning for Infant Pose Estimation with Small Data

Huang, Xiaofei; Fu, Nihang; Liu, Shuangjun; Ostadabbas, Sarah

doi:10.1109/fg52635.2021.9666956

Cited by 28 publications

(11 citation statements)

References 17 publications


“…Besides, none of these datasets proposed suitable keypoints annotation for infant images, as they adopt the COCO's 17 keypoints format, while it loses many significant refined pose and movement features for the infant. Inspired by [Silva et al, 2021;Huang et al, 2021], we publish our new open-source infant pose dataset and new infant keypoints format. To collect data, we adopt GMA devices to record infant movement videos from 2013 to now.…”

Section: Infant Pose Detection Dataset (mentioning)

confidence: 99%

AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation

Cao, Li, et al. 2022

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence




“…Inspired by [Silva et al, 2021;Huang et al, 2021], we publish our new open-source infant pose dataset and new infant keypoints format. To collect data, we adopt GMA devices to record infant movement videos from 2013 to now.…”

Section: Infant Pose Detection Dataset (mentioning)

confidence: 99%

AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation

Cao, Li, et al. 2022

Preprint


Movement and pose assessment of newborns lets experienced pediatricians predict neurodevelopmental disorders, allowing early intervention for related diseases. However, most of the newest AI approaches to human pose estimation focus on adults, and a public benchmark for infant pose estimation is lacking. In this paper, we fill this gap by proposing an infant pose dataset and a Deep Aggregation Vision Transformer for human pose estimation, which introduces a fast-trained full transformer framework that does not use convolution operations to extract features in the early stages. It generalizes Transformer + MLP to high-resolution deep layer aggregation within feature maps, thus enabling information fusion between different vision levels. We pre-train AggPose on the COCO pose dataset and apply it to our newly released large-scale infant pose estimation dataset. The results show that AggPose can effectively learn multi-scale features across different resolutions and significantly improve the performance of infant pose estimation. We show that AggPose outperforms the hybrid models HRFormer and TokenPose on the infant pose estimation dataset. Moreover, AggPose outperforms HRFormer by 0.7% AP on average on COCO val pose estimation. Our code is available at github.com/SZAR-LAB/AggPose.


“…Therefore, the majority of the existing infant datasets are synthetic images. Currently, there are only limited infant-related datasets: MINI-RGBD [23], SyRIP [24], and Zhou et al [25]. MINI-RGBD mapped real infant movements to the SMIL model, generating RGB and depth video sequences with 2D and 3D joint coordinates.…”

Section: Infant Dataset (mentioning)

confidence: 99%

“…where K and [R|T] are the pre-defined camera intrinsic and extrinsic parameters. We extend the real portion of the SyRIP dataset [24] by annotating and categorizing the real infant portion into 12 selected fine-level gross motor poses, a very small portion (≈ 5%) of samples are withdrawn due to the poses not falling into any of defined fine-level poses. We randomly assign different camera parameters and remove those unnatural samples after syntheses.…”

Section: Rendering (mentioning)

confidence: 99%

Unsupervised Domain Adaptation Learning for Hierarchical Infant Pose Recognition with Synthetic Data

Yang, Jiang, Gu, et al. 2022

Preprint


The Alberta Infant Motor Scale (AIMS) is a well-known assessment scheme that evaluates the gross motor development of infants by recording the number of specific poses achieved. With the aid of an image-based pose recognition model, the AIMS evaluation procedure can be shortened and automated, providing an early diagnosis or indicator of potential developmental disorders. Due to the limited number of public infant-related datasets, many works use the SMIL-based method to generate synthetic infant images for training. However, the domain mismatch between real and synthetic training samples often leads to performance degradation during inference. In this paper, we present a CNN-based model that takes any infant image as input and predicts the coarse- and fine-level pose labels. The model consists of an image branch and a pose branch, which respectively generate the coarse-level logits, facilitated by unsupervised domain adaptation, and the 3D keypoints, using HRNet with SMPLify optimization. The outputs of these branches are then sent to the hierarchical pose recognition module to estimate the fine-level pose labels. We also collect and label a new AIMS dataset, which contains 750 real and 4000 synthetic infant images with AIMS pose labels. Our experimental results show that the proposed method can significantly align the distributions of the synthetic and real-world datasets, thus achieving accurate performance on fine-grained infant pose recognition.

