2022
DOI: 10.1109/access.2022.3175871

Pose Detection of Aerial Image Object Based on Constrained Neural Network

Abstract: Constraints often exist in the high-dimensional output data of object detection, such as the inverse vector {cos θ, sin θ} of a two-dimensional object and the attitude quaternion of a three-dimensional object. Each component of a traditional neural network's output is unconstrained in range, which makes it difficult to meet the needs of practical problems. To solve this problem, this paper designed the transformation network layer according to the high dimensional space transformation theory an…
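To make the constraint in the abstract concrete: a pair (cos θ, sin θ) and an attitude quaternion must both have unit Euclidean length, which a plain regression output does not guarantee. The sketch below is not the paper's transformation layer, whose description is truncated above. It only illustrates one common way to enforce such a constraint, plain L2 normalization of the raw output. The function names and the sample values are hypothetical.

```python
import math


def project_to_unit_norm(raw, eps=1e-12):
    """Scale an unconstrained output vector to unit Euclidean length.

    Assumption: simple L2 normalization, used here only to illustrate
    the unit-norm constraint; it is not the layer from the paper.
    """
    norm = math.sqrt(sum(x * x for x in raw))
    if norm < eps:
        raise ValueError("cannot normalize a near-zero vector")
    return [x / norm for x in raw]


def angle_from_unit_vector(vec):
    """Recover theta (radians) from a (cos theta, sin theta) pair."""
    cos_t, sin_t = vec
    return math.atan2(sin_t, cos_t)


# 2D case: hypothetical raw two-component output for (cos theta, sin theta).
direction = project_to_unit_norm([3.0, 4.0])
theta = angle_from_unit_vector(direction)

# 3D case: hypothetical raw four-component output for an attitude quaternion.
quat = project_to_unit_norm([1.0, 1.0, 1.0, 1.0])
```

After projection, `direction` satisfies cos²θ + sin²θ = 1 and `quat` has unit norm, so both can be read as a valid heading and a valid rotation respectively.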

Citations: Cited by 4 publications (1 citation statement)
References: References 28 publications
“…The authors in [27] compare Convolution Neural Network-Long Short-Term Memory (CNN-LSTM) networks with other models such as Multilayer Perceptron (MLP), Long-term Recurrent Convolutional Networks (LRCN) or LSTM for classifying the dance pose by excerpted salient details attaining high-performance results, up to 98%, in some metrics as accuracy, precision, recall, AUC or F1 score. The author in [28] uses a CNN model called the quaternion field pose network (qfiled PoseNet) to detect the pose of objects from a single aerial image with good results, as demonstrated in experiment on the DOTA1.5 and HSRC2016 datasets. At the same time, the authors in [29] present PoseFormer, a purely transformer-based approach for 3D human pose estimation in videos without convolutional architectures involved, which achieves state-of-the-art performance on two benchmark datasets, Human3.6M and MPI-INF-3DHP, according to extensive experiments.…”
Section: Human Pose Detectors
Citation type: mentioning
confidence: 99%