2022
DOI: 10.3390/s22186971

Absolute Camera Pose Regression Using an RGB-D Dual-Stream Network and Handcrafted Base Poses

Abstract: Absolute pose regression (APR) for camera localization is a single-shot approach that encodes the information of a 3D scene in an end-to-end neural network. The camera pose predicted by an APR method can be viewed as a linear combination of base poses. In previous APR methods, the base poses are learned from training data. However, the training data can limit the performance of these methods, because the learned base poses do not generalize to cover the entire scene. To solve this issue, we use handcrafted base poses instead of learning…
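The "linear combination of base poses" view in the abstract can be illustrated with a minimal sketch. It covers only the translation part of a pose, and everything concrete in it is an assumption made for illustration: the function name `combine_base_poses`, the four base translations (corners of a hypothetical 4 m x 2 m floor), and the coefficient values. The abstract is truncated before it describes how the paper actually constructs its handcrafted base poses, so this is not the authors' method.

```python
from typing import List, Sequence


def combine_base_poses(
    coeffs: Sequence[float],
    base_poses: Sequence[Sequence[float]],
) -> List[float]:
    """Return sum_j coeffs[j] * base_poses[j] (hypothetical helper)."""
    if not base_poses:
        raise ValueError("at least one base pose is required")
    if len(coeffs) != len(base_poses):
        raise ValueError("need exactly one coefficient per base pose")

    dim = len(base_poses[0])
    combined = [0.0] * dim
    for coeff, base in zip(coeffs, base_poses):
        if len(base) != dim:
            raise ValueError("all base poses must have the same dimension")
        for i, value in enumerate(base):
            combined[i] += coeff * value
    return combined


# Assumption: fixed (handcrafted) base translations placed at the corners of
# a 4 m x 2 m floor, instead of being learned as last-layer weights.
BASE_TRANSLATIONS = [
    (0.0, 0.0, 0.0),
    (4.0, 0.0, 0.0),
    (0.0, 2.0, 0.0),
    (4.0, 2.0, 0.0),
]

# Assumption: in an APR network these coefficients would come from the
# image features; here they are hard-coded.
coeffs = [0.25, 0.25, 0.25, 0.25]

translation = combine_base_poses(coeffs, BASE_TRANSLATIONS)
print(translation)  # [2.0, 1.0, 0.0], the centre of the floor
```

With learned base poses, the combined output is limited to what the training data taught the last layer. Fixing the base poses by hand, as sketched here, lets them be placed so that they span the whole scene regardless of where the training images were taken.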

Citations: cited by 2 publications (3 citation statements)
References: 29 publications
“…Compared to our model, the relative pose regression models are EssNet [31]. PoseNet [25], MapNet [32], DSNet [16], and GL-Net [33] are the absolute pose regression models evaluated with our own. Absolute pose estimation techniques recover the absolute camera posture from the input image in a straightforward and efficient manner using CNN learning, and this class of approaches is mostly a version of PoseNet [25].…”
Section: Methods (mentioning)
confidence: 99%
“…Researchers have discovered in recent years that neural networks can substitute random forests as correspondence learners and achieve excellent results for RGB-based camera relocalization [15]. As the backbone network, a dual-stream network (DSNet) structure is employed to handle picture data for precise pose estimation [16]. Deep feature aggregation module (DFAM) is used to combine contextual depth information into a general neural network to perform end-to-end posture estimation training while balancing the network's efficiency and light weight.…”
Section: Related Work (mentioning)
confidence: 99%
“…Gasperini et al [39] presented Panoster, a segmentation method for urban LiDAR data, vital for 3D urban modeling. The researchers also [40,41] proposed a methodology based on a dual-encoder network to process RGB and depth data, enhancing 3D urban scene perception.…”
Section: Literature Review (mentioning)
confidence: 99%