Faster Visual-Based Localization with Mobile-PoseNet

Cimarelli, Claudio; Cazzato, Dario; Olivares-Méndez, Miguel A.; Voos, Holger

doi:10.1007/978-3-030-29891-3_20

Cited by 6 publications

(4 citation statements)

References 33 publications

(53 reference statements)


“…Visual-based localization methods can generally be categorized into regression-based, structure-based, and image retrieval-based methods. Regression-based methods include end-to-end visual localization models trained by deep learning that are able to directly obtain the regressed 6DoF camera pose [17, 18, 19, 20]. However, such methods are not applicable for the visual localization of large-scale scenes and are associated with low accuracies [21].…”

Section: Related Work (mentioning)

confidence: 99%

A Visual and VAE Based Hierarchical Indoor Localization Method

Jiang

Zou

Chen

et al. 2021

Sensors


Precise localization and pose estimation in indoor environments are commonly employed in a wide range of applications, including robotics, augmented reality, and navigation and positioning services. Such applications can be solved via visual-based localization using a pre-built 3D model. The increase in searching space associated with large scenes can be overcome by retrieving images in advance and subsequently estimating the pose. The majority of current deep learning-based image retrieval methods require labeled data, which increase data annotation costs and complicate the acquisition of data. In this paper, we propose an unsupervised hierarchical indoor localization framework that integrates an unsupervised network variational autoencoder (VAE) with a visual-based Structure-from-Motion (SfM) approach in order to extract global and local features. During the localization process, global features are applied for the image retrieval at the level of the scene map in order to obtain candidate images, and are subsequently used to estimate the pose from 2D-3D matches between query and candidate images. RGB images only are used as the input of the proposed localization system, which is both convenient and challenging. Experimental results reveal that the proposed method can localize images within 0.16 m and 4° in the 7-Scenes data sets and 32.8% within 5 m and 20° in the Baidu data set. Furthermore, our proposed method achieves a higher precision compared to advanced methods.
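The retrieval stage described in this abstract (global features select candidate images before pose estimation) can be illustrated with a minimal sketch. It assumes each image is already summarized by a global feature vector and ranks database images by cosine similarity; the similarity measure, the function names, and the toy vectors are illustrative assumptions, not details taken from the paper. The subsequent pose estimation from 2D-3D matches is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def retrieve_candidates(query_feature, database, top_k=2):
    """Return the ids of the top_k database images most similar to the query.

    database maps a (hypothetical) image id to its global feature vector.
    """
    scored = [
        (cosine_similarity(query_feature, feature), image_id)
        for image_id, feature in database.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:top_k]]


# Toy global features (assumed values, for illustration only).
database = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.6, 0.8, 0.0],
    "img_c": [0.0, 1.0, 0.0],
}
candidates = retrieve_candidates([1.0, 0.05, 0.0], database, top_k=2)
```

Restricting the expensive local matching to a few retrieved candidates is what keeps the search space manageable as the scene grows.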



“…The median position/orientation error for the U-SURF descriptors of feature sizes 300, 100, 50, 10, 1 along with the state of the art previous work including PoseNet [19], G-Posenet [41], Posenet-U [40], Pose-L [43], Branch-Net [42], Pose-H [52], VidLoc [54], RelocNet [55] and Mobile-PoseNet [56] for the 7 scenes dataset are shown in Table 3. As shown, with 300 U-SURF features, our system displays 0.28 m position error and 9.17° orientation error.…”

Section: Performance Analysis (mentioning)

confidence: 99%

“…We also extend this work on our locally generated data using a robot mounted with multiple sensors including camera, LIDAR, odometer and SONAR to collect RGB images with ground truth poses.…”

TABLE 3. The median error in position (m)/orientation (degrees) for the 7 scenes, with 5 input sizes 300 × 64, 100 × 64, 50 × 64, 10 × 64 and 1 × 64, compared with PoseNet [19], G-Posenet [41], Posenet-U [40], Pose-L [43], BranchNet [42], Pose-H [52], VidLoc [54], RelocNet [55] and Mobile-PoseNet [56].

Section: Performance Analysis (mentioning)

confidence: 99%

SurfCNN: A Descriptor Accelerated Convolutional Neural Network for Image-Based Indoor Localization

et al. 2020


Convolutional neural networks (CNNs) are a powerful tool for many data applications. However, their high dimensionality, large network size, computational complexity, and need for large amounts of training data make them challenging to use in edge computing applications, which are becoming increasingly popular, relevant and important. In this paper, we propose a descriptor-based approach to accelerate convolutional neural network training and reduce input dimension and network size, which greatly facilitates the use of CNNs for edge computing and even cloud computing. By using image descriptors to extract features from original images, we report a simpler convolutional neural network with fast training speed, low memory usage and outstanding accuracy without the need for a pre-trained network, as opposed to the state of the art. In indoor localization, our SURF descriptors accelerated CNN (SurfCNN) can reach an average position error of 0.28 m and orientation error of 9.2°. Compared to the conventional CNN that uses original images as input, our algorithm reduces the dimension of the input features by a factor of 48 without impairing the accuracy. Further, at an extreme feature reduction of 14,440 times, our model still retains an average position error of 0.41 m and orientation error of 14°.
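The position (m) and orientation (degrees) errors quoted in the statements above are commonly computed per test image and then summarized by the median. The sketch below assumes poses are given as a 3D translation plus a quaternion, and uses the widely used angular distance between quaternions for the orientation error; the pose representation and function names are assumptions for illustration and are not confirmed by the cited papers.

```python
import math
from statistics import median


def position_error(t_pred, t_true):
    """Euclidean distance between predicted and ground-truth positions."""
    return math.dist(t_pred, t_true)


def orientation_error_deg(q_pred, q_true):
    """Angle in degrees between two rotations given as quaternions."""

    def normalize(q):
        norm = math.sqrt(sum(c * c for c in q))
        return [c / norm for c in q]

    qp, qt = normalize(q_pred), normalize(q_true)
    # abs() treats q and -q as the same rotation; min() guards acos
    # against values slightly above 1 caused by rounding.
    dot = min(1.0, abs(sum(a * b for a, b in zip(qp, qt))))
    return math.degrees(2.0 * math.acos(dot))


def median_errors(predictions, ground_truth):
    """Median position and orientation error over paired (t, q) poses."""
    pos = [position_error(p[0], g[0]) for p, g in zip(predictions, ground_truth)]
    rot = [orientation_error_deg(p[1], g[1]) for p, g in zip(predictions, ground_truth)]
    return median(pos), median(rot)


# Toy poses (assumed values): identity rotation vs. a 90 degree turn about z.
identity = [1.0, 0.0, 0.0, 0.0]
quarter_turn_z = [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)]
pos_err = position_error([0.0, 0.0, 0.0], [3.0, 4.0, 0.0])
rot_err = orientation_error_deg(quarter_turn_z, identity)
```

The median is less sensitive than the mean to a few badly localized frames, which is one reason it is a common summary statistic for pose benchmarks.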


“…Visual-based localization (VBL) [1][2][3] is one of the promising self-localization technologies that have received growing interest. VBL identifies a device's location in a target space by using cameras to see the device's surroundings, without the dependency on GPS which is designed for outdoor usage or active indoor radio anchor devices which are subject to signal bouncing and interference.…”

Section: Introduction (mentioning)

confidence: 99%

PDPose: indoor robot localization and obstacle avoidance based on visual object detection

Meng

Qu

2023

Third International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI 2022)


Localization and obstacle avoidance are important problems for indoor robots. Visual-based localization (VBL) is a promising self-localization approach that identifies a device's location in a 3D space by using cameras to see the device's surrounding scenes and objects. In this paper, we present a pictorial planar surface based 3D object localization framework. However, image shake on a moving robot reduces localization accuracy. To improve localization accuracy on a moving robot, depth information from an RGBD camera is used to correct the pose calculation. Furthermore, to produce a more acceptable obstacle avoidance decision, we also design an optimal path planning method using RGBD camera based object detection. We have built an autonomous moving robot that can self-localize using its on-board camera and the PDPose (Picture Depth Pose) technology. The experimental study shows that our localization methods are practical, have very good accuracy, and can be used for real-time robot navigation. Moreover, compared with the traditional obstacle avoidance method, the optimal method produces a better path planning result.

