Indoor localization is one of the foundations of location-based services (LBS) such as seamless indoor-outdoor navigation, location-based precision marketing, and spatial cognition in robotics. Visual features carry a dominant share of the information that helps humans and robots understand their environment, and many visual localization systems have been proposed. However, indoor visual localization remains an open problem because of the difficult trade-off between accuracy and cost. To better address this problem, this paper proposes a localization method based on image retrieval, which consists of two parts. The first is a CNN-based image retrieval phase: CNN features extracted from images by pre-trained deep convolutional neural networks (DCNNs) are used to compare similarity, and the output of this phase is the set of reference images matched to the target image. The second is a pose estimation phase that computes the accurate localization result. Owing to the robust CNN feature extractor, our scheme is applicable to complex indoor environments and is easily transferred to outdoor environments. The pose estimation scheme is inspired by monocular visual odometry; therefore, only RGB images and the poses of the reference images are needed for accurate image geo-localization. Furthermore, our method uses lightweight data to represent the scene. To evaluate the performance, experiments are conducted, and the results demonstrate that our scheme efficiently achieves high positioning accuracy as well as orientation estimation, improving both accuracy and usability compared with similar solutions. Moreover, our approach has good application prospects, because its data acquisition and pose estimation algorithms are compatible with ongoing expansion of the reference data.
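To make the retrieval phase concrete, the sketch below embeds images with a pre-trained CNN and ranks reference images by cosine similarity. The ResNet-18 backbone and the helper names are assumptions chosen for illustration; the abstract does not specify which DCNN the authors use.

```python
# Minimal sketch of CNN-feature image retrieval, assuming a ResNet-18 backbone
# (an illustrative stand-in for whatever pre-trained DCNN the paper uses).
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Backbone with the classification head removed, so the output is a global feature.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(path: str) -> torch.Tensor:
    """Map an RGB image to an L2-normalized CNN feature vector."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    f = backbone(x).squeeze(0)
    return f / f.norm()

def retrieve(query_path: str, reference_paths: list[str]) -> str:
    """Return the reference image most similar to the query by cosine similarity."""
    q = embed(query_path)
    sims = [(torch.dot(q, embed(p)).item(), p) for p in reference_paths]
    return max(sims)[1]
```

The pose of the retrieved reference image would then seed the subsequent pose estimation phase; in practice the reference embeddings would be precomputed once rather than recomputed per query.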
Knowledge transfer from synthetic to real data has been widely studied to mitigate data annotation constraints in various computer vision tasks such as semantic segmentation. However, these studies have focused on 2D images; their counterpart in 3D point cloud segmentation lags far behind due to the lack of large-scale synthetic datasets and effective transfer methods. We address this issue by collecting SynLiDAR, a large-scale synthetic LiDAR dataset that contains point-wise annotated point clouds with accurate geometric shapes and comprehensive semantic classes. SynLiDAR was collected from multiple virtual environments with rich scenes and layouts, and consists of over 19 billion points across 32 semantic classes. In addition, we design PCT, a novel point cloud translator that effectively mitigates the gap between synthetic and real point clouds. Specifically, we decompose the synthetic-to-real gap into an appearance component and a sparsity component and handle them separately, which greatly improves point cloud translation. We conducted extensive experiments over three transfer learning setups: data augmentation, semi-supervised domain adaptation, and unsupervised domain adaptation. The experiments show that SynLiDAR provides a high-quality data source for studying 3D transfer, and that the proposed PCT achieves superior point cloud translation consistently across all three setups. The dataset is available at https://github.com/xiaoaoran/SynLiDAR.
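The abstract does not describe how PCT implements the sparsity component, so the sketch below only illustrates the underlying problem with a naive baseline: synthetic LiDAR is typically denser than a real sensor, and one crude way to mimic a real scan pattern is to keep only a subset of the simulated beams. The 64-beam geometry, field-of-view bounds, and function name are assumptions, not SynLiDAR or PCT specifics.

```python
# Naive sparsity baseline: bin points into elevation rings (emulating a
# spinning 64-beam LiDAR) and drop alternate rings to halve vertical density.
import numpy as np

def drop_beams(points: np.ndarray, n_beams: int = 64, keep_every: int = 2,
               fov_up: float = 3.0, fov_down: float = -25.0) -> np.ndarray:
    """Keep every `keep_every`-th vertical beam of an (N, 3) point cloud.

    Beam geometry (n_beams, fov_up, fov_down in degrees) is assumed, not
    taken from the paper; real sensors have per-beam calibrated angles.
    """
    z = points[:, 2]
    depth = np.linalg.norm(points[:, :3], axis=1)
    pitch = np.degrees(np.arcsin(z / np.maximum(depth, 1e-8)))
    # Map elevation angle to a beam index in [0, n_beams).
    ring = ((pitch - fov_down) / (fov_up - fov_down) * n_beams).astype(int)
    ring = np.clip(ring, 0, n_beams - 1)
    return points[ring % keep_every == 0]
```

A learned translator such as PCT would presumably go well beyond this uniform subsampling, since real sparsity patterns also depend on range, incidence angle, and material.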
The demand for location-based services (LBS) in large indoor spaces, such as airports, shopping malls, museums, and libraries, has been increasing in recent years. However, there is still no fully applicable solution for indoor positioning and navigation comparable to Global Navigation Satellite System (GNSS) solutions in outdoor environments. Positioning in indoor scenes with smartphone cameras has its own advantages: no additional infrastructure is needed, the cost is low, and the popularity of smartphones creates a large potential market. However, existing methods and systems based on smartphone cameras and visual algorithms have limitations when deployed in relatively large indoor spaces. To deal with this problem, we designed an indoor positioning system that locates users in large indoor scenes. The system uses common static objects, e.g., doors and windows, as references. Using the smartphone camera, our proposed system detects static objects in large indoor spaces and then computes the smartphone's position to locate the user. The system integrates deep learning and computer vision algorithms, and its cost is low because it requires no additional infrastructure. Experiments in an art museum with a visually complicated environment suggest that this method can achieve positioning accuracy within 1 m.
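As a rough illustration of the positioning step, the sketch below assumes the detector has already returned the image coordinates of a known static object's corners (e.g., a door frame) whose real-world coordinates are stored in an indoor map; a Perspective-n-Point solve then recovers the camera pose. The corner coordinates, intrinsics, and overall setup are illustrative assumptions, not details from the paper.

```python
# Hypothetical positioning from one detected door frame via OpenCV's solvePnP.
import cv2
import numpy as np

# Known 3D corner coordinates of a door frame in the indoor map (meters) -- assumed values.
object_points = np.array([
    [0.0, 0.0, 0.0], [0.9, 0.0, 0.0],
    [0.9, 2.1, 0.0], [0.0, 2.1, 0.0],
], dtype=np.float64)

# The same corners as detected in the smartphone image (pixels) -- assumed values.
image_points = np.array([
    [412.0, 640.0], [668.0, 652.0],
    [660.0, 118.0], [405.0, 130.0],
], dtype=np.float64)

# Assumed pinhole intrinsics for the smartphone camera; distortion ignored.
K = np.array([[1.2e3, 0.0, 540.0],
              [0.0, 1.2e3, 960.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
if ok:
    R, _ = cv2.Rodrigues(rvec)                 # rotation vector -> matrix
    camera_position = (-R.T @ tvec).ravel()    # user position in map coordinates
    print("Estimated user position (m):", camera_position)
```

In a real system, multiple detected objects and a robust solver (e.g., RANSAC-based PnP) would be combined to reach the reported sub-meter accuracy.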