Joint estimation of pose, depth, and optical flow with a competition–cooperation transformer network

Liu, Xiaochen; Zhang, Tao; Liu, Mingming

doi:10.1016/j.neunet.2023.12.020

Neural Networks

2024



Cited by 2 publications

References 37 publications


MS23D: A 3D object detection method using multi-scale semantic feature points to construct 3D feature layer

Shao, Tan, Wang et al. 2024

Neural Networks


Advanced Monocular Outdoor Pose Estimation in Autonomous Systems: Leveraging Optical Flow, Depth Estimation, and Semantic Segmentation with Dynamic Object Removal

Ghasemieh, Kashef 2024

Sensors


Autonomous technologies have revolutionized transportation, military operations, and space exploration, necessitating precise localization in environments where traditional GPS-based systems are unreliable or unavailable. While widespread for outdoor localization, GPS faces limitations in obstructed environments such as dense urban areas, forests, and indoor spaces. Moreover, reliance on GPS introduces vulnerability to signal disruptions, which can lead to significant operational failures. Hence, there is a critical need for robust localization techniques that do not depend on external signals and that adapt to applications ranging from Earth-based autonomous vehicles to robotic missions on Mars. This paper addresses these challenges using visual odometry (VO), which estimates a camera’s pose by analyzing captured image sequences, in GPS-denied areas, with a focus on autonomous vehicles (AVs), where safety and real-time decision-making are paramount. Extensive research has been dedicated to pose estimation using LiDAR or stereo cameras, which, despite their accuracy, are constrained by weight, cost, and complexity. In contrast, monocular vision is practical and cost-effective, making it a popular choice for drones, cars, and other autonomous vehicles. However, robust and reliable monocular pose estimation models remain underexplored. This research aims to fill this gap by developing a novel adaptive framework for outdoor pose estimation and safe navigation using enhanced monocular visual odometry, especially for applications where deploying additional sensors is not feasible due to cost or physical constraints. The framework is designed to be adaptable across different vehicles and platforms, ensuring accurate and reliable pose estimation.
We integrate advanced control theory to provide safety guarantees for motion control, ensuring that the AV can react safely to imminent hazards and to the unknown trajectories of nearby traffic agents. The focus is on creating AI-driven models that meet the performance standards of multi-sensor systems while leveraging the inherent advantages of monocular vision. This research uses state-of-the-art machine learning techniques to advance visual odometry’s technical capabilities and ensure its adaptability across different platforms, cameras, and environments. By merging cutting-edge visual odometry techniques with robust control theory, our approach enhances both the safety and performance of AVs in complex traffic situations, directly addressing the challenge of safe and adaptive navigation. Experimental results on the KITTI odometry dataset demonstrate a significant improvement in pose estimation accuracy, offering a cost-effective and robust solution for real-world applications.
