Autonomous technologies have revolutionized transportation, military operations, and space exploration, necessitating precise localization in environments where traditional GPS-based systems are unreliable or unavailable. While widespread for outdoor localization, GPS systems face limitations in obstructed environments such as dense urban areas, forests, and indoor spaces. Moreover, GPS reliance introduces vulnerabilities to signal disruptions, which can lead to significant operational failures. Hence, developing alternative localization techniques that do not depend on external signals is essential, showing a critical need for robust, GPS-independent localization solutions adaptable to different applications, ranging from Earth-based autonomous vehicles to robotic missions on Mars. This paper addresses these challenges using Visual odometry (VO) to estimate a camera’s pose by analyzing captured image sequences in GPS-denied areas tailored for autonomous vehicles (AVs), where safety and real-time decision-making are paramount. Extensive research has been dedicated to pose estimation using LiDAR or stereo cameras, which, despite their accuracy, are constrained by weight, cost, and complexity. In contrast, monocular vision is practical and cost-effective, making it a popular choice for drones, cars, and autonomous vehicles. However, robust and reliable monocular pose estimation models remain underexplored. This research aims to fill this gap by developing a novel adaptive framework for outdoor pose estimation and safe navigation using enhanced visual odometry systems with monocular cameras, especially for applications where deploying additional sensors is not feasible due to cost or physical constraints. This framework is designed to be adaptable across different vehicles and platforms, ensuring accurate and reliable pose estimation. We integrate advanced control theory to provide safety guarantees for motion control, ensuring that the AV can react safely to the imminent hazards and unknown trajectories of nearby traffic agents. The focus is on creating an AI-driven model(s) that meets the performance standards of multi-sensor systems while leveraging the inherent advantages of monocular vision. This research uses state-of-the-art machine learning techniques to advance visual odometry’s technical capabilities and ensure its adaptability across different platforms, cameras, and environments. By merging cutting-edge visual odometry techniques with robust control theory, our approach enhances both the safety and performance of AVs in complex traffic situations, directly addressing the challenge of safe and adaptive navigation. Experimental results on the KITTI odometry dataset demonstrate a significant improvement in pose estimation accuracy, offering a cost-effective and robust solution for real-world applications.