Mobile robots and autonomous systems rely on advanced guidance modules which often incorporate cameras to enable key functionalities. These modules are equipped with visual odometry (VO) and visual simultaneous localization and mapping (VSLAM) algorithms that work by analyzing changes between successive frames captured by cameras. VO/VSLAM-based systems are critical backbones for autonomous vehicles, virtual reality, structure from motion, and other robotic operations. VO/VSLAM systems encounter difficulties when implementing real-time applications in outdoor environments with restricted hardware and software platforms. While many VO systems target achieving high accuracy and speed, they often exhibit high degree of complexity and limited robustness. To overcome these challenges, this paper aims to propose a new VO system called Stereo-RIVO that balances accuracy, speed, and computational cost. Furthermore, this algorithm is based on a new data association module which consists of two primary components: a scene-matching process that achieves exceptional precision without feature extraction and a key-frame detection technique based on a model of scene movement. The performance of this proposed VO system has been tested extensively for all sequences of KITTI and UTIAS datasets for analyzing efficiency for outdoor dynamic and indoor static environments, respectively. The results of these tests indicate that the proposed Stereo-RIVO outperforms other state-of-the-art methods in terms of robustness, accuracy, and speed. Our implementation code of stereo-RIVO is available at: https://github.com/salehierfan/Stereo-RIVO.