The expansion of autonomous driving operations requires the research and development of accurate and reliable self-localization approaches. These include visual odometry methods, whose accuracy is potentially superior to that of traditional GNSS-based techniques and which also operate in signal-denied areas. This paper presents an in-depth review of state-of-the-art methods in visual and point cloud odometry, along with a direct performance comparison of several of these techniques in the autonomous driving context. The evaluated methods include camera, LiDAR, and multi-modal approaches, covering both knowledge-based and learning-based algorithms. This set was subjected to a series of tests on public road-driving datasets, from which the performance of these techniques is benchmarked and quantitatively compared. Furthermore, we closely discuss their effectiveness, grouped by category, under challenging conditions such as pronounced lighting variations, open spaces, and the presence of dynamic objects in the scene. The research addresses and corroborates some of the most prominent limitations of state-of-the-art visual odometry techniques based on 2D and 3D sensors and points out the performance stagnation of the most recent advances in this area, especially in complex environments. We also examine how multi-modal architectures can circumvent these weaknesses and how current advances in AI constitute a way to overcome this stagnation, outlining opportunities for future research.