This paper presents a comparative analysis of Region-based Convolutional Neural Networks (R-CNN) and their variants, covering the evolution, functionality, and impact of these models on object detection in computer vision [3]. Its objectives are to explain the significance of R-CNN and its variants, give an overview of the challenges of object detection, and show how these models address those challenges. The methodology is an in-depth examination of the architectural components that constitute the R-CNN approach: Selective Search for region proposals, feature extraction, object classification, and bounding box regression. By reviewing the historical context and subsequent developments, including Fast R-CNN, Faster R-CNN, and Mask R-CNN, the paper traces the continuous evolution of these models. The findings are intended to contribute to a broader understanding of advances in deep-learning-based object detection, with potential applications in diverse fields such as autonomous driving and face recognition.